Вы находитесь на странице: 1из 991

Preface

The field of statistics is growing at a rapid pace and the rate of publication of the books and papers on applied and theoretical aspects of statistics has been increasing steadily. The last decade has also witnessed the emergence of several new statistics journals to keep pace with the increase in research activity in statistics. With the advance of computer technology and the easy accessibility to statistical packages, more and more scientists in many disciplines have been using statistical techniques in data analysis. Statistics seems to be playing the role of a common denominator among all the scientists besides having profound influence on such matters like public policy. So, there is a great need to have comprehensive self-contained reference books to disseminate information on various aspects of statistical methodology and applications. The series Handbook of Statistics is started in an attempt to fulfill this need. Each volume in the series is devoted to a particular topic in statistics. The material in these volumes is essentially expository in nature and the proofs of the results are, in general, omitted. This series is addressed to the entire community of statisticians and scientists in various disciplines who use statistical methodology in their work. At the same time, special emphasis will be made on applications-oriented techniques with the applied statisticians in mind as the primary audience. It is believed that every scientist interested in statistics will be benefitted by browsing through these volumes. The first volume of the series is devoted to the area of analysis of variance (ANOVA). The field of the ANOVA was developed by R. A. Fisher and others and has emerged as a very important branch of statistics. An attempt has been made to cover most of the useful techniques in univariate and multivariate ANOVA in this volume. Certain other aspects of the ANOVA not covered in this volume due to limitation of space are planned to be included in subsequent volumes since various branches of statistics are interlinked. It is quite fitting that this volume is dedicated to the memory of the late H. Scheff~ who made numerous important contributions to the field of vii

viii

Preface

ANOVA. Scheff6's book The Analysis of Variance has significant impact on the field and his test for multiple comparisons of means of normal populations has been widely used. I wish to thank Professors S. Das Gupta, N. L. Johnson, C. G. Khatfi, K. V. Mardia and N. H. T i m m for serving as members of the editorial board of this volume. Thanks are also due to the contributors to this volume and North-Holland Publishing Company for their excellent cooperation. Professors R. D. Bock, K. C. Chanda, S. Geisser, R. Gnanadesikan, S. J. Haberman, J. C. Lee, G. S. Mudholkar, M. D. Perlman, J. N. K. Rao, P. S. S. Rao and C. R. Rao were kind enough to review various chapters in this volume. I wish to express my appreciation to my distinguished colleague, Professor C. R. Rao, for his encouragement and inspiration. P. R. Krishnaiah

Contributors

T. A. Bancroft, Iowa State University, Ames (Ch. 13) V. P. Bhapkar, University of Kentucky, Lexington (Ch. 11) R. D. Bock, University of Chicago, Chicago (Ch. 23) D. Brandt, University of Chicago, Chicago (Ch. 23) D. R. Brillinger, University of California, Berkeley (Ch. 8) H. Bunke, Akademie der Wissenschaften der D. D. R., Berlin (Ch. 18) S. Das Gupta, University of Minnesota, Minneapolis (Ch. 6) D. A. S. Fraser, University of Toronto, Toronto (Ch. 12) S. Geisser, University of Minnesota, Minneapolis (Ch. 3) R. Gnanadesikan, Bell Telephone Laboratories, Murray Hill (Ch. 5) C. -P. Han, Iowa State University, Ames (Ch. 13) H. L. Harter, Wright-Patterson Air Force Base, Ohio (Ch. 19) P. K. Ito, Nanzan University, Nagoya (Ch.7) A. J. Izenman, Colorado State University, Fort Collins (Ch. 17) G. Kaskey, Univac, Blue Bell (Ch. 10) C, G. Khatri, Gujarat University, Ahmedabad (Ch. 14) J. Kleffe, Akademie der Wissenschaften der D. D. R., Berlin (Ch. 1) B. Kolman, Drexel University, Philadelphia (Ch. 10) P. R. Krishnaiah, University of Pittsburgh, Pittsburgh (Chs. 10, 16,21,24,25) J. C. Lee, Wright State University, Dayton (Ch. 16) K. V. Mardia, University of Leeds, Leeds (Ch. 9) S. K. Mitra, lndian Statistical Institute, New Delhi (Ch. 15) G. S. Mudholkar, University of Rochester, Rochester (Ch. 21) S. J. Press, University of California, Riverside (Ch. 4) C. R. Rao, University of Pittsburgh, Pittsburgh (Ch. 1) A. R. Sampson, University of Pittsburgh, Pittsburgh (Ch. 20) P. K. Sen, University of North Carolina, Chapel Hill (Ch. 22) L. Steinberg, Temple University, Philadelphia (Ch. 10) P. Subbaiah, Oakland University, Rochester (Ch. 21) N. H. Timm, University of Pittsburgh, Pittsburgh (Ch. 2) Mn Yochmowitz, Brooks Air Force Base, Texas (Ch. 25) xvii

P. R. Krishnaiah, ed., Handbook of Statistics, VoL 1 @North-Holland Publishing C o m p a n y (1980) 1-40

1
1

Estimation of Variance Components


C. R a d h a k r i s h n a Rao* and Jiirgen Kleffe

1.

Introduction

The usual mixed linear model discussed in the literature on variance components is
Y = X i ~ + U l t ~ l + + Upep+~
(1.1)

where X, U 1..... Up are known matrices, B is a fixed unknown vector parameter and 91 ..... eOp, e are unobservable random variables (r.v.'s) such that

e(q,i) = o , E(ee') = o2In, E(~?,*;) = o2I,~.

= o,

=o,
(1.2)

The unknown parameters o 0, 2 o2 2 are called variance components. 1..... oj; Some of the early uses of such models are due to Yates and Zacopancy 0935) and Cochran (1939) in survey sampling, Yates (1940) and Rao (1947, 1956) in combining intra and imerblock information in design of experiments, Fairfield Smith (1936), Henderson (1950), Panse (1946) and Rao (1953) in the construction of selection indices in genetics, and Brownlee (1953) in industrial applications. A systematic study of the estimation of variance components was undertaken by Henderson (1953) who proposed three methods of estimation. The general approach in all these papers was to obtain p + 1 quadratic functions of Y, say Y' Qi Y, i = 1.... ,p + 1, which are invariant for translation of Y by X a where a is arbitrary, and solve the equations

Y' Qi Y = E( Y' Qi Y ) = aioo~ + ailo~ + . . . + ao, o~,

i = 0 , 1 ..... p. (1.3)

The work of this author is sponsored by the Air Force Office of Scientific Research, Air Force Systems C o m m a n d under Contract F49620-79-C-0161. Reproduction in whole or in part is permitted for any purpose of the United States Government.

C Radhakrishna Rao and Ji~rgen Kleffe

The method of choosing the quadratic forms was intuitive in nature (see Henderson, 1953) and did not depend on any stated criteria of estimation. The entries in the ANOVA table giving the sums of squares due to different effects were considered as good choices of the quadratic forms in general. The ANOVA technique provides good estimators in what are called balanced designs (see Anderson, 1975; Anderson and Crump, 1967) but, as shown by Seely (1975) such estimators may be inefficient in more general linear models. For a general discussion of Henderson's methods and their advantages (computational simplicity) and limitations (lack of uniqueness, inapplicability and inefficiency in special cases) the reader is referred to papers by Searle (1968, 1971), Seely (1975), Olsen et al. (1976) and Harville (1977, p.335). A completely different approach is the ML (maximum likelihood) method initiated by Hartley and Rao (1967). They considered the likelihood of the unknown parameters fl, o02 ..... o~ based on observed Y and obtained the likelihood equations by computing the derivatives of likelihood with respect to the parameters. Patterson and Thompson (1975) considered the marginal likelihood based on the maximal invariant of Y, i.e., only on B ' Y where B = X (matrix orthogonal to X) and obtained what are called marginal maximum likelihood (MML) equations. Harville (1977) has given a review of the ML and MML methods and the computational algorithms associated with them. ML estimators, though consistent may be heavily biased in small samples so that some caution is needed when they are used as estimates of individual parameters for taking decisions or for using them in the place of true values to obtain an efficient estimate of ft. The problem is not acute if the exact distribution of the ML estimators is known, since in that case appropriate adjustments can be made in the individual estimators before using them. The general large sample properties associated with ML estimators are misleading in the absence of studies on the orders of sample sizes for which these properties hold in particular cases. The bias in MML estimators may not be large even in small samples. As observed earlier, the MML estimator is, by construction, a function of B ' Y the maximal invariant of Y. It turns out that even the full ML estimator is a function of B ' Y although the likelihood is based on Y. There are important practical cases where reduction of Y to B' Y results in non-identifiability of individual parameters, in which case neither the ML nor the MML is applicable. The details are given in Section 5. Rao (1970, 1971a,b, 1972, 1973) proposed a general method ~alled MINQE (minimum norm quadratic estimation) the scope of which has been extended to cover a variety of situations by Focke and Dewess (1972), Kleffe (1975, 1976, 1977a, b, 1978, 1979), J.N.K. Rao (1973), Fuller

Estimation of variance components

and J.N.K. Rao (1978), P.S.R.S. Rao and Chaubey (1978), P.S.R.S. Rao (1977), Pukelsheim (1977, 1978a), Sinha and Wieand (1977) and Rao (1979). The method is applicable to a general linear model
Y~--:~f~'~-E, E(EE/)=O1VI-~- * . . -~-OpVp

(1.4)

where no structure need be imposed on e and no restrictions are placed on 0i or Vi. (In the model (1.1), 0i >10 and V/are non-negative definite.) In the MINQE theory, we define what is called a natural estimator of a linear f u n c t i o n f ' O of 0 in terms of the unobservable r.v. e in (1.4), say e'Ne. Then the estimator Y ' A Y in terms of the observable r.v. Y is obtained by minimizing the norm of the difference between the quadratic forms e'Ne and Y ' A Y = (Xfl + e)'A ( X B + e). The universality of the M I N Q E method as described in Rao (1979) and in this article arises from the following observations: (a) It offers a wide scope in the choice of the norm depending on the nature of the model and prior information available. (b) One or more restrictions such as invariance, unbiasedness and non-negative definiteness can be placed on Y ' A Y depending on the desired properties of the estimators. (c) The method is applicable in situations where ML and M M L fail. (d) There is an automatic provision for incorporating available prior information on the unknown parameters fi and 0. (e) Further, M L and M M L estimators can be exhibited as iterated versions of suitably chosen MINQE's. (f) The M I N Q E equation provides a natural numerical algorithm for computing the ML or M M L estimator. (g) For a suitable choice of the norm, the M I N Q estimators provide minimum variance estimators of 0 when Y is normally distributed. It has been mentioned by some reviewers of the M I N Q E theory that the computations needed for obtaining the M I N Q estimators are somewhat heavy. It is true that the closed form expressions given for MINQE's contain inverses of large order matrices, but they can be computed in a simple way in special cases that arise in practice. The computations in such cases are of the same order of magnitude as obtaining sums of squares in the ANOVA table appropriate for the linear model. It is certainly not true that the computation of MLE or M M L E is simpler than that of MINQE. Both may have the same order of complexity in the general case. Recently, simple numerical techniques for computing MINQE's have been developed by Ahrens (1978), Swallow and Searl (1978) and Ahrens et al. (1979) for the unbalanced random ANOVA model and by Kleffe (1980) for several unbalanced two way classification models. Similar results for

C. Radhakrishna Rao and Jiirgen Kleffe

simple regression models with heteroscedastic variances have been given by Rao (1970) and Kleffe and Z611ner (1978). Infante (1978) investigated the calculation of MINQE's for the random coefficient regression model.

2.

Models of variance and covariance components

2.L

General mode[

There is a large variety of models of variance and covariance components used in research work in biological and behavioral sciences. They can all be considered in a unified frame work under a general G a u s s Markoff (GM),model Y=XB+e (2.1.1)

where Y is n-vector random variable, X is n m matrix, 13 is m-vector parameter and e is n-vector variable. The models differ mainly in the structure imposed on e. The most general formulation is E(e)=0, D(e)=OIV, + . . . +OpV,= V ( 0 ) = Vo (2.1.2) (2.1.3)

where D stands for the dispersion (variance covariance) matrix, 0 ' = (01 ..... 0p) is unknown vector parameter and Vl ..... V e are known symmetric matrices. We let/? ~ R m and 0 ~ ~ (open set) c R p such that V(O)>10 (i.e., nonnegative definite) for each 0 E ~-. In the representation (2.1.3) we have not imposed any restriction such as 0i/> 0 or V, is nonnegative definite. It may be noted that any arbitrary n n dispersion matrix O--(0~/) can be written in the form (2.1.3) ~] ~2 00 VU (2.1.4)

involving a maximum of p = n(n + 1) unknown parameters 0,7 and known matrices V~j, but in models of practical interest p has a relatively small value compared to n. 2.2. Variance components

A special case of the variance components model is when e has the structure e= UI~ 1+ . . . + UpOp (2.2.1)

Estimation of variance components

where U~ is n rni given matrix and ~ is mi-vector r.v. such that E(~i)=0; In such a case E(+i~j)=0,

i=/=j;

E(,#iq~;)= o21,~.

(2.2.2)

V(O) = 01V 1 + . . .

+ Op ~

(2.2.3)

where V,= U i U / > 0 and 0 i = o 2 > 0 . Most of the models discussed in literature are of the type (2.2.1) leading to (2.2.3). The complete G M model when e has the structure (2.2.1) is
Y=x+ u,,~,+ . . - + ~ % ,

E(Oi) = 0;

E(OiOj) = 0,

ivaj;

E(O~O;)=a~I,~.

(2.2.4)

The associated statistical problems are: (a) (b) (c) Estimation of fl, Estimation of a~, i = 1..... p, (2.2.5)

Prediction of q~/, i = 1..... p.

The last problem arises in the construction of selection indices in genetics, and some early papers on the subject providing a satisfactory solution are due to Fairfield Smith (1936), Panse (1946) based on an idea suggested by Fisher. See also Henderson (1950) for similar developments. A theoretical justification of the method employed by these authors and associated tests of significance are given in Rao (1953). A particular case of the model (2.2.4) is where it can be broken down into a number of submodels
Y1 = X I / ~ + E1. . . . . Yp =Xpfl+Ep

(2.2.6)

where Y,. is ni-vector variable and

E(ei) = O,

E(eie;) = Off,~,

E(eiej' ) = 0.

(2.2.7)

Note that the fl parameters are the same in all submodels, and in some situations the design matrices X 1..... Xp may also be the same. The model (2.2.6) with the covariance structure (2.2.7) is usually referred to as one with "heteroscedastic variances" and the problem of estimating fi as that of estimating a " c o m m o n mean" (see P.S.R.S. Rao et al., 1979; and J.N.K. Rao and Subrahmaniam, 1971).

6 2.3.

c. Radhakrishna Rao and Jiirgen Kleffe Variance and covariance components

We assume the same structure (2.2.1) for e but with a more general covafiance structure for the 4~i's

E(dp,) = O, E(Oidp; ) ----a~I,,~,


E ( ~ , ~ ) = 0, leading to

E(OiO;) = Ai, i = k + 1 . . . . . p,

i = 1 . . . . . k,

ivaj

(2.3.1)

V ( 0 ) = U1A1 U; + - - + V k A k V ; , + o ; + | U k + l V ; + l + , 2

"* +4Up

V;'

(2.3.2) where A i >/0. tn some practical problems A i are all the same a n d there is only o n e 0 2 in which case (2.3.1) becomes V(O)---- U 1 A U ; + - + UkAU/~+a2I. (2.3.3)

2.4.

R a n d o m regression coefficients

This is a special case of the variance a n d covariance c o m p o n e n t s model considered in Section 2.3 where e has the structure e = S 0 1 + q52, E(q)10]) = A, E(q)2~;) = 021 (2.4.1)

the c o m p o u n d i n g matrix for 01 being the same as for fl leading to the G M model
Y ~-- X ~ + X I + 2 ,

D(e) = X A X ' +

o21.

(2.4.2)

In general, we have repeated observations on the m o d e l (2.4.2) with different X's i = 1..... t leading to the model with (2.4.3)

":rx'[i
Y=Xfl+ e

(2.4.4)

X 1 A X ~+ a2I

D(~) =

X t A X / + 021

(2.4.5)

Estimation of variance components

all the off diagonal blocks being null matrices. A discussion of such models is contained in Fisk (1967), Infante (1978), Rao (1965, 1967), Swamy (1971), and Spjotvoll (1977). In some cases A is known to be of diagonal form (see Hildreth and Houck, 1968).
2.5. Intraclass correlation m o d e l

We shall illustrate an intraclass correlation model with special reference to two way classified data with repeated observations in each cell
Y~jk, i = 1. . . . , p ; j = 1 . . . . . q; k = l . . . . . r.

(2.5.1)

We write ~jk = bt0k+ e,;ik where P~jk are fixed parameters with a specified structure, and (2.5.2)

e%~)=0,
g(eijre~is) =
O201,

e(~2~)= 0 2,
r~S,

E(eijreiks) =

(2.5.3)

0202,

j =/=k, r =/=s,
i--/=t,j=/=k, r=/=s.

E(eijretk,) -- 02p>

This dispersion matrix of (Yr~) can be exhibited in the form (2.1.3) with 2 2 2 v 2 four parameters a , o Ol, o 02, o 03. A model of the type (2.5.2) is given in Rao (1973, p. 258).
2.6. Multivariate model

A k-variate linear model is of the form

(v~: . . . : v~)=x(& :-.. :&)+(< : - - - : * 0 ,


(2.6.1) Denoting Y = ( Y 1..... Y), f f = ( f i l . . . . . fi), e - ( q model may be written as a univariate model
/ i * t t ! -__ !

. . . . . e~), the multivariate

E(~g)= ~ (oi re)

(2.6.2)

C. Radhakrishna Rao and Jiirgen Kleffe

where t3i are (k k) matrices of variance and covariance components o}~), r, s = 1..... k. In the multivariate regression model p = 1, in which case E(gi') = (O V). (2.6.3)

We may specify structures for e analogous to (2.2.1) in the univariate case


Ei= Ult~li'q- " "" +

Up~l~i,

i = 1.... ,k, rC~h. (2.6.4)

E(+i,~Ojm)=o~I,

E(+i/O~h)=O,

For special choices of U~, we obtain multivariate one, two,.., way mixed models. Models of the type (2.6.2) have been considered by Krishnaiah and Lee (1974). They discuss methods of estimating the covariance matrices Ol- and testing the hypothesis that a covariance matrix has the structure (2.6.2).

3. Estimability
3.1. Unbiasedness

Let us consider the univariate G M model (2.1.1) with the covariance structure (2.1.3)

r--x~+~,

D(,)--0,Vl+-.. +0,z,

(3.1.1)

and find the conditions under which linear functions f'O can be estimated by functions of Y subject to some constraints. The classes of estimators considered are as follows: = ( Y ' A Y, A symmetric}, (3.1.2) (3.1.3) (3.1.4)

%= ( g(r): e[ g(Y)] =f'o v~ ~R~,O ~ ) ,


= (g(r) = g ( r + x ~ ) w ) .
We (i) of B. (ii) I-P. (iii)

use the following notations: S ( B ) represents the linear manifold generated by the columns P=X(X'X)-X' is the projection operator onto $ (X), and M =

Pr = X(X' T - 1X)-X' T - 1

Estimation of variance components

Theorem 3.1.1 provides conditions for unbiased estimability. TrtEOm~M 3.1.t. Let the linear model be as in (3.1.1). Then: (i) The estimator Y'A Y is unbiased for y =f'O iff X ' A X = 0, trA V~= f , i = 1..... p. (3.1.5)

(ii) There exists an unbiased estimator ~efg iff f E $ (H), H = (hij), ho = tr( V/Vj- PViPVj). (3.1.6)

(iii) If Y has multivariate normal distribution, then '6,y is not empty iff eL!N ~ is no empty. The results (i) and (ii) are discussed in Seely (1970), Rao (1970, 1971a, b) and Focke and Dewess (1972), Kleffe and Pincus (1974a, b) and (iii) in Pincus (1974).
NOTE 1:

Result (ii) holds if in (3.1.6) we choose h0 = tr( V,.(I- P) Vj). (3.1.7)

NOTE 2: In the special case V,.Vj=0 for i ~ j , 0i, the ith individual parameter, is unbiasedly estimable iff M V ~ 0 where M = I - P .

LEM~A 3.1.1. of Ois

The linear space F of all unbiasedly estimable linear functions

F= { E o Y ' A Y : A ~ s p ( V , - P V I P ..... V p - P V p P ) }

(3.1.8)

Where sp(A 1..... Ap) is the set of all linear combinations of A 1..... Ap. Let us consider the multivariate model (2.6.2) written in a vector form f = ( I X ) f l + ~,

E(i~:) = ([~1 @ VI) -~-.- --b( % @ Vt~)


where Oi are k k matrix variance-covariance components.

(3.1.9)

LEMMA 3.t.2. The parametric function y = Z f / t r C O i is unbiasedly estimable from the model (3.1.9) iff f'O is so from the univariate model (3.1.1).

~0

C. RadhakrishnaRao and Jiirgen Kleffe

LEMMA 3.I.3. The class F of unbiasedly estimable linear functions of elements of 0 i, i = 1..... p in (3.1.9), is F = ( y = ~] trCiOi: C i are such that

nb=O~ Y~ b, Ci=O)
(3.1.10)

where H is as defined in (3.1.6) or (3.1.7). 3.2. Invariance

An estimator is said to be invariant for translation of the parameter fl in the linear model (3.1.1) if it belongs to the class (3.1.4). Theorem 3.2.1 provides the conditions under which estimators belonging to the class ~fV~ ~ exist. THEOREM 3.2.1. Let the linear mbdel be as in (3.1.1). Then: (i) The estimator Y ' A Y E )~LfN ~ iff
AX=O,

trAV/=f,

i = 1..... p.

(3.2.1)

(ii) There exists an unbiased estimator in ~ f3 ~ iff f c S ( HM) where


H M = (ho.),

h U= tr(MV/MVj.), M = i - P.

(3.2.2)

(iii) Under the assumption of normality of Y, the result (3.2.2) can be extended to the class ~. NOTE: In (3.2.2), we can choose
hlj = tr( B B ' ViBB' Vj)

(3.2.3)

where B is any choice of X , i.e., B is a matrix of maximum rank such that B ' X = O. LEMMA 3.2.1. The linear space of all invariantly unbiasedly estimable linear functions of 0 is
r l = ( E o Y ' M A M Y : A E s p ( V 1 - P V l P ..... V p - P V p P ) ) .

(3.2.4)

LEMMa 3.2.2. I f f'O is invariantly unbiasedly estimable from the model (3.1.1) then so is ~:=X f.trCOi from the model (3.1.9).

Estimation of variance components

~1

LBM~ 3.2.3. All invariantly unbiasedly estimable linear functions of the elements of 1..... p in the model (3.1.9) belong to the set
Fez = { y = trCiOi: C i are such that HMb = 0 ~ biCi=O}.

(3.2.5) For proofs of Lemmas, 3.2.2 and 3.2.3, see Kleffe (1979).
NOTE:

We can estimate any member of the class (3.2.5) by functions of

the form

E tr(qr%r)
where Ap .... Ap are matrices arising in invariant quadratic unbiased estimation in the univariate model (3.1.1).

3.3.

Examples

Consider the model with four observations


Y1 = fll -~-ED Y2 = B1-1- e2, Y3 = B2 "~ 63, Y4 = B2 "1-e4

where ei are all uncorrelated and V(el)-~-V(E3)=02 and V(e2)= V(e4)= 02. The matrices X, V 1, V2 are easily seen to be

X=

Ii i1 [i000; , [000 il
Vl= 0 0
0

0
1

0
0

0
0

'

'

"

The matrices H and H M of Theorems 3.1.1 and 3.2.1 are

3
H=[ 1

11
7,
3'

rl
ll

1
1

Since H is of full rank, applying (3.1.6) we find that o12 and o2 z are individually unbiasedly estimable. But H M is of rank one and the unit vectors do not belong to the space g (/arM). Then (3.2.2) shows that o 2 and o2 are not individually unbiasedly estimable by invariant quadratic forms. Consider the model Y-- X/? + Xq, + e where/? is a fixed vector parameter and ~ is a vector of random effects such that E(~a)=0, E ( ~ ' ) = 02Im,

!2

C. Radhakrishna Rao and Jiirgen Kleffe

E(eoe') = O, E(ee*) = o~I n. Let Y ' A Y be an unbiased estimate of o 2. Then we must have X ' A X = O, t r A X X ' = t,

trA = 0

which is not consistent. Hence unbiased estimators of cr22do not exist.

4.
4. O.

Minimum variance unbiased estimation (normal case)


Notations

In Section 3, we obtained conditions for unbiased estimability of f ' O in the !inear model
Y=Xfl-l-e, D(e)~-~IVI +... -.I-OpVp= V0

(4.0.1)

restricting the class of estimators to quadratic functions of Y. In this section we do not put any restriction on the class of estimators but assume that
Y ~ N n ( X / 3 , Vo) , /3 ~1~ m, 0 ~ (4.0.2)

i.e., n variate normal, and V o is p.d. for 0 ~ o~. The condition that V o is p.d. is assumed to simplify presentation of results, and is satisfied in many practical situations. First, we derive the locally minimum variance unbiased estimator (LMVUE) of f'O at a chosen point (flo, Oo) in R m x6)-. If the estimator is independent of fio, Oo then we have an U M V U E (uniformly minimum variance unbiased estimator). U M V U E ' s do not exist except in simple cases. In the general case we suggest the use of L M V U E with a suitable choice of fl0, 00 based on previous experience or apriori considerations. We also indicate an iterative method which starts with an initial value (/30, 00), gets an improved set (/31,01), and provides in the limit I M V U E (iterated MVUE). LMVUE's are obtained in the class of quadratic estimators by LaMotte (1973) under the assumption of normality and by Rao (1971a, b) in the general case. Such estimators were designated by Rao as M I V Q U E (minimum variance quadratic unbiased estimator). Kleffe and Pincus (1974a, b) and Kleffe (1977a, b) extended the class of estimators to quadratic forms in ( Y - X o 0 and found that under normality assumption, M I V Q U E is L M V U E in the whole class of unbiased estimators.

Estimation of variance components

13

4.1, Locally minimum variance unbiased estimation


D~HNmOS 4.1.1. An estimator 3, is called LMVUE of its expected value at ( rio,0o) E R" x ~ iff

V( q, I /30,00) < V(~ I fio, Oo)


for all q such that
E('~,) ~- E ( ~ ) ~( 1~, 0) ~ R m X o~.

(4.1.1)

(4.1.2)

We use the following notations:

vo=o v + . . .
A,o= Vo-I(V~- PoV~Pd)Vo-',
Ko = (trAio Vj),
=

e o = X ( X ' V ~ l X ) X ' V o -1,

[( r -

)'A ,o( Y - XB ) ..... ( Y - XB )%o( Y - XB ) ]'.


(4.1.3)

Let (rio, 0o) be an apriori value of (fl, 0). Then applying the result (3.1.6) of Theorem 3.1.1 we find that f'O is unbiasedly estimable iff

f E g (Koo).
Theorem 4.1.1 provides an explicit expression for the LMVUE.

(4.1.4)

T~IEOX~M 4.i.1. Let f satisfy the condition (4.1.4) and Ko, k, o be as defined in (4.1.3). Then the LMVUE of f'O at (flo, Oo) is "7= X'k&,oo= E )~( Y - Xflo)'A,( Y - )(rio) (4.1.5)

where ~ is any solution of KooX=f.


Theorem 4.1.1 is established by showing that

cov(g(Y),

&, o0) = 0

for all g(Y) such that E[g(Y)lfl, O]=O Vfl E R " , 0 ~ , and using the theorem on minimum variance estimation given in C.R. Rao (1973, p.317).

~4

C Radhakrishna Rao and Jiirgen Kleffe

NOTE 1: For any X, X'kao,e is LMVUE of its expected value which is a linear function of 0. Thus (4.1.5) characterizes all LMVUE's Of linear functions of 0 at (rio, 0o). NOTE 2: The variance of ~ as defined in (4.1.5) is V({ I/3, O) = 4(/3 -/3o)'X'AooVoAoX( fi -/3o) + 2 trAooVoAooVo (4.1.6) where Aoo= ZX, Aioo. The variance at (/30, 00) is V(91 flo, 0o) = 22t'Ko),= 2f'XoTf where Koo is any g-inverse of Koo. NOTE 3: The BLUE (best linear unbiased estimator) of Xfl at 0o is (4.1.7)

Xfl= PooY.
Substituting/~ for t0 in (4.1.5) we have

(4.1.8)

ql=X'kLoo= Y )(MVooM ' +( ~ V i ) ( M V o o M ) + Y

(4.1.9)

where M= I - X ( X ' X ) - X ' , and C + is the Moore Penrose inverse of C (see Rao and Mitra, 1972). The statistic "71 which is independent of the apriori value of fl is an alternative estimator of f'O but it may not be unbiased for f'0. NOTE 4: Theorem 4.1.1 can be stated in a different form as follows. Iff'O is unbiasedly estimable then its LMVUE at /30,00 is f ' 0 where 0 is any solution of the consistent equation

KooO= kazoo

(4.1.10)

NOTE 5: (Estimation of fl and 0.) Let 0 (i.e., each component of 0) be estimable in which case 1(2oois nonsingular and the solution of (4.1.10) is ~l--Koolk~o, Oo. Let /?l be a solution of Xfl=PooY. We may use 01,fl 1 the LMVUE of 0,/3 as initial values and obtain second stage estimates t~2 and /32 of 0 and fi as solutions of

K40=k~,,d ,,

XB= Pd Y.

(4.1.11)

Estimation of variance components

15

The process may be repeated and if the solutions converge they satisfy the equations

KeO=kp, o,

X[3=PeY.

(4.1.12)

The solution (/~,0) of (4.1.12) may be called IMVUE (iterated minimum variance unbiased estimator) of ([3,0). The IMVUE is not necessarily unbiased.

4.2. Invariant estimation


Let us restrict the class of estimators to invariant unbiased (IU) estimators, i.e., estimators g(Y) such that

g(Y+Xfl)=g(Y) E[ g( Y)I fl, O] =f'O

V[3,
(4.2.1)

and find the locally minimum variance invariant unbiased estimator (LMVIUE). Let
M
~

I-P,

P=X(X'X)-X',

Hul(O ) = (tr[ (MVoM) + Vi( MVoM ) + Vj ])

OrE

e;)vo

hi(Y,8)= ( Y'( MVoM) + VI( MVoM) + Y,

.... Y'( MVoM) + Vp( MVoM) + Y)' [ Y'Vo-~(I-eo) V l ( I - t ~ ) V o -1Y, .... Y' Vo-1( I - Po) Vp(I - P~) Ve- 1r] ,.
THEOREM 4.2.1. (i) f'O is invariantly unbiasedly estimable iff (4.2.3) (4.2.2)

f E S (Hul(O)) for any choice of 0 such that Ve is nonsingular. (ii) The LMVIUE of f'O at 0o is ~=)Chx( Y, Oo) where X is any solution of [HuI(Oo)])~=f.

(4.2.4)

16

C. Radhakrishna Rao and Jiirgen Kleffe

The resuRs of T h e o r e m 4.2.1 are obtained by transforming the model = X/3 + e to a model involving the maximal invariant of Y,

= B' Y = B'e = e,

(4.2.5)

where B = X J-, which is independent of/3, and applying Theorem 4.1.1. NOT~ 1: Theorem 4.2.1 can be stated in a different form as follows. I f f ' 0 is invariantly unbiasedly estimable, then its L M V I U E at 00 is f't~ where 0 is a solution of

[ Hu,(Oo) ] 0 = hi( Y, 0o)


where Hux(O ) and hi(Y, 0) are defined in (4.2.2).

(4.2.6)

NoT~ 2: If 0 admits invariant unbiased estimation, then as in N o t e 5 following Theorem 4.1.1 we m a y obtain I M V I U E of (/3,0) as the solution of
x/3=eo,

i Hvz(O) ] 0 = hz( Y, 0).

(4.2.7)

5. 5.0.

Minimum norm quadratic estimation (MINQE-theory) MINQE-principle

In Section 4 we assumed normal distribution for the r a n d o m vector Y in the linear model and obtained the L M V U E of linear functions of variance components without imposing any restriction on the estimating function. However, we found that the estimators were all quadratic. In the present section we shall not make any distributional assumptions but confine our attention to the class of quadratic estimators and lay down some principles for deriving optimum estimators. Natural estimator: Consider a general r a n d o m effects linear model

Y=x/3+
= o,

+ . . . + u,%= = o,I.,,

+ ueo, =o

(5.o.1)

so that
D ( Y ) = O1U1U; -+. . . . -~ ~p Up ~Tpt= Ol Vl -~ O. . q- ~p Vp.

.Estimation of variance components

17

It is convenient for later developments to write the error term in the f o r m

U(~= U,4,,-- Ul,4,1,-[--.. -Jr U p , B ,

(5.0.2)

where U/.= ~ Ui a n d Oi.=4,i/V~aai~ and a i is an apriori value of Oi, so that Oi* are comparable in some sense. A natural estimator of 0i when 4,,. is k n o w n is 0i = aiq):*4,i/ri and that off'O is f ' 0 = ~b. N + , (5.0.3)

with a suitable choice of the matrix N I. Suppose that the detailed structure of 0 as in (5.0.1) is not specified but it is given that

E(4,4,') = 01F 1+ . . .
so that D(Y)=OlUFIU:

+ OeF,

(5.0.4)

+ " " +OpUFpU'=O1VI +""

+ OpVp.

It is not clear as to h o w a natural estimator o f f ' O can be defined in terms in such a case. However, using prior values oq . . . . . % of 01 ..... 0p we m a y write

UO = (UF2/2)(F-'/24,) = U,q~,
where F,
=

(5.0.5)

alF l +...

+ o~fp a n d define an estimator o f f ' O as

~---E ~i4,~*(F =l/2FiFa-1/2)(]), =

#>;N4,. (say)

(5.0.6)

where ~ are chosen to m a k e ~ unbiased for f ' 0 , i.e.,/~1 .... ,/~ is a solution of the equations

(trFiF~-tFlF~-')l~l+... + ( t r F / F ~ - l F p F ~ - l ) p v = 0 ,

i = 1. . . . . p.

A more general definition of a natural estimator in terms of e w h e n the model is Y = X 3 + e without specifying a n y structure for e is given in Section 5.4. MINQE-theory: Consider the general model (5.0.5) a n d a quadratic estimator ~ = Y'A Y of f'O. N o w

Y'A Y =

X ' A U,

X ' A X ] ~ fl ]

(5.0.7)

18

C. Radhakrishna Rao arm Ji~rgen Kleffe

while the natural estimator is q~, ' N q~, as defined in (5.0.6). The difference between Y ' A Y and ~ ; N ~ , is

X'AU,

X'AX 1\

"

(5.0.8)

The minimum norm quadratic estimator (MINQE) is the one obtained by minimizing an appropriately chosen norm of the matrix of the quadratic form in (5.0.8)

Dzi Dl21 ] U;AU,-N


O'21
D22 =

U',AX X'AX "

X'AU,

(5.0.9)

We shall consider mainly two kinds of norms, one a simple Euclidean norm trD11Dlt + 2 tr D12Dzl + tr D2zDzz and another a weighted Euclidean norm tr Dl I W D ix W+ 2 tr D ~2KD21 W + tr D22KD2zK

(5.O.lO)

(5.o.11)

where W and K are n.n.d, matrices. The norm (5.0.11) gives different weights to ~, and fl in the quadratic form (5.0.8). We impose other restrictions on A (and indicate the MINQE so obtained by adding a symbol in brackets) such as Y ' A Y (a) is unbiased: MINQE(U) (b) is invariant for translation in t : MINQE(I) (c) satisfies both (a) and (b): MINQE(U, 1) (d) is unbiased non-negative definite: MINQE(U, NND) (e) is invariant non-negative definite: MINQE(I, NND), etc. The properties of the estimator strongly depend on the norm chosen and the restrictions imposed. We also obtain a series of IMINQE's (iterated MINQE's), by repeatedly solving the MINQE equations using the solutions at any stage as prior values for transforming the model as indicated below equation (5.0.5).

Estimation of variance components 5.I. MINQE(U,I)

19

We consider the class of invariant unbiased quadratic estimators, i.e., of the form Y ' A Y where A belongs to the class
Cfl = {A: A X = O , trA Vi = f i, i = 1..... p} (5.1.1)

where X and V/ are as defined for the general model (5.0.5). We use the following notations and assumptions

T=(V,~+ XX')>O, Va=ollVl"[-'" "st'OlpG,


PT=X(X'T-'X)-X'T where a is a prior value of 0. THEOREM 5.1.1. I f Gfl is not empty, then under the Euclidean norm (5.0.10), the M I N Q E ( U , I ) of f'O is 3= ~ Y ' A i Y , A i = T - 1 M T V ~ M ~ T -~ (5.1.2) -', MT=(I-Pr)

where X=(h 1..... ~ ) ' is any solution of [ Hul(a ) ]~k= f where Hul(a ) is the matrix (trAi Vj). PROOF. Under the conditions (5.1.1), the square of the Euclidean norm in (5.0.10) becomes HU'AU-NIi2=tr(U'AUU'AU)-2trNU'AU+trNN. But N = Y , p ~ F ~ t r N U ' A U = ~ , t ~ J ~ expression trAV~AV~=trATAT (5.1.4) (5.1.3)

so that we need minimize only the

for

A@Gfz.

(5.1.5)

It is easy to show that (5.1.5) is minimized at A = A , ~ G f i such that tr D T A , T = 0 VD

E @gl.

(5.1.6)

where no , E MT : tr E MT Vi MT = O, i = 1..... p } . "~UI _ -- ( MT

20

C RadhakrishnaRao andJiirgen Kleffe

Then (5.I.6)~trEM~TA. TM~ =O when trEMrV~M~=O,i= l ..... p which ~ TA. T= Y ~ M T V,M~. which gives the solution (5.1.3). The equation for ~ is obtained by expressing the condition of unbiasedness. Note that [HuI(a)]X= f is consistent iff Uut is not empty. Also the solution (5.1.2) is independent of N.
NOTE 1:

An alternative expression for ~ given in (5.1.3) is

q=

E Y%Y,

Ai-'-(MV, M ) + VI(MV,~M) +

(5.1.7)

where M = I - X X +. Note that Hul(a ) of (5.1.3) can be written as

Hul(a ) = (tr( i V , M ) + Vi( M V , M ) + Vj).


NOTE 2: When V~ is nonsingular, T can be replaced by V~ in Theorem 5. I. 1. Then

.~= E~i.Y,AiY,
in which case

Ai = V~, - - 1 MvoViMv t V~ --1

(5.1.8)

H v l ( a ) --- ( t r M 'VaVv ot - 1 v" M ' v - 1 v" ~ \ i Va" a j]"


NOTE 3: If Y is normally distributed, M I N Q E ( U , I ) is LMVIUE off'O at values of 0 where XOiV~ is proportional to V~ (see Theorem 4.1.1). NOTE 4: If in (5.1.4) we use the weighted Euclidean norm (5.0.11)

][U'AU-N[]2=tr(U'AU-N)W(U'AU-N)W

(5.1.9)

where W is p.d., the solution may not be independent of N. The expression (5.1.9) can be written as

t r A G A G - 2 t r A H + tr N W N W

(5.1.10)

where G - - - ( U W U ' + X X ' ) and H = UWNWU'. If G is nonsingular, then the minimum of (5.1.10) is attained at A equal to

A,=

G-~(EhiMGViM+ + MGHM+)G-t

= E Xi(MGM) + V,(MGM) + + ( M G M ) + H ( M G M ) +
(5.1.11) where ~ are determined from the equations trA, V/=f/, i = 1 ..... p.

Estimation of variance components

21

NOTE 5:

It is seen from (5.1.2) that the estimate off'O can be written in the form f ' 0 where t) is a solution of

[/-IUI(~) ] 0 = hi(

r,

Oi)

(5. I. 12)

where the ith element of hi(Y, ~) is

Y'A,Y= Y'T-'MrV~M~.T-W

(5.1.13)

and Hm(c 0 is as defined in (5.1.3). If each component of O admits invariant unbiased estimation then Hui(O 0 is non-singular and the M I N Q E ( U , I ) of 0 is 0 = [ H v , ( ~ ) ] - l h / ( Y, ~). (5.1.14)

NorE 6: The computation of M I N Q E ( U , I ) of 0 involves the use of a an aprior value of 0. If we have no prior information on 0, there are two possibilities. We may take ~ as a vector with all its elements as unity. An alternative is to choose some o~, compute (5.1.14), consider it (say t)l) as an apriori value of 0 and repeat the computation of (5.1.14). The second round value, say t)2 is an appropriate estimate of 0, which may be better than tT~ if the initial choice a is very much different from t)r We may repeat the process and obtain 03 choosing 02 as an apriori value and so on. The limiting value which satisfies the equation

[ Hu,(O)] O= h , ( r , O)

(5.1.15)

is the IMINQE( U, I), the iterated MINQE( U, I), which is the same as IMVIUE defined in (4.2.7). It is shown in Section 6 that eq. (5.1.15) is the marginal maximum likelihood (MML) equation considered by Patterson and Thompson (1975).

5.2,

MINQE(U)

We drop invariance and consider only unbiasedness, as in problems such as those mentioned by Focke and Dewess (1972) where invariant estimates do not exist. In such problems it is advisable to use an apriori value fi0 of j? and change Y to Y - X f i 0 and 13 to ( f l - r i o ) and work with the transformed model in addition to the transformation indicated in (5.0.5). For unbiased estimators Y'A Y of f'O the matrix A belongs to

~Y~ = { A: X ' A X = 0, trA Vi =f., i = 1..... p )


where X and V, are as in the general model (5.0.5).

(5.2.1)

22

C. RadhakrishnaRao andJiirgen Kleffe

THEOREM 5.2.1. Let T= V~+ XX' be p.d. If GYv is not empty then the MINQE(U) under Eue#dean norm (5.0.10) is

~= ~X,.Y'A,Y,

Ai-= T - I ( V i - PrVjP~r)T -1

(5.2.2)

where ~ = (~1..... ha)' is any solution of [ Htr( a) ]X= f where Hu(a ) is the matrix (trAi Vj).
PROOF. Under (5.0.10) we have to minimize (5.2.3)

I[U.A U. - U 112"4-2 II U~AX II2


which, using (5.2.1), reduces to

(5.2.4)

trAV~AV~+2trAV, A X X ' = t r A T A T ,

T= V~+XX'.

(5.2.5)

The expression (5.2.5) attains a minimum at A = A. iff

trDTA.T=O

VD ~QO.

(5.2.6)

Observing that D ~ Q ~ D = E - Pr, EPr and following the arguments of Theorem 5.1.1, the expression for A . is obtained as in (5.2.2). NOTE 1: We shall consider a few alternatives to the simple Euclidean norm. Focke and Dewess (1972) give different weights to the two terms in (5.2.4) as in (5.0.11). Choosing W = I and K = r2I, (5.2.5) becomes trA V~A V~ + 2r z trA V~AXX' = tr [ A ( V~ + r2XX')A ( V~ + r2XX') ]. (5.2.7) The constant r 2 determines the relative weights to be attached to fl and q~. The solution obtained by minimizing (5.2.7) is called r-MINQE(U) which is the same as (5.2.2) with T replaced by (V~ + r2XX'). NOTE 2: The iterated estimates of fl and MINQE(U) of 0 are solutions of the equations

x'vo-'x3= x'v;-~r,
[ Hu(O)lO= hu(Y,O )
(5.2.8)

Estimation of variance components where hu( Y,O )-~( Y ' A I Y ..... YIApY)',

23
(5.2.9)

Hv(O ) and A i are as defined in Theorem (5.2.1). The solution of (5.2.8) is represented by I M I N Q E ( U ) . 5.3. m-MINQE(U)

In (5.2.7) we defined r - M I N Q E ( U ) which uses a weighted Euclidean norm to provide differential weights to fl and ~ and also suggested a translation in Y using a prior value of ft. Actually we may consider a transformation which changes g--> y - Xflo , fl---~r-lK-1/2fl

where/3 0 and r2K correspond to apriori mean and dispersion of t3. Then the Euclidean norm of (5.0.10) becomes

irA( Va + r 2 X K X ' ) A ( Va + r2XKX') =


=trATAT+2(r 2- 1)trATAXKX' (5.3.1)

where T = V~ + X K X ' . Let us denote the optimal solution in such a case by A r and define A0=limA r as r---~oe. If A 0 exists, we call the corresponding estimator Y'AoY, the ce-MINQE(U). The following theorem due to Focke and Dewess (1972) establishes the existence of ~ - M I N Q E ( U ) . THEOREM 5.3.1. Let c'V be the set of linear combinations of V 1..... Vp. Then: (i) ~ - M I N Q E ( U ) exists iff ~ is not empty. (ii) A o is the unique matrix which minimizes t r A T A T in the class: G = (A: A ~ vCf and minimizes trA T A X K X ' subject to A E GYu}.

(5.3.2)
Theorem 5.3.1 characterizes m - M I N Q E ( U ) but does not provide a method of calculating it. Theorem 5.3.2 due to Kleffe (1977b) gives the necessary formula. THEOREM 5.3.2. Let Cf be not empty and (5.3.3)

B = ( t r ( M V M ) + Vi(XKX' ) + 5 )

24

C. Radhakrishna Rao and Jiirgen Kleffe

where ( X K X ' ) , = T - 1/2(T- I/2XKX' T - 1/2)+T - 1/2. The oo-MINQE(U) of f'O is Y ' A . Y where A . = ( X K X ' ) . V . ( M V . M ) + + (MV,~M) + Va(XKX ') + ( M V ~ M ) + Vb(MV,~M) +, ga = ~ ai Vi, Vb = E b, Vi (5.3.4)

and a = ( a I..... ap)' and b = ( b 1..... bp)' satisfy the equations Qb+ 2 B a = f , where Q = (tr(MV~M) + Vi(MV~M ) + Vj) = Hu,(a ). (5.3.5) Qa=O

NOTE 1: It is interesting to note that oe-MINQE(U) is the same if instead of the sequence r2K, we consider (A+rEK) for any A>~0 (see Kleffe, 1977b). NOTE 2: oe-MINQE(U) (see Kleffe, 1979). 5.4. coincides with M I N Q E ( U , I ) if it exists

MINQE without unbiasedness

Let us consider the linear model

r=xB+,,

+o,v,= vo

(5.4.1)

where Vo is p.d. for each 0 EY. Choosing a prior value a of 0, (5.4.1) can be written Y= Xfl + V2/2e. (5.4.2)

where e, = Vff 1/2e and V~ = a 1V~ + . + 0%Vp. Using the definition (5.0.6) with e, as q~, a natural estimator f ' 0 is

V: '/:

(5.4.3)

where ) k = ( ~ 1. . . . . )~p)' is chosen such that e'.Ne, is unbiased for f'O, i.e., X satisfies the equation [ H ( a ) ] X = f where H ( a ) = (tr Vi. Vj.) = (tr V~-'Vi V.-'Vj). It is seen that (5.4.3) is LMVUE of 0 at 0 = a (5.4.4) when e is normally

Estimation of variance components distributed. The MINQE of f'O is Y ' A Y where A is chosen to minimize V2/2A V 2 / 2 - N X ' A V2/2 .

25

V'./2Ax

X'Ax

(5.4.5)

In Sections 5.1-5.3 we imposed the condition of unbiasedness on Y'A Y. We withdraw this condition but consider some alternative restrictions on the symmetric matrix A as defined by the following classes. C = (A }, Gt,U = ( A : X ' A X = 0}, C~ = ( A : A X = O } . (5.4.6) (5.4.7) (5.4.8)

It is seen that when A ~ Gev, the bias in the estimator Y'A Y is independent of the location parameter fl, and is thus partially unbiased (PU). The MINQE's obtained subject to the restrictions (5.4.6)-(5.4.8) are represented by MINQE, MINQE(PU), MINQE(I) respectively. The following general results are reported in Rao (1979). THEOREM 5.4.1. Consider the model (5.0.5) and let V, = a I V 1+. + ap Vp be p.d. Further, let W = Y ~ V i where %=(3,1..... Xp)' satisfies the equation [H(a)]% = f , where H(a) = (tr V- 1V,.V~- IV j). Then under the Euclidean norm in (5.4.5), the optimal matrices A . providing MINQE's are as follows.

(i) (ii)

MINQE: A . = ( V~ + X X ' ) -1 W( V~ + X X ' ) - 1 ,

(5.4.9)

M I N Q E ( P U ) : A . = ( V~ + X X ' ) -1( W - P~ WP,)( V, + X X ' ) -', P: = X ( X ' V , X ) - X ' Vd- 1, (5.4.10)

(iii)

M I N Q E ( I ) : A . = ( M V , M ) + W( M V , M ) + = V ~ - I ( I - P ~ ) W ( I - P,,) V~-1 (5.4.11)

where M = I - X ( X ' X ) - X ' . PROOF. Under Euclidean norm, the square of (5.4.5) is tr( V~ 1/2"AV2/2 - N) 2 + 2 tr(X'A V~AX) + tr( X ' A X ) 2. (5.4.12)

26

C. Radhakrishna Rao and Jiirgen Kleffe

Without any restriction on A, the minimum of (5.4.12) is attained at A , iff tr( V2/2A, V2/2- N) V2/2BV1/2 + 2 tr(X'A, V.BX)

+tr(X'A,XX'BX)=O
for any symmetric matrix B. Then A , satisfies the equation

(5.4.13)

V2/2( T Z 2 /'~ A , , , , V2/2- N) V2/2 + X X ' A , V~ + V,A , X X ' + X X ' A , X X ' = 0


or

(Vo+XX')A,(V:+XX')= v2 ,/2Nv2 1/2=Zxiv~=w, A,=(V: + XX')-'W(V. + XX')-'


which is the matrix given in (5.4.9). If A is subject to the restriction X ' A X = O, then (5.4.13) must hold when B is replaced by B - P~BP, where P~ is defined in (5.4.10). Then arguing as above and noting that P~ V~ = V~P~, the equation for A , is

( vo + XX')A,( V~+ XX') = Y L( Vi- eo V,e~)


or

A , = ( V~ + XX') -1( W - P. WP')( V. + XX') -1


which is the matrix given in (5.4.10). If A is subject to the condition A X = O, then (5.4.13) must hold when B is replaced by MBM where M = I - P. Then A, satisfies the equation

(MV~,M)A,(MV~,M)= M W M
or

A , = (MV,~M) + W(MV~M) +
= V- '(I - P.) W ( I - P~,) V~which is the matrix given in (5.4.11). NOTE 1: MINQE in (5.4.9) and MINQE(I) in (5.4.11) are automatically non-negative when the natural estimator is non-negative while MINQE(PU) may not be. NOTE 2: The MINQE(I) of f'O given in (5.4.11) can be written as f'O where 0 is a solution of [ H(a) ! 0 = hi( Y, a) (5.4.14)

Estimation of variance components

27

where H(a) is as defined in (5.4.4) and the ith element of

h1(Y,e0 is
(5.4.15)

WV~-~(I- P.)Vi(I- P~)V~-Iy.

The eq. (5.4. ~4) is consistent. If 0 is identifiable, then H(~) is non-singular, in which case t}= [H(a)l-lhl(Y,e O. This form of the solution enables us to obtain I M I N Q E ( I ) , i.e., iterated M I N Q E ( I ) , by writing t}1 = [H(a)]-lhl(Y,e 0 and obtaining a second stage estimate 0 with a replaced by t~. The limiting solution, if the process converges, satisfies the equation

[u(0)]0=h,(r,0)

.(5.4.16)

which is shown to be the maximum likelihood equation in Section 6. The estimators (5.4.9)-(5.4.11) depend on the choice of the natural estimator (5.4.3) unlike the unbiased MINQE's considered in Sections 5.1-5.3. The condition of unbiasedness eliminated the terms which depended on the natural estimator in the norm to be minimized and provided estimators free of the choice of the natural estimator, although the concept of a natural estimator was useful in formulating the MINQE principle. In choosing the natural estimator (5.4.3) we did not consider any structure for the error term in the linear model (5.4.1). Now suppose that e= U+ where E(++')=O1FI+... + OpFp as considered in (5.0.1) and we choose the natural estimator as in (5.0.3), +*NI+* = +'*( Z

I~iF~ - 1/2FiF~1/2)+,

(5.4.17)

where +, = F~- I/2+ and ~' --- (/q ..... Pr) satisfies the equation

Itr(FiF~-'FjF~-l) ] lX=f.
In such a case the norm to be minimized is
I U~A U , - N l X'A U.

(5.4.18)

U;,AX
=

X'AX

(5.4.19)

where U, UF2/2. The expressions for the MINQE's obtained by minimizing (5.4.19) are the same as those given in (5.4.9)-(5.4.11) except that W = ~ #i V~ instead of Y.X i V,. It may be noted that X satisfies the equation [H(a)]X = f where H(a) is as defined in (5.4.4) and )t may not be equal to/~ which is a solution of (5.4.18). In some problems like the estimation of

28

C. Radhakrishna Rao and Ji~'rgenKleffe

heteroscedastic variances considered by P.S.R.S. Rao and Chaubey (1978), ~,-- #. The properties of estimators based on X and/~ need investigation. 5.5. M I N Q E ( N N D ) - - N o n - n e g a t i v e definite estimator

In the general variance components model, we admitted the possibility of some of the parameters being negative. But there are cases such as the random effects model where the variance components are non-negative and it may be desirable to have non-negative estimators for them. The estimators considered so far except some of those in Section 5.4 can assume negative values although the parametric function is non-negative. In this section we explore the possibility of obtaining unbiased quadratic estimators -~= Y ' A Y with A 1>0 of parametric functions f'O which are non-negative in 0 ~ ey for a general model. A M I N Q E in this class is denoted by MINQE(U, NND), where N N D stands for non-negative definiteness of the quadratic estimator. The following lemma characterizes the nature of the matrix A if ~ has to be unbiased and non-negative (see Pukelsheim, 1977 for proofs of various results in this section). LEMMA 5.5.1. A non-negative and unbiased quadratic estimator satisfies the invariance condition, i.e., A X = O . Y'AY

PROOF. Unbiasedness ~ X ' A X = O ~ A X = 0 since A >/0. In view of Lemma 5.5.1 we need only consider the class of matrices ~YVD=(A:A>~0, AX=O, trAV/=f/, i = 1 ..... p ) . (5.5.1)

Further, because of invariance we can work with a transformed model t= Z'Y=e, E(t) --- O, E(tC) = 0181 + ' "

+ OpB~

(5.5.2)

where Z = X (with full rank say s) and B i = Z ' V/Z, i = 1..... p. We need consider quadratic estimators -~ = t' Ct where C belongs to the class C~D = { C : C / > 0 , LEMMA 5.5.2. fe trCBi--fi}. (5.5.3)

@YUD is not empty iff convex span {q(b): b~R"} (5.5.4)

where q(b) = ( b ' M V I M b ..... b'MVpMb)'.

Estimation of variance components

29

NOTE;

In terms of the model (5.5.2), the condition (5.5.4) is

f E convex span { q ( b ) , b E R s }

(5.5.5)

where q(b) = (b'Blb ..... b'Bpb). The conditions (5.5.4) and (5.5.5) are rather complicated, but simple results can be obtained if we assume V1..... Vp to be n.n.d. THEOREM 5.5.1. Let V//> 0 , i = 1..... p, V=Y, V,. and V(O= V - Vi and B i be as defined in (5.5.2). There exists an n.n.d, quadratic unbiased estimator of Oj

iff S (Bj) z s (MV+M) S (MV<+>M)

( MV<+>M) S ( MVM)c R( MV<+>M) <R( MVM ),


(5.5.6)

where R ( . ) denotes the rank of a matrix.


NOTE l: The condition (5.5.6) can also be expressed as ( I - G) V+(Z- G ) ~ 0 (5.5.7)

where G is the projection operator onto the space generated by the columns of the compound matrix (g: VI:... : 5-1; 5+1:"" : Vp). (5.5.8)

NOTE 2: If S ( V 1 ) 3 S ( M ) , then $ ( M V I M ) D S ( M V , M ) for all i, in which case, application of Theorem 5.5.1 shows that at most 01 is nonnegatively estimable. If $ ( V O 3 $ ( M ) and S ( V 2 ) 3 $ ( M ) , then none of the single components are non-negatively estimable.
NOTE 3:

(LaMotte, 1973.) If V(j9> 0, then 8j is not non-negatively estimable. Further, if V/> 0, then 8;, i:Pj is not non-negatively estimable. However, let us assume that G/up is not empty for a given f and estimate f'O by the MINQE principle. For this purpose we have to minimize
NOTE 4:

[IAII2=trAV~AV~

forA EEYUD.

(5.5.9)

This appears to be a difficult problem in the general case. Of course, if M I N Q E ( U , 1 ) turns out to be a non-negative estimator in any given situation it is automatically MINQE(U, N N D ) . It may also be noted that if

30

C. Radhakrishna Rao and Jiirgen Kleffe

sp(MV1M ..... MVpM} is a quadratic subspace with respect to (MVM) +, then the MINQE(U,I) off'O is n.n.d, iff CfD is not empty.
Since C:vn is a convex set, we proceed as follows to solve the problem (5.5.9). The minimum is attained at A , iff

trBV~AV~>~trA,V~A,V~

VB EGfz~

(5.5.10)

or writing B = A , + D, the condition (5.5.t0) becomes

trDV~A,V~>O V D ~ @ , = ( D : D X = O , A , + O ~>O,trDV/--O,i--- 1..... p}.

(5.5.11) (5.5.12)

A general solution for (5.5.11) cannot be explicitly written down, but the formula will be useful in examining whether any guessed solution for A , provides a MINQE(U, NND). We shall consider some special cases. THEOREM 5.5.2. Let V//> 0, i = 1..... p, and Oj be estimable, i.e., the condition (5.5.9) is satisfied. Then the MINQE(U, NND) of Oj is

4=

R(Aj)

Y'AjY, Aj=[(I-G)Vj(I-G)] +

(5.5.13)

where G is the projection operator onto the space generated by the columns of
( X , V 1. . . . . Vj_I, Vj+ 1. . . . , Vp).

An alternative approach to the problem (5.5.9) based on standard methods of convex programming is provided by Pukelsheim (1977, 1978a, b). We define the functional

g( B )= man ([]A][2-(A,B ) )
A ~ 6:vl

(5.5.14)

where ~ f i is the class defined in (5.1.1), ]IA[I2=trAV~AV~ and (A,B)--trAV, BV~ with V, >0, and call the problem

sup g(B)
B>0

(5.5.15)

as the dual optimization problem.

Estimation of variance components

31

LEMMA 5.5.3.

Let A , e ~ f o and B , ~ 0 be such that


(5.5.16)

![A,H2= g( B,). Then:


(i) A . and B . are optimal solutions of (5.5.9) and (5.5.15). (ii) ( A . , B . ) =0. NOTE:

(5.5.17)

g(B) is bounded above since HA,[]2>g(B)


for all B /> 0. (5.5.18)

For obtaining a satisfactory solution to the problem (5~5.9) we need an explicit expression for g(B). We obtain this in terms of A where Y ' A Y is the M I N Q E ( U , I ) of f'O. Let us note that any matrix B ( - - B ' ) can be decomposed in terms of symmetric matrices

B=B+(B-B

such that B E C x and ( B ~ B - B ) = O . The matrix B is simply the projection of B onto the subspace Gi in the space of symmetric matrices with inner product ( . , . ~ as defined in (5.5.15). We note that by construction, A is such that (A,B ) = 0 THEOREM 5.5.3. empty. Then: for any given B. (5.5.19)

Let Y ' A Y be the MINQE(U,I) of f'O and GfD be not

(i)

g(B) --t1~112-

(A,B)

- 4

IIBll 2,

(5.5.20)

(ii) B, >-0 is optimal [i.e., maximizes g(B)] /ff

3+ 2 ~B >,0 0, ~ (iii)
=0.

(A+21 BO,B,)=0,

(5.5.21) (5.5.22)

1 A, = A + ~ B o,

is a solution to (5.5.9), i.e.,provides MINQE( U, NND ) of f'O and ( A . , B . )

32

C. Radhakrishna Rao and Jilrgen Kleffe

The results of Theorem 5.5.3 do not provide a computational technique for obtaining A*. Puketsheim (1978a, b,c) proposed an iterative scheme which seems to work well in many problems.

6.

Maximum likelihood

estimation

6.1.

The general model

We consider the general G M model

Y = X f l + e, E(eg)=O~V~+... + 0 p V , = V0

(6.1.1)

and discuss the maximum likelihood estimation of 0 under the assumption

Y~(Xfl,

Vo), fl ~Rm, O ~ .

(6.1.2)

We assume that Vo is p.d. for V0 E of. Harville (1977) has given a review of the ML estimation of 0 describing the contributions made by Anderson (1973), Hartley and Rao (1967), Henderson (1977), Patterson and Thompson (1975), Miller (1977, 1979) and others. We discuss these methods and make some additional comments. The log likelihood of the unknown parameters (fl,O) is proportional to

l( fl, O, Y ) = - logl Vol - ( Y - Xfl )' V o ' ( Y - Xfl ).


The proper ML estimator of (fl, O) is a value (/~,/~) such that l(/~,t~,r)= sup l(fl, o , r ) . 3,o~

(6.1.3)

(6.1.4)

Such an estimator does not exist in the important case considered by Focke and Dewess (1972). In the simple version of their problem there are two random variables

Yl=~+el, E(e 2) = 0 2,

Y2=/~+ e2,

E(e 2) = 02,

E(e,e2) = 0.

(6.1.5)

The likelihood based on Yz and Y2 is log 02 ( r, (r2-

log

o I -

2o/2

2o22

(6.1.6)

Estimation of variance components

33

which can be made arbitrarily large by choosing/~= YI and letting al---~0, so that no proper M L E exists. The M L equations obtained by equating the derivatives of (6.1.6) to zero are 02

= ( YI -/~)2,

a2 = ( Y2-/~)2,

~t( ~1 + la2 ]] = al 2Y---L + o---~Y2


(6.1.7)

which imply ol = cr2- Thus the ML approach fails to provide acceptable estimators. However, in the example (6.1.5), all the parameters are identifiable and M I N Q E ( U ) of a 2 and o 2 exist. A similar problem arises in estimating 02 and o 2 in the model Y= Xfi + Xy + e where E ( y y ' ) = O2Im, E(ee') = a~In and E(e)/) = O. It is well-known that M L estimators of variance components are heavily biased in general and in some situations considered by N e y m a n and Scott (1948), they are not even consistent. In such cases, the use of M L estimators for drawing inferences on individual parameters may lead to gross errors, unless the exact distribution of the M L estimators is known. These drawbacks and the computational difficulties involved in obtaining the M L estimators place some limitations on the use of the M L method in practical problems.

6.2.

Maximum likelihood equations

For 0 ~ oy such that Vo > 0 (i.e., p.d.), the likelihood of (fl, 0) is

/( fl, O, Y) = - l o g [ 11ol-( Y - Xfl )' Vo- I( y - x f l ).

(6.2.1)

Taking derivatives of (6.2.1) w.r.t, to fl and 0; and equating them to zero we get the ML equations

X ' Vo- IXt~ ~- X" Vo- I Y, trVo-W~=(Y-Xfl)'Vo-W~vo-l(y-xfl),


i = 1 ..... p.

(6.2.2)

(6.2.3) Substituting for fl in (6.2.3) from (6.2.2), the equations become

X f l = PoY,

P o = X ( X ' V o - I X ) - X ' V o -',

(6.2.4) (6.2.5)

[ H ( 0 ) ] 0 = h,( Y, 0)

where H ( 0 ) = ( t r Vo- W i Vo-IVj) is the matrix defined in (5.4.4) and the ith

34

C. Radhakrishna Rao and JiJrgenKleffe

element of hj(Y,O) is

r ' ( x - P0)' vo

vi vo-1(I- Po)r

(6.2.6)

which is the same as the expression defined in (5.4.15). We make a few comments on the eqs. (6.2.4) and (6.2.5). (i) The ML equation (6.2.5) is the same as that for I M I N Q E ( I ) given in (5.4.15). (ii) The original likelihood eq. (6.2.3) is unbiased while the eq. (6.2.5) which provides a direct estimate of 0 is not so in the sense

E[ hI(Y,O) ]:=#[ H(O) ]O.

(6.2.7)

An alternative to the eq. (6.2.5) is the one obtained by equating hi( Y, 0) to its expectation

h,(r,0) = e [ h , ( r , 0 ) ] = [ Hu,(0)l 0

(6.2.8)

which is the marginal M L (MML) equation suggested by Patterson and Thompson (1975). (iii) There may be no solution to (6.2.5) in tile admissible set ~ to which 0 belongs. This may happen when the supremum of the likelihood is attained at a boundary point of oy. (iv) It is interesting to note that the M L estimate of 0 is invariant for translation of Y by Xa for any a, i.e., the M L E is a function of the maximal invariant B ' Y of Y where B = X . Suppose 0 in the model (6.1.1) is identifiable on the basis of distribution of Y in the sense:

+opG=o;v,+... +o;G

oi-o;=o foran/,

i.e., V~ are linearly independent (see Bunke and Bunke, 1974). But it may happen, as in the ease of the example of Focke and Dewess (1972), that 0 is no longer identifiable when we consider only the distribution of B ' Y, the maximal invariant of Y. Such a situation arises when B' ViB are linearly dependent while V~ are not. In such cases the M L method is not applicable while M I N Q E ( U ) developed in Section 5.2 can be used. Thus, the invariance property of M L E limits the scope of application of the ML method. (v) Computational algorithms: The eq. (6.2.5) for the estimation of 0 is, in general, very complicated and no closed form solution is possible. One has to adopt iterative procedures. Harville (1977) has reviewed some of the existing methods.

Estimationof variancecomponents

35

(a) If 0k is the kth approximation to the solution of (6.2.5), then the (k + 1)th approximation is

0 k + l = [ H(Ok)l -1 h l(Y, Ok)


^

(6.2.9)

as suggested for IMINQE(I), provided 0 is identifiable. Otherwise, the H matrix in (6.2.5) is not invertible. Iterative procedure of the type (6.2.9) is mentioned by Anderson (1973), Harville (1977), LaMotte (1973) and Rao (1972) in different contexts. However, it is not known whether the procedure (6.2.9) converges and provides a solution at which supremum of the likelihood is attained. (b) Hartley and Rao (1967), Henderson (1977) and Harville (1977) proposed algorithms suitable for the special case when one of the V~ is an identity matrix (or at least non-singular). An extension of their method for the general case is to obtain the (k + l)th approximation of the ith component of 0 as

Pok) ' Vd~ ' Vi Vd~'(I-- eak)r, tr Vd~ ' V i

i=1 ..... p.
(6.2.10)

In the special case when V~ are non-negative definite and the initial 0i are chosen as non-negative, the successive approximations of 0i using the algorithm (6.2.10)stay non-negative. This may be a "good property" of the algorithm, but it is not clear what happens when the likelihood eq. (6.2.5) does not have a solution in the admissible region. (c) Hemmerle and Hartley (1973) and Goodnight and Hemmerle (1978) developed the method of W transformation for solving the ML equations. Miller (1979) has given a different approach. Possibilities of using the variable-metric algorithms of Davidson-Fletcher-Powell described by Powell (1970) are mentioned by Harville (1977). As it stands, further research is necessary for finding a satisfactory method of solving the eq. (6.2.5) and ensuring that the solution provides a maximum of the likelihood.

6. 3.

Marginal maximum likelihood equation

As observed earlier the ML eq. (6.2.5) is not unbiased, in the sense

E[ h,( Y,O] 4= [ H(O ) ]O.

(6.3.1)

36

C. Radhakrishna Rao and Jiirgen Kleffe

If we replace the eq. (6.2.5) by

h,(g,O)=E[h~(r,o)] = [ Hui(O)] 0,
(6.3.2)

we obtain the I M I N Q E ( U, I ) defined in (5.1.14), which is the same as I M V I U E defined in (4.2.7). The eq. (6.3.2) is obtained by Patterson and T h o m p s o n (1975) by maximizing the likelihood of 0 based on T ' Y , where T is any choice of X , which is the maximal invariant of Y. N o w

l(O, T ' Y) = - I o N T'

VoT!- Y ' T ( T ' V o T ) - ' T ' Y.

(6.3.3)

Differentiating (6.3.3) w.r.t. 0i we obtain the M M L (marginal M L ) equao tion

tr(T(T'VoT)-IT'Vi)=

Y'T(T'VoT)-'T'ViT(T'VoT)-IT'y,
(6.3.4)

i = 1. . . . . p. Using the identity (C.R. Rao, 1973, p.77)

T(T'VoT)-'T'-= Vu'-

Vo-IX(X'VolX)XtVo 1
(6.3.5)

_~ V0-1(i_ Po) eq. (6.3.4) becomes

tr( Vo- ~( I - Po ) Vi ) = Y' V o ' ( I - Po ) Vi( I - P[O V o l Y,

i = 1 ..... p (6.3.6)

which is independent of the choice of T = X used in the construction of the maximal invariant of Y. It is easy to see that (6.3.6) can be written as

[ Hul(O) ]O= h,( Y,O)

(6.3.7)

which is eq. (6.3.2). (i) Both M L and M M L estimates depend on the maximal invariant T ' Y of Y. Both the methods are not applicable when 0 is not identifiable on the basis of T' Y. (ii) The bias in M M L E m a y not be as heavy as in M L E and M M L E may be more useful as a point estimator. (iii) The solution of (6.3.7) m a y not lie in the admissible set of 0 as in the case of the M L equation.

Estimation of variance components

37

(iv) If O~ is the kth approximation, then the (k + 1)th approximation can be obtained as (6.3.8) It is not known whether the process converges and yields a solution which maximizes the marginal likelihood. (v) Another algorithm for M M L E similar to (6.2.9) is to compute the (k + 1)th approximation to the ith component of 0 as

Oi,k + l'=4,k Y'( I--

Iviv

I(I- Pk) V

(6.3.9)

tr Vd - '(I-- Pd )
It is seen that both M L and M M L estimators can be obtained as iterated MINQE's, MLE being I M I N Q E ( I ) defined in (5.4.16) and M M L E being I M I N Q E ( U, I) defined in (5.1.14). There are other iterated MINQE's which can be used in cases where ML and M M L methods are not applicable. It has been remarked by various authors that M I N Q E involves heavy computations, requiring the inversion of large matrices. This argument is put forward against the use of MINQE. These authors overlook the fact that inversion of large matrices depend on the inversion of smaller order matrices in special cases. For instance, if Vo is of the form ( I + UDU'), then it is well-known that Vo-l=I -U(U'U+D 1)-lu' (6.3.10)

which can be used to compute Vo-1 if the matrix ( U ' U + D -1) is comparatively of a smaller order than Vo. It may be noted that the computational complexity is of the same order for M I N Q E and MLE, MMLE.
References
Ahrens, H. (1978). MINQUE and ANOVA estimator for one way classification--a risk comparison. Biometrical J. 20, 535-556. Ahrens, H., Kleffe, J. and Tensler, R. (1979). Mean squared error comparisons for MINQUE, ANOVA and two alternative estimators under the balanced one way random model. Tech. Rep. P-19/79, Akademie der Wissenschaften der DDR. Anderson, R. L. (1975). Designs and estimators for variance components. In: J. N. Srivastava, ed., A Survey of Statistical Design and Linear Models. pp. 1-29. Anderson, R. L. and Crump. P. P. (1967). Comparisons of designs and estimation procedures for estimating parameters in a two stage nested process. Technometrics 9, 499-516.

38

C Radhakrishna Rao and JiJrgen Kleffe

Anderson, T. W. (1973). Asymptotically efficient estimation of covariance matrices with linear structure. Ann. Statist. 1. 135-141. Brownlee, K. A. (1953). Industrial Experimentation. Chemical Publishing Co. Bunke, H. and Bunke, O. (1974). Identifiability and estimability. Math. Operationsforsch. Statist. 5, 223-233. Cochran, W. G. (1939). The use of the analysis of variance in enumeration by sampling. J. Am. Statist. Assoc. 34, 492-510. Fairfield Smith, H. (1936). A discriminant function for plant selection. Ann. Eugenics (London) 7, 240-260. Fisk, P. R. (1967). Models of the second kind in regression analysis. J. Roy. Statist. Soc. B 29, 235-244. Focke, J. and Dewess, G. (1972). Uber die Sch~/tzmethode MINQUE yon C. IL Rao and ihre Verallgemeinerung. Math. Operationforsch. Statist. 3, 129-143. Fuller, W. A. and Rao, J. N. K. (1978). Estimation for a linear regression model with unknown diagonal covariance matrix. Ann. Statist. 6, 1149-1158. Goodnight, J. H. and Hemmerle, W. J. (1978). A simplified algorithm for the W-transforma-tion in variance component estimation. SAS Tech. Rept. R-104, Raleigh, NC. Hartley, H. O. and Rao, J. N. K. (1967). Maximum likelihood estimation for the mixed analysis of variance model. Biometrika 54, 93-108. Harville, D. A. (1977) Maximum likelihood approaches to variance component estimation and to related problems. J. Am. Statist. Assoc. 72, 320-340. Hemmerle, W. J. and Hartley, H. O. (1973). Computing maximum likelihood estimates for the mixed AOV model using the W-transformation. Technometrics 15, 819-831. Henderson, C. R. (1950). Estimation of genetic parameters (Abstract). Ann. Math. Statist. 21, 309-310. Henderson, C. R. (1953)o Estimation of variance and covariance components. Biometrics 9, 226-252. Henderson, C. R. (1977). Prediction of future records. In: Proc. Int. Conf. on Quantitative Genetics. pp. 616-638. Hildreth, C. and Houck, J. P. (1968). Some estimators for a linear model with random coefficients. J. Am. Statist. Assoc. 63, 584-595. Infante, A. (1978). Die MINQUE--Schatzung bei Verlaufskurvemmodellen mat zufalligen regressionskoeffizienten. Thesis, Dortmund (FRG). Kleffe, J. (1975). Quadratische Bayes-Sch;itzungen f/Jr Lineare Parameter: der Kovarianzmatrix im Gemischten Linearen Modellen. Dissertation, Humboldt Univ., Berlin. Kleffe, J. (1976). Best qnadratic unbiased estimators for variance components in mixed linear models. Sankhya B 3~1, 179-186. Kleffe, J. (1977a). Invmiant methods for estimating variance components in mixed linear models. Math. Operationforsch. Statist. 8, 233-250. Kleffe, J. (1977b). A note on oo-MINQUE in variance covariance components models. Math. Operationforsch. Statist. 8, 337-343. Kleffe, J. (1978). Simultaneous estimation of expectation and covariance matrix in linear models. Math. Oper. Statist. Ser. Statist. 9, 443-478. Kleffe, J. (1979). C. R. Rao's MINQUE for replicated and multivariate observations. Tech. Rept. Zimm der AdW der DDR, Berlin. Kleffe, J. (1980). C. R. Rao's MINQUE under four two way ANOVA models. Biometrical J. 21, in press. Kleffe, J. and Pincus, R. (1974a). Bayes and best quadratic unbiased estimators for parameters of the covariance matrix in a normal linear model. Math. Operationsforsch. Statist. 5, 47-67.

Estimation of variance components

39

Kleffe, J. and Pincus, R. (1974b). Bayes and best quadratic unbiased estimators for variance components and heteroscedastic variances in linear models. Math. Operationsforsch. Statist. 5, 147-159. Kleffe, J. and Z~tlner, I. (1978). On quadratic estimation of heteroscedastic variances. Math. Oper. Statist. Set. Statist. 9, 27-44. Krishnaiah, P. R. and Lee, Jack C. (1974). On covariance structures. Sankhya 38A, 357-371. LaMotte, L. R. (1973). Quadratic estimation of variance components. Biometrics 29, 311-330. Miller, J. J. (1977). Asymptotic properties of maximum likelihood estimates in the mixed model of analysis of variance. Ann. Statist. 5, 746-762. Miller, J. J. (1979). Maximum likelihood estimation of variance components--a Monte Carlo Study. J. Statist. Comp. and Simulation 8, 175-190. Neyman, J. and Scott, E. (1948). Consistent estimators based on partially consistent observations. Econometrica 16, 1-32. Olsen, A., Seely, J. and Birkes, D. (1976). Invariant quadratic unbiased estimation for two variance components. Ann. Statist. 4, 878-890. Panse, V. G. (1946). An application of discriminant function for selection in poultry. J. Genetics (London) 47, 242-253. Patterson, H. D. and Thompson, R. (1975). Maximum likelihood estimation of components of variance. In: Proc~ of 8th International Biometric Conference. pp. 197-207. Pincus, R. (1974). Estimability of parameters of the covariance matrix and variance components. Math. Oper. Statist. 5, 245-248. Powell, M. J. D. (1970). A survey of numerical methods for unconstrained optimization. S I A M Rev. 12, 79-97. Pukelsheim, F. (1977). Linear models and convex programs: Unbiased non-negative estimation in variance component models. Tech. Rep. 104, Stanford University. Pukelsheim, F. (1978a). Examples for unbiased non-negative estimation in variance component models. Tecli. Rep. 113, Stanford University. Pukelsheim, F. (1978b). On the geometry of unbiased non-negative definite quadratic estimation in variance component models. In: Proc. Vl-th International Conference on Math. Statist., Poland Pukelsheim, F. (1978c). On the existence of unbiased non-negative estimates of variance components. Tech. Rep. Inst. Math. Stat., Univ. of Freiburg. Rao, C. R. (1947). General methods of analysis for incomplete block designs. J. Am. Statist. Assoc. 42, 541-561. Rao, C. R. (1953). Discriminant function for genetic differentiation and selection. Sankhya 12, 229-246. Rao, C. R. (1956). On the recovery of interblock information in varietal trials. Sankhya 17, 105-114. Rao, C. R. (1965). The theory of least squares when the parameters are stochastic and its application to the analysis of growth curves. Biometrics 52, 447-458. Rao, C. R. (1967). Least squares theory using an estimated dispersion matrix and its application to measurement of signals. In: Proc. Fifth Berkeley Symposium, Vol. 1. pp. 355-372. Rao, C. R. (1970). Estimation of heteroscedastic variances in linear models. J. Am. Statist. Assoc. 65, 161-172. Rao, C. R. (1971a). Estimation of variance and covariance components. J. Multivariate Anal. 1, 257-275. Rao, C. R. (1971b). Minimum variance quadratic unbiased estimation of variance components. J. Multivariate Anal. 1, 445-456. Rao, C. R. (1972). Estimation of variance and covariance components in linear models. J. Am. Statist. Assoc. 67, 112-115.

40

C. Radhakrishna R a t and Ji~rgen Kleffe

Rat, C. R. (1973). Linear Statistical Inference and Its Applications. Second Edition. John Wiley, New York. Rat, C. R. (1979). Estimation of variance components---MINQE theory and its relation to ML and MML estimation. Sankhya (in press). Rat, C. R. and Mitra, S. K. (1972). Generalized Inverse of Matrices and Its Applications. Johil Wiley, New York. Rat, J. N. K. (1973). On the estimation of heteroscedastic variances, Biometrics 29, 11-24. Rat, J. N. K. and Subrahmaniam, K. (1971). Combining independent estimators and estimation in linear regression with unequal variances. Biometrics 27, 971-990. Rat, P. S. R. S. and Chaubey, Y. P. (1978). Three modifications of the principle of the MINQUE. Commn. Statist. Math. A7, 767-778. Rat, P. S. R. S. (1977). Theory of the M I N Q U E - - A review. Sankhya B, 201-210. Rat, P. S. R. S., Kaplan, J. and Cochran, W. G. (1979). Estimators for the one-way random effects model with unequal error variances. Teeh. Rep. Searle, S. R. (1968). Another look at Henderson's method of estimating variance components. Biometrics 24, 749-788. Searle, S. R. (1971). Topics in variance component estimation. Biometrics 27, 1-76. Seely, J. (1970). Linear spaces and unbiased estimation--application to mixed linear model. Ann. Math. Statist. 42, 710-721. Seely, J. (1975). An example of inadmissible analysis of variance estimator for a variance component. Biometrika 62, 689-690. Sinha, B. K. and Wieand, H. S. (1977). MINQUE's of variance and covariance components of certain covariance structures. Indian Statistical Institute. Tech. Rep. 28/77. Spjotvoll, E. (1977). Random coefficients regression models. A review. Math Oper. Statist., Ser. Statist. 8, 69-93. Swallow, W. H. and Searle, S. R. (1978). Minimum variance quadratic unbiased estimation of variance components. Technometrics 20, 265-272. Swamy, P. A. B. (1971). Statistical Inference in Random Coefficients-Regression Models. Springer-Verlag, Berlin. Yates, F. (1940). The recovery of inter-block information in balanced incomplete block designs. Ann. Eugenics (London). 10, 317-325. Yates, F. and Zacopancy, I. (1935). The estimation of the efficiency of sampling with special reference to sampling for yield in cereal experiments. J. Agric. Sci. 25, 545-577.

P. R. Krishnaiah, ed., Handbook of Statistics, Vol. 1 North-Holland Publishing Company (1980) 41-87

'~)

Multivariate Analysis of Variance of Repeated Measurements


Nell H. 7~mm

1.

Introduction

The analysis of variance of multiple observations on subjects or units over several treatment conditions or periods of time is commonly referred to in the statistical and behavioral science literature as the repeated measures situation or repeated measures analysis. Standard textbook discussions of repeated measurement designs employing mixed-model univariate analysis of variance procedures are included in Cox (1958), Federer (1955), Finny (1960), John (1971), Kempthorne (1952), Kirk (1968), Lindquist (1953), Myers (1966), Quenouille (1953) and Winer (t971), to name a few. Recently, Federer and Balaam (1972) published an extensive bibliography of repeated measurement designs and their analysis through 1967 and Hedayat and Afsarinejad (1975) discussed the construction of many of the designs. Coverage of the analysis of variance of repeated measures designs by the above authors has been limited to standard situations employing univariate techniques. The analysis of repeated measurements are discussed from a multivariate analysis of vario ance point of view in this chapter. 2. The general linear model

The generalization of the analysis of variance procedure to analyze repeated measurement designs utilizing the multivariate analysis of variance approach employs the multivariate general linear model and the testing of linear hypotheses usingp-dimensional vector observations. From a multivariate point of view, n independent p-dimensional repeated measurements are regarded as p-variate normal variates Yg, i--1,2 ..... n, with a common unknown variance-covariance matrix Y. and expectations E(Yi) = xillfl I + xi2~2 - q - . . .
"~ Xiq~q ,

i = 1 , 2 , . . . , n,

(2.1)

41

42

Nell H. Timm

where the xij's are known constants and t h e / ] f s are unknown p - c o m p o nent parameter vectors. Letting t h e p q matrix B ' = (/31/]2. /]p), t h e p n matrix Y ' = ( Y I Y 2 . . . Yn) and the n q matrix X=[xifl, expression (2.1) is written as
E(Y) = XB.

(2.2)

Since each row vector Yi of Y is sampled from a p-variate normal population with variance-covariance matrix Z , w e m a y write the variance of the matrix Y as V(Y) = I n Z (2.3)

where the symbol @ represents the direct or Kronecker product of two matrices. The combination of the formulas (2.2) and (2.3) are referred to as the multivariate G a u s s - M a r k o f f setup. To estimate the unknown parameter vectors in the matrix B, the normal equations
X'XB = X'Y

(2.4)

are solved. Letting /~ be a solution to the normal equations, the least squares estimator of an estimable parametric vector function
i~t=ctB = Cl/]l--~ c 2 / ] 2 - ] - -q- Cq/]q,

(2.5)

for known ci, is


I~=Ct/~ = C I ~ 1 "+" C 2 ~ 2 " ~ " " " "~

Cq~q.

(2.6)

To estimate the unknown elements % of the matrix E, the sum of squares and cross products (SSP) matrix due to error is computed. This matrix is obtained by evaluating
Se
=

Y' Y-

Y'XB

(2.7)

where/~ is any solution to the normal equations. Letting the rank of the design matrix be r ~<q, the degrees of freedom due to error is n - r = re, and ( 1 / V e ) S e results in an unbiased estimator of Z. To test the hypothesis H o that ~p=e'B has a specified value ~k0, we proceed by using Hotelling's generalized T 2 statistic (see Hotelling, 1931 and Bowker, 1960) and T h e o r e m 2.1 (Rao, 1973, p. 541). THEOREM 2.1. Let S have a central Wishart distribution with k degrees of freedom, represented by S ~ Wp(k,Y), and let d be normally distributed with

Multivariate analysis of variance of repeatedmeasurements

43

mean ~ and variance-covariance matrix c - I N with constant c greater than zero, represented by d ~ N p ( & c - l y . ) , such that S and d are independent. Hotelling's generalized T 2 statistic is defined by
T 2=

and

ckd'S - ld, - p + l,er2),

( k = p + l ) T2 P -~- ~ r ( p , k

which is a noncentral F distribution with noncentrality parameter Cq"2mc6'Z- ~& Since S e has a central Wishart distribution, Se~Wp(Pe, Z), and ~ NpOp, c ' ( X ' X ) - e E ), independent of Se, where ( X ' X ) - is a generalized inverse of X ' X , and e ' ( X ' X ) - c > 0 if ~p is estimable,

( l)e--pq-1 ) (l~--l~o)tge-l(~--lgbO) =F
p e'(X'X) -c

(2.8)

has a noncentral F distribution and the null hypothesis Ho: ~ = ~ o is rejected at the significance level a if F > F"(p, ee - p + 1). Alternatively, since
1+
r 2 "e ISel Iae + ahl '

where S h = ( e ' ( X ' X ) - e ) - ' ( ~ - ~Po)(ff- ~Po)'~ Wp(1,,3_2,) under the null hypothesis, the ratio
[S~[ . ~ B ( v e - p + l p ) B = ]Se~ ah I 2 '2 (2.9)

has a central beta distribution when H o is true so that rejecting for large values of F is equivalent to rejecting H o for small values of B.
To test the hypothesis H o that vh independent estimable functions have a specified value F, the null hypothesis H 0 is written as

H o: CBA = r

(2.10)

where the Ph q matrix C is of rank vh <r and A is a n y p u matrix of rank u < p < n - r . Following the test of H o that ~=q~o, we compute the matrix Sh known as the SSP matrix ("sum of squares+products") due to the hypothesis using the formula

S h = (CJBA - r ) ' ( C ( X ' X ) - C') - I(CBA - F).

(2.11)

44
Furthermore,

Neil H. Timm

S~=A'Y'[I--X(X'X)-X']YA,

(2.12)

and S h are independently distributed; Se,--Wu(n--r,A'EA ) and S h ~ Wu(vh,A'EA, .). Departure from the null hypothesis may be detected by comparing the matrices S e and S h. Having computed the matrices S e and S h for the null hypothesis (2.10), several procedures have been recommended for testing that the hypothesis H o is true. All of the procedures proposed are dependent on the roots of one of the following determinantal equations.

(a) (b) (c)

Is~ -- XSel =0, ISe-- P(Se+ SOl =0,

(2.13)

IS~-O(S~ + Se)I=O

with roots ordered from largest to smallest for i = 1 , 2 ..... s=min(vh, U). Wilks (1932) proposed testing H 0 using (2.14)
i=1

A-iSe+Sh i

i=1

v,

i=1

and to reject H 0 if A<U~(U, Vh,Ve). Lawley (1938) and Hotelling (1951) suggested the statistic
s i=1

i=1

\ -~i

Pei~=l -l~--~)~i

(2.15)

and to reject H 0 if T~ is greater than some constant k* to attain a predetermined level of significance. The symbol Tr denotes the trace of a matrix. Roy (1957) recommended the largest root statistic

O1= l + ) t I - 1 - v I

(2.16)

and to reject the hypothesis if ~)l > ~ a (s,m,n) where m = ( ] v h - u I - 1 ) / 2 , n = ( v e - u - 1 ) / 2 and s=min(v~,u). Pillai (1960) approximated the distribution of the following trace criterion proposed by Bartlett (1939) and

Multivariate analysis" of variance of repeated measurements

45

Nanda (1950):

v='rr[ sh(s~ + se) '] = , Z " e, =


"=

k
i=1

-1 - +a, -= Xi

i
i~l

(1-~-v,) (2.17)

and to reject the hypothesis if V > V '~ (s,m,n). Tables for each of the criteria are collected in Timm (1975). For a review of the literature on the distribution of A, T0 z, O, and V, the reader is referred to Krishnaiah (1978). In general no one multivariate criterion is uniformly best; we have selected to use WiNs' A-criterion to illustrate the analysis of repeated measurement designs from a multivariate analysis of variance point of view. When s = 1, all criteria are equivalent. Several alternative criteria have been proposed by authors to test the null hypothesis represented in (2.10). Of particular importance is the step-down procedure proposed by J. Roy (1958) and the finite intersection tests developed by Krishnaiah (1965). In addition, tests based on the ratio of roots are discussed in the paper by Krishnaiah and Waikar (1971). Following the test of a multivariate hypothesis of the form H0: CBA = F, simultaneous confidence intervals for the parametric estimable functions ~p= c'Ba, for vectors c in the row space of C and arbitrary vectors a, may be obtained for each of the multivariate test criteria. Evaluating the expression
tp--C 0 a t --Se a c t ( X t X ) - c Pe

(()

1 <~<~'arc 0 a'

act(X'X)-c

(2.18) 1 0 0 ( 1 - a ) % simultaneous confidence intervals for all ~p=e'Ba may be constructed where c0 is selected to maintain a ( 1 - a) confidence set. The critical constant co for each multivariate criterion has the following values (Gabriel, 1968): Wilks:

c~= ve( ~1- U'~ ),


Lawley-Hotelling:

(2.19a)

c~ = ve U~ = To2..,
Roy:

(2.19b)

On 4 = ~e( 1_--: ~ ),

(2.19c)

46

Neil H. Timm

Bartlett-Nanda-Pillai: V~ Co2= U e ( ~ ) . (2.196)

The critical values U ~, ", U~, and V ~ correspond to those procured in testing the multivariate hypothesis Ho: CBA = F at the significance level a.

3. One-sample repeated measurement design Suppose a r a n d o m sample of n subjects are measured (in the same metric scale with the same origin and unit) at p treatment levels so that the general organization of the data m a y be represented as in Table 3.1. The data in Table 3.1 m a y be analyzed as a special application of the multivariate general linear model. The p repeated measures for the ith subject is regarded as a p - v a r i a t e vector observation

Yi=l~+ei,

i = 1 , 2 ..... n,

(3.1)

where 1~ is a p l vector of treatment means and ei is a p l vector of r a n d o m errors. Furthermore, we assume that ei~INp(O,X) so that E(Yi)=~. The usual hypothesis of interest for the design is that the treatment means/x~,~t 2..... Pp, the elements of the vector ~, are equal: Ho: gJ =/~2 . . . . . Pp.

(3.2)

Representing H 0 as CBA = F , following form:


(1 x 1)

the matrices C, B, A and F take the

=[1],

(1 Xp)

= [ . , , . 2 ..... pp],

px(p-l)

=(':::/, \ --1']

l (p-l)

i0j,

where 1 denotes a vector of unities. With the np matrix Y=[Yifl and the n l design matrix X = I , expressions for S e and S h are readily obtained using (2.11) and (2.12) with = ( X ' X ) - 1X' Y. If H 0 is true,

A=

IS~ + S~I ~ C ' ( p -- 1, 1,~ -- l)

ISel

Multivariate analysis of variance of repeated measurements


Table 3.1 Data for a one-group repeated m e a s u r e m e n t design Subjects T1 SI $2 Yll Y21 T2 YI2 Y22 Treatments T~
Y lp

47

Y2p

Sn

Ynl

Yn2

Ynp

and H 0 is rejected if A < U ~ ( p - - 1 , 1 , n - l ) m i n ( 1 , p - 1)= 1, the hypothesis is rejected if n-p+l

or since

s=min(vh,U)=

or equivalently if

( n-p+ l ) T2
pZ1 since when s = 1, (n-l)

>F~(p-I'n-p+I)'
A)/A.

T2/ve=(1 -

EXAMPLE 3.1. Using the data in 'Fable 3.2, the mean reaction time of subjects to five probe words are investigated (Timm, 1975, p. 233).
Table 3.2 Sample data: one-group analysis Subjects
1 1 2 3 4 5 6 7 8 9 10 11

Probe-word positions
2 3 4 5

51 27 37 42 27 43 41 38 36 26 29

36 20 22 36 18 32 22 21 23 31 20

50 26 41 32 33 43 36 31 27 31 25

35 17 37 34 14 35 25 20 25 32 26

42 27 30 27 29 40 38 16 28 36 25

48

Nell H. Timm

Calculations show that

A-

ISe + Shl

ISel

=0.2482

or T 2= 30.29 and

(n-p+l)
p - 1

T2
( n - l)

=(7)30"29=5.30. 10

The hypothesis is rejected at the a = 0 . 0 5 level if A<U'5(4,1,10)= 0.057378 or n -p + 1


T2

p-1

(n-l)

> F'5(4, 7) = 4.12

so that H 0 is rejected. Employing the formula (2.18), confidence intervals for ~ ~-~1- ~5 and q~--/q-/*2 are easily evaluated: -7.09</q-/~5< 17.82 (N.S.), (Sig.).

0.86 </x~ -/~2 < 20.24

In the analysis of the one-group repeated measurements design from a multivariate analysis of variance point of view, no restrictions were placed on the structure of the variance-covariance matrix Z (except that n >p to ensure a positive definite estimate). If, however, mixed model univariate assumptions are established for a set of data, a univariate analysis is readily obtained from a set of multivariate calculations provided the post matrix A is orthogonalized so that A'A = I. To illustrate following Bock (1963), suppose the mean vector /x in the multivariate model has the form #=/tl+/~ where/3'= (B1,/~2..... /~p). Furthermore, suppose e i is represented by (3.3)

e i = sil + [ eij I

(3.4)

and that e i ~ I N p (0, E) so that the population variance-covariance matrix of

Multivariate analysis' of variance of repeated measurements


the repeated measures has the uniform covariance structure

49

= o ; q r + .21.
Such a decomposition yields the umvariate mixed model

(3.5)

y,j = tX + Sg + flj + e,j

(3.6)

where the subjects are a r a n d o m sample from a population in which s i , ~ l N (0, o2) jointly independent of the errors eij and eij~IN(O, 02). The parameters in (3.6) have the interpretation: /~ is an unknown constant, si is a random component associated with subject i, flj is a fixed treatment effect, and eij is a r a n d o m error component. A test of the null hypothesis /4o: fi,=,8 2 . . . . . is provided by the ratio S S H / ( p - 1) ~F(v~,Ve.)" F = S S E / ( n - 1)(p - 1) The degrees of freedom for the F ratio are obtained from the multivariate test by the formula v ~ = R ( A ) v h = ( p - 1 ) l = ( p - - 1 ) and v * = R ( A ) v e = ( p - 1)(n - 1) where R ( A ) denotes the rank of A. Furthermore, selecting A so that A'A = I, SSH = Tr(Sh) and S S E = Tr(Se). Thus, the univariate mixed model analysis is merely a special case of the more general multivariate analysis. If the variance-covariance matrix Y, has the structure given in (3.5), then the mean square ratio for testing the equality of the fixed treatment effects have an exact F-distribution. As shown by Bock (1963), a necessary and sufficient condition for an exact F-test is that the transformation of the error matrix by an orthogonal contrast matrix results is the scalar matrix o21. Box (1950) and Lee, Krishnaiah and Chang (1976) developed procedures to test for uniform covariance structure. Bock (1975) a n d H u y n h and Feldt (1970) review the general structure case. Whenever, the condition for the exact F-distribution is satisfied, the univariate model should be used to analyze repeated measurement data. When the variance-eovariance matrix X is arbitrary, Greerdlouse and Geisser (1959) and H u y n h and Feldt (1976) proposed a conservative F-test procedure for testing for the equality of treatment effects using the univariate mixed model analysis. However, as discussed by Geisser (1979), such a procedure is to be avoided when an exact multivariate procedure exists.

tg,,

50

Nell H. Timm

4. The /-sample repeated measurement design


Letting

~lf~j= ( Yijl ,Yij2 .... 'Yijp ) ' ~ INp (~gi, ~)


the p-variate observation of the j t h subject within the ith group is represented as Y~j =/** q-eo, i = 1 , 2 ..... I ; j = 1,2 ..... iV,. (4.1)

and N -._ - E l1 = i N i where #i=(/~,l,&2,.-.,/~) is a 1 p vector of means and eu is a vector of random errors. From (4.1) the data matrix Y is an N p matrix, the parameter matrix B is
1111 /21 ~12 ~22 "'" """ ~'lp] ~2p [

"
[111 ~12

....
iii

/
~Ip J

(4,2)

"'"

and the design matrix X is of the form


(NI)

=ININ,

with

i = 1 , 2 ..... 1.

The primary hypotheses of interest for t h e / - s a m p l e data are: H01: Are the profiles for the I groups parallel? //o2: Are there differences among treatments? //o3: Are there significant differences among groups? To test the hypothesis of group differences, the hypothesis in terms of the elements of B is
H 0 3 : ~t~1 = ~ll2 . . . . . ~l I

(4.3)

which is identical to the test for differences in means employing the one-way multivariate analysis of variance (MANOVA) model. Represent ing the hypothesis as CBA = I', the matrices C, A and I" for B defined in (4.2) are selected: (l-1)x~ With vh = R ( C ) = I - 1 ,
c
,,

and

r=0

(4.4) the hy-

ve = N - - R ( X ) = N - I

and u = R ( A ) = p ,

Multivariate analysis of variance of repeated measurements

51

pothesis H03 is rejected at the level a if

A= ISe+ s,,l <u+Cv'z- I ' N - O"

Isel

(4.5)

The p a r a m e t e r s for the other multivariate criteria are s=mJn(vh, u)= r a i n ( l - 1,p), m = ( I v h - u [ - 1 ) / 2 = ( t i - p - 1 [ - 1 ) / 2 and n = ( v e - u - 1)/2
= (NI-p - 1)/2.

T o test for differences in treatments, the hypothesis is stated as

~lp]
Ho2:

and the matrices C, A a n d F take the f o r m

pX(p-- 1)

:t+:-::-,,
\ --l: ]

r:o,

where the R(C)= vh = I a n d the R(A) = u = p - 1. F o r m i n g the A-ratio, the hypothesis is rejected at the level a if

A= [Se+ Sh~ [ <U"(p -I,I,N-I).

(4.6)

In addition, s = m i n ( I , p - 1), m = ([1 - p + 11 - 1)//2 a n d n = ( N - 1 -p)/2. T o test for parallelism of profiles or interaction b e t w e e n groups and treatments, the hypothesis m a y be stated as

i~11 ~12

-- t~12 ~13 I
-

~ll ~12
~l(p -- 1)

-- ~I3 ~12 l
(4.7)
-I~Ip

/4o1:
/~l(p- 1) -/~lp

a n d the matrices C, A a n d F b e c o m e

(1--1)xl
where

= ( , , 1 : , t

pX(p--1)

= , , ( I )--

aod

(I-- 1)X(p-- 1)

o0

Dp_ 1 is a ( p - l)-diagonal matrix with diagonal elements (_ {), the

52

Nell H. Timm

R ( C ) = vh = I - l, R ( A ) = u = p - 1 a n d v e = N - ' I . F o r the parallelism hypothesis, s = m i n ( I - 1,p - 1), m = ( l l - p [ - 1)/2 and n = ( N - I - p ) / 2 . For valid multivariate tests of differences in group a n d treatment m e a n vectors, we did not assume that the statistic for testing the parallelism hypothesis was nonsignificant so that the multivariate tests m a y be conf o u n d e d with interaction. If there is n o interaction between groups a n d treatments, alternative tests for group m e a n differences a n d differences in treatment means m a y be of interest that are special cases of the multivariate tests. In terms of the parameters in the matrix B, the tests for g r o u p a n d treatment m e a n difference b e c o m e
p

Ho(~): E j = d q j = . . . P
1

= ~,~=,/xo P
1

(4.8)

H(o~): ~ , i = l ~ i _ _ _ _ _ _ _ _ _ 2 = 1...

= __Z i=,/*~p I

(4.9)

Since the n u m b e r of subjects within each g r o u p are unequal, the test for treatment differences Ho(~ t) is an unweighted test that is i n d e p e n d e n t of the sample sizes N i. A n alternative test to Ho(~ ) is the weighted test
I 1

H(~w): ~ i=lNilIil ~--.. . . = E i=,Nil~ip


N N

(4.10)

D e p e n d i n g on whether the loss of subjects was due to the treatments or independent of the treatment, the weighted or unweighted tests w o u l d be selected, respectively. To test the hypothesis Ho(ff), the matrices C and A are selected to represent it in the f o r m C B A = 0: (1-1)xp

=(11_,i-1 ),

(px D

A =El~p].

To test 14(t) the matrix F = 0 and C a n d A take the form: ** 02 C


(Ix/)

1'

and

A = Dp I . pX(p-- 1) --1

(')

Alternatively, to test Ho(~ ~), A is as defined for Ho(~ ), but the matrix 1 l matrix C = ( N 1 / N . . . . . N f f N ) . Selecting A such that A ' A = I for the test of Ho(~), vh = I - 1 a n d ve =
N-I,

lSel
A= !Se+Shl

Multivariate analysis of variance of repeated measurements

53

However, if vh = 1 or s = min(,, h, u ) = 1,

re--U+|
]u-vhl+ 1
SO

I-'A

F(u--vh+l've--U+l)'
F(Ph'Pe)"

Fg= I

ve 1 - A
A

Furthermore, since A is selected so that A'A.= I,

sh = SSH = p Z
i

N,(yj..--y...)2,
j

so = SSE = p E Z (y,j _y,..)2,


i

which is identical to the hypothesis and error sum of squares obtained employing a univariate mixed model split-plot design (Geisser, 1979). For the tests of H0(~ ) or ~-0zr4(t~),the criterion A ~ U ( p - 1 , 1 , N - I ) . However, since uh = 1, A = ( 1 + T2/ue) -~ and T 2 for the tests of H0(~ ) and H0(~) become, respectively,

Tt2= i2
where

1
i=lN i

)'

y,A(A,SA)_~A,y ,
. . . .

Y..= E and

i=1

Yi./1 and

S=

Y'[1-x(x'x) 'x']r
N-I

~2 ~_ NY~.A(A'SA)-IA'Y~. t~

where

I Y..= E N i Y i . / N
i=1

Relating Tt2 oi' Ttw z to an F statistic, the formula

F=((N--I-p+2))
(p-l)

(N-I)

T 2

F(p-l,U-l-p+2)

is employed. As in the one sample repeated measurement design, the mixed model analysis of the data in Table 4.1 m a y be recovered from certain of the multivariate tests provided the post matrix A is selected such that A'A = I. As discussed by Kirk (1968, Chapter 8), the univariate mixed model m a y

54 Table 4.1 Kirk's data: two-group analysis B1 S1 Al B2 4 5 4 3 2 3 4 3

Neil H. T i m m

B3 7 8 7 6 5 6 5 6

B4 7 8 9 8 10 t0 9 11

S2 $3 S4 s{ s; s; s;

3 6 3 3 l 2 2 2

A2

be written as
Y i j k ~---~ -~" ai "t- ~ k "at- Yik + S(i)j "t- e(i)j k

i = 1,2 ..... l ; j = 1,2 ..... N~; k = 1,2 ...... p

(4.11)

where s(iv~IN(O, o~), e(iv~IN(O, 0 2) and s(iv and e(i~k are jointly independent so that Y. takes the form within each group, 51.= o~11' + o21. (4.12)

The parameters in the model are defined: /~=overall constant, a i = i t h group effect, s(0/=effect of the j t h subject measured as the ith group, flk = kth treatment effect 7ik = group by treatment interaction, and e(i)/k= subject by treatment interaction plus a r a n d o m error component. We have already seen that in the presence of no interaction the test of Hg: a I = a 2 a I is identical to the test H0(~). If Z takes the form specified in (4.12), the test of
. . . . .

B y : Yik -- Yi'k -- Yik" -~ Yi'k' = 0

m a y be recovered from the H m test of parallelism when A ' A =I. In this situation, p~ = phR(A) = ( I - 1 ) ( p - 1), v* = v e R ( A ) = ( N - I ) ( p - 1), and SSH = Tr(Sh) and SSE=Tr(Se). In the absence of interaction, the test of differences in treatments has two representations because of the unequal number of subjects in each group. H~: all flj are equal,
I

H~: flj+ Z N i a i / N are equal for all j,


i~l

where He is the unweighted test and H/~w is the weighted test. The

Multivariate analysis of variance of repeated measurements

55

univariate tests are obtained from the tests Ho(~ ) and --02'qu~), respectively, provided A is chosen so that A'A = I in the multivariate case. F o r either univariate test, the degrees of freedom are v ~ = vhR(A)= l ( p - 1 ) = p - 1 and Pe * = p e R ( A ) = ( N - I ) ( p - 1 ) . Furthermore, the S S H = T r ( S h ) and SSE =Tr(Se). However, r e m e m b e r that the matrix C for each multivariate test was defined differently. The mixed model hypotheses are not related to the multivariate tests//o2 and Ho3. EXAMPLE 4.1. Using the data in Table 4.1 taken from Kirk (1968, p. 274), a multivariate and univariate analysis are illustrated. To demonstrate how we would test Hol, Ho2, and Ho3, Kirk's data are reanalyzed. For Kirk's data,

(2X4) = \ ~21

~22

/3'23

]~24 '

(lOX2)

l(5xl )

and
(2X4)

=(3.75 \ 1.75

4.00 3.00

7.00 5.50

8.00). 10.00

To test Ho3 , the matrices C = ( 1 , the matrices

1) and A = 1 2 are selected. To test Ho2,

C = I 2 and

A=

--1

=D

0 0

--1 0

--1 -

are used. Finally, Hol m a y be tested by using

C=(1-1)

and

A=

10 ilt )
1 1 =D 3

0 0

-1 0

In all cases, F = 0. The M A N O V A table for the analysis is shown in Table 4.2. Alternatively, testing -~o3~4(~), HU)o2 and Ho~ by selecting A such that A'A = I, the following matrices are employed. //o~): C = ( 1 , - 1 ) , A'=( 2 1,1 ~1 , ~ , ) A= 0.707107 -0.707107 0.000000 0.000000 0.408248 0.408248 -0.816497 0.000000 0.288675 0.288675 0.288675 -0.866025

=(,),

56 Table 4.2 Multivariate analysis I Hypothesis

Neil H. Timm

MSP = SSP/u 8.00 (Sym)) 4.00 2.00 6.00 3.00 4.50 8.00 - 4.00 - 6.00 8.00 3.25 (Sym)) 7.75 30.50 1.75 2 8 . 5 0 42.50 2.00 (Sym) t 1.00 0.50 7.00 - 3.50 24.50] (Sym) 0.677 0.333 0.167 0.500 0.167 0.500 0.167 0.500 0.167 0.677 (Sym)) 0.833] (Sym) t 0.833]

DF l

A 0.137

p-value 0.1169

//03
-

//02 nol Error //o3

2 1

0.004 0.144

0.0002 0.0371

Ho2 Ho,

1.250 0.677 0.583 0.000 ( 0.583 - 0.250 0.083 ( 0.583 0.250 0.083
-

Table 4.3 Multivariate Anal, sis II Hypothesis SSP 3.125 (2.250 (Sym)) 10.825 52.083 17.759 85.442 140.167] [ 0.0001"000 (Sym) ] ~ 4.287 0.000 0.000 18.375] 9.378 DF 1 1 1 6 (Sym)) 1.584 2.004 1.584 2.004 6 5.790] (Sym)) 6 5.790 / A 0.250 0.027 0.144 p-value 0.2070 0.0014 0.0371

H~)

no(
/-/01 Error

no(e
Ho,

1.752 0.144 0.408 1.752 0.144 0.408

Multivariate analysis of variance of repeated measurements

57

To test H m, the matrix C defined to test H0~) and the matrix A defined to test H0(~ ) are used. The M A N O V A table for t h i s analysis is displayed in Table 4.3. F r o m the entries in Table 4.3, univariate F-ratios for testing for groups, treatments and treatment by group interactions are immediately obtained: 3.125/1 3.125 = 2 . 0 0 ~ F ( 1 , 6 ) , /~ = 9.378/6 = 1.56----3 194.50/3 64.83 = 127.88~F(3, 18),

Ft= 9.126./18 - 0.51


Fg, = 19.375/3 _ 6.46 _ 12.74~F(3, 18). 9.126/18 0.51 The hypothesis (vff) and error (re*) degrees of freedom for each univariate F-ratio are obtained by multiplying the degrees of freedom for each multivariate test by the rank of the normalized post matrix A corresponding to the test. For the F t ratio, v~' = vhR(A ) = 1- 3 = 3 and Pe = veR(A) = 6" 3 ----18. The others follow similarly.

5. Factorial design structures


In m a n y applications of repeated measurement designs, subjects receive treatments in low-order factorial combinations where the sequence of administration is randomized independently for each subject. That is, suppose groups of subjects are randomly assigned to I methods and the set of subjects receive BC treatment combinations. F o r the I = 2 group case, the data m a y be organized as in Table 5.1. To analyze the data in Table 5.1, we represent the observation vector of repeated measures as

Yij=-lxi+eij,

i = 1 , 2 ..... I , j = 1,2 ..... Ji,

(5.1)

so that the parameter matrix of means take the f o r m

B=(~II ~21

Pq2 ~22

/'t13 ~23

~14 ~24

1~15 125

~16 ~26

~17 ~27

118 ~19) (5.2) /28 ~29 '

58

Nell H. Timm

Table 5.1 A 32 factorial design structure


Subjects within Group groups A1 1 2 Bl C1 Yllll Y1211 C2 C3 C1 Yl121 Yl221 B2 C2 C3 C| YlI31 Y1231 B3 C2 C3

YlII2 YllI3 Y1212 Yl213

Y1122 Yl123 Y1222 Y1223

Y1132 Y1133 Y1232 Y1233

Jl A2 1 2

YlJll I Y2111 Y2211

YlJil2

YlJII3

YlJl21 Y2121 Y2221

Y IJi22

Y 1.I223

YlJi31 Y2131 Y223l

YlJ132 ~ IJ133 Y2132 Y2133 Y2232 Y2233

Y2112 Y2113 Y2212 Y2213

Y2122 Y2123 Y2222 Y2223

if2

Y2J211 Y2J212 Y2J213

Y2J221 Y2J222 Y2J223

Y2J231 Y2J232 Y2J233

and X--

for the two group case. Furthermore, we assume that e~j~INp(O, Y~). Corresponding to the multivariate formulation is the classical univariate mixed model.

+ (~Br)i~,, + s<oa+ ( Bs)(,v~ + (vs)(iv,, + ~(iv~,,,


i = 1 , 2 ..... I; k = l , 2 ..... K; j = 1 , 2 ..... 4 ; m = 1 , 2 .....

(5.3)

M,

where s(ili~IN(O, 002), (fls)(i):k~lN(0,002), (ys)(i)/m~IN(O, po2), e(i)ikm IN(0,(1--p)o2), and e(i)jkm, (Vs)(iVm, (fls)(,V are jointly independent. As in the preceding two sections, we will illustrate how from the multivariate analysis the standard univariate results m a y be recovered. The first hypothesis of interest employing the multivariate model is to see whether there is an interaction between A and the levels of B, C and BC which as in Section 4.1 we call the test of parallelism. Following the formulation for testing interaction (parallelism) for the design in Section 4.1, the matrices needed to test for parallelism, with the hypothesis stated

Multivariate anatysis of variance of repeated measurements

59

in the form CBA = 0, are


1 1 1 0 0 0 1 1 1 -1 -1 -1 1 -1 0 1 -1 0 1 -1 0 1 -1 0 1 -l 0 1 1 -1 0 1 -1 0 -1 1 0 0 0 0 0 0 1 -1 0 0 0 0 1 -1 0 -1 0

0 -1
1 0 0 0 0

c=(1,-1),

a=

-1 -1 -1

0 0 0 B

-1 0 C

0 --1 1 0
BC

(5.4)

The first two columns of the post matrix A are formed to evaluate A B , the next two are used to investigate A C, and the last four, constructed from the first two by taking H a d a m a r d vector products, are used to test A B C . Normalizing the post matrix A so that A ' A = I and separating out the submatrices associated with A B , A C and A B C , we sum certain of the diagonal elements of the SSP matrix and the error matrix in the test of parallelism to construct univariate F-ratios. To test B C , given that the parallelism hypothesis is tenable, the matrices

C = ( 1 , - 1),

A--

0
-A l

Al
-A1

0]

and

Al=

I 71
0 -1

(5.5)
are used. The post matrix A is constructed by arranging the elements of B in table form (Table 5.2) and forming linearly independent contracts such that
~ i j -- ~i'J" - - ~ij" "t- ~i'j' ~--- O.

Normalizing A, the univariate F-ratio for testing B C is immediately obtained. If the parallelism hypothesis is not tenable, we m a y test B C with Table 5.2 Rearranged means
C1 Bl B2 B3 Pll = ' ~ l l ~14 ~ ~21 #17 = 1731 C2 ~ 1 2 = 1~/12 /~15 = 1~23 /A18~ ~32 C3 / t 1 3 ~ I]13 /'t16 = 1~23 /'tl9 = '~33 BI B2 B3 C1 /~21 ~'r/l i /~24 = T~21 P'27 = ~31 C2 P,22 ~ ~12 //25 ~ 1]12 /~t28= ~32 C3 P-23 ~ 1"/13 /t26 = ~23 //'29 ~ '~33

60

Neil H. Timm

(BC)* by using C = I 2 and the post matrix A defined in (5.5). The univariate test of B C is not obtained from testing (BC)*.

T o test the main effect hypothesis A, B, and C under parallelism and no


B C interaction, we use the following matrices for C and the post matrices A when the hypotheses are expressed in the form CBA = 0:

(A):

C = ( 1 , - 1),

a=19,

13 03]
(B): C=(1,1), A=
--

03
13 --

13 ,

13
(5.6)

(C):

C=(1,1),

A=

a
a

where a ' = ( 1 , 0 , 1 ) and b ' = ( 0 , 1 , - 1 ) . Normalizing the post matrix A in (5.6), univariate tests are immediately obtained from the multivariate tests. Tests of A, B and C which do not require parallelism are denoted by A*, B*, and C*. In each case, the post matrix A is identical to the corresponding matrices in (5.6), however, the matrices C take the form: (A*): C = ( 1 , - 1 ) , (B*): C=I2, and (C*): C = I 2.

To write each of the multivariate hypotheses in terms of the elements of B given in (5.2), we merely substitute the hypothesis test matrix C and the post matrix A into the general expression CBA = 0 for each hypothesis. To test each of the preceding hypotheses, the expressions S h = (CB~A),[C(X,X) - IC, ]- 1 ( C B A ) , S e = A t Y [ I - X ( X ' X ) - IX'] YA are evaluated where/~ is a matrix of means. EXAMPLE 5.1. Using the data in Table 5.3 the hypotheses discussed and the relationship to the univariate mixed model are illustrated. To obtain univariate tests from appropriate multivariate tests, we stated that one may take the Tr(Sh) and the Tr(Se) and divide the result by p~ and re*, respectively. That is, M S H = Tr(Sh) / uh.R(A), and MSE = Tr(Se) / Pe R(A). Since M S P = S S P / u , we see that the hypothesis mean square and the error mean square can alternatively be obtained by averaging the diagonal elements of MSP matrices. This approach is illustrated for the example. T o illustrate the construction of the multivariate tests discussed, the MSP matrices are displayed in Table 5.4; the post matrix A for each hypothesis without an asterisk (*) has been normalized so that A ' A = I. Averaging the diagonal elements of the hypothesis test matrices of AB, A C and A B C within Paral, BC, C, B and A in Table 5.4 and the diagonal

Multivariate analysis of variance of repeated measurements


Table 5.3 Factorial structure data

61

B1
C1 C2 C3 CI

B2 C2 C3 Cl

B3 C2 C3

Al

s1 s2 s3 s4 s5 s6 s7 s8 s9 Sio s] s~ s~ s~ s; s~ s-~ s; st S~o

20 67 37 42 57 39 43 35 41 39 47 53 38 60 37 59 67 43 64 41

21 48 31 40 45 39 32 34 32 32 36 43 35 51 36 48 50 35 59 38

21 29 25 38 32 38 20 34 23 24 25 32 33 41 35 37 33 27 53 34

32 43 27 37 27 46 33 39 37 30 31 40 38 54 40 45 47 32 58 41

42 56 28 36 21 54 46 43 51 35 36 48 42 67 45 52 61 36 62 47

37 48 30 28 25 43 44 39 39 31 29 47 45 60 40 44 46 35 51 42

32 39 31 19 30 31 42 35 27 26 21 46 48 53 34 36 31 33 40 37

32 40 33 27 29 29 37 39 28 29 24 50 48 52 40 44 41 33 42 41

32 41 34 35 29 28 31 42 30 32 27 54 49 50 46 52 50 32 43 46 split-split plot

A2

elements of the correspondent error matrices, univariate F-ratios are immediately constructed. To illustrate

FABC(4,72) = ( 5 . 5 1 3 + 2 . 6 0 4 +

11.704+47.535)/4 ( 2 0 . 3 5 7 + 3 1 . 4 9 7 + 8.778 + 2 7 . 7 0 0 ) / 4

16.84 = 2 2 . 0 8 = 0.76, 610.22 =27.64, 22.0-----8

FBc(4, 7 2 ) =

(1872.113+0.104+270.937+297.735)/4 ( 4 4 . 4 2 4 + 0.699 + 34.188 + 9 . 6 2 2 ) / 4 (4.033+2.178)/2 3.11 -- 14.31 = 0 . 2 2 ,

FAC(2'36)= ( 7 . 8 1 5 + 2 0 . 8 0 1 ) / 2

Fc(2, 3 6 ) = ( 2 6 1 . 0 7 5 + 1 6 6 . 7 3 6 ) / 2 = 2 1 3 . 9 1 _ 14.95,
(23.475+5.141)/2 F A a ( 2 , 3 6 ) _(0.033+ 18.678)/2 (96.313 + 97.551)/2 = 14.31 9.3___~6= 0 . 1 0 , 96.93 317.42 96.--93- = 3.29,

Fs(2, 36) = (154.133+480.711)/2 (119.417 + 74.448)/2 F A (1, 18) = 3 0 4 2 . 2 2 356.0-------~= 8 . 5 4 .

62

Nell H. Timrn

++++++V V

L~_-f s "'r

"

"

//_

/=_~

l ... + ..........
~ 0 @ 0 0 0 0

/~I
~ ~-~ ~-.~~ ~ o I ~

Multivariate analysis of variance of repeatedmeasurements

63

In Example 5.1, we illustrated the analysis of a repeated measurement experiment with a 32 factorial design within the vector (subject) observa-tion and a simple one-way design for the subjects (vectors). Numerous alternatives to this arrangement are possible and easily analyzed employ ing multivariate procedures.

6.

Crossover/changeover design

Implicit in the vector valued analysis of repeated measurement data has been the assumption that the experimenter randomized the order of treatments for each subject independently to eliminate sequence effects and that there was sufficient delay between treatments to minimize residual or carryover treatment effects. Suppose that in a repeated measurement experiment that sufficient time existed between the administration of two treatments, but it was felt that a sequence effect may be present. T o assess it, a one sample multivariate design may be modified as shown in Table 6.1 to analyze sequence, treatment (represented by a and b) and period effects. The setup for the data in Table 6.1 is identical to the design discussed in Section 4. The parameter matrix is

B = (/111
~21

/12~ ~22 ]"

(6.1)

The test for treatments becomes

HT: ~11 -I" 11122=~12 + l21


Table 6.1 Two-periodcrossoverdesign Sequence
AB

(6.2)

Subjects within sequence


Si $2 S3 S4 S5

Pl
a a a a a b b b b b

Periods /'2
b b b b b a a a a a

BA

S1 S2 S3 S4 S5

64

Neil H. Timm

the primary hypothesis of interest. A test for sequence may be represented


as .lt/S : ( 6lIll "l- ~,12)/2 -=" ( ~21 "1" ~ 2 2 ) / 2 "

(6.3)

Finally, to test for periods, we may use

Hp: ( ~tll "t t*2,)/2 = ( #12+/*22)/2.

(6.4)

In a design with p periods, there are p! possible sequences. When the number of periods is larger than 3, generating 6 possible sequences, one may sample from among the sequences. The ten subjects in Table 6.1 may be grouped into five pairs, each pair forming a 2 2 Latin Square for periods and treatments as illustrated in Table 6.2. This design would be appropriate if the period effect varies from subject to subject. In the simple crossover design, the period effect is assumed to be the same for all subjects. Thus, for a series of Latin Squares, we may assess period square variation.
Table 6.2 Series of Latin Squares Squares Square 1 Subjects within squares S1
S2

Periods P1 Pz a
b

b
a

Square 2

S]

s2
Square 3 S1 S2 S1 S2 S] S2

b
a

a
b

b
a

a
b

Square 4

b a b

a b a

Square 5

As suggested by Cochran and Cox (1957), the data in Table 6.2 may be viewed as an incomplete four way layout or r replicate Latin Squares. Thus, we have a square by subject by period by treatment design. Letting r represent the number of squares and d the size of the square, the rd 2 degrees of freedom for the incomplete design may be partitioned as follows.

Multivariate analysis of variance of repeated meaz'urements


Incomplete design Squares Subjects Subjects x squares Periods Periods squares Treatments Residual "Total" df r- 1 d- 1 ( d . - l ) ( r - 1) ( d - 1) ( d - 1)(r - 1) d- 1 r ( d - 1)2 - ( d - 1)

65

rd 2 - 1

For the simple crossover design (Table 6.1) and the Series of Latin Squares (Table 6.2), the following results.
Simple crossover design Sequences Subjects within sequences Periods Treatments Residual "Total" r- 1 Series of squares Squares Subjects within squares Periods Periods squares Treatments Residual "Total" r- 1

r ( d - 1)
d- 1 d- 1 (d- 1)(rd-2) rd 2 - 1

r(d-- 1)
d- 1 ( d - 1 ) ( r - 1) d- 1 r ( d - 1)2 - ( d - 1)

rd 2 - 1

To analyze the data in Table 6.2, from a multivariate point of view, the data ignoring empty cells are organized as a one-group multivariate design:
PI Al Sl Sa S3 S4 S5 a a a a a BI b b b b b Al a a a a a P2 B2 b b b b b

with parameter matrix B = ( #1/x2 ~3 ~4)" To test for period effects,

Hp: ( Iz, +/~2)/2 = (/x3 +//,4)/2


the matrices C and A are: C = 1 and A'=(1/2,1/2,1 / 2 , - 1/2). (6.5)

66 T o test for t r e a t m e n t effects,

Nell H. Timm

I'IT: (/11 + / 1 3 ) / 2 = (/12 +/14)12,


the matrices C a n d A are: C=1 and

A'=(1/2,-1/2,1/2,-1/2).

(6.6)

T o assess the effect of squares or blocks, the data are o r g a n i z e d as in T a b l e 6.2. Now, however, the p a r a m e t e r matrix takes the f o r m

B, = (/ZII /112

1121 /122

/131 1132

/141 /142

/151 /152 ]"

T o test for squares, the matrices C a n d A are:

C=(14

" .-1),

m = (12).

(6.7)

EXAMPLE 6.1. C o c h r a n a n d C o x (1957, p. 130) give d a t a for c o m p a r i n g the speeds of two calculators A a n d B. T h e order of the m a c h i n e s was b a l a n c e d a n d assigned to subjects w h o p e r f o r m e d o p e r a t i o n s first o n one m a c h i n e a n d t h e n o n the other. T h e d e p e n d e n t variable was the time (seconds m i n u s 2 m i n u t e s ) t a k e n to calculate a s u m of squares. T h e d a t a for the e x p e r i m e n t are s h o w n i n T a b l e 6.3. T o analyze the d a t a from a m u l t i v a r i a t e p o i n t of view, the d a t a are reorganized as i n T a b l e 6.4 for the w i t h i n subject analysis. Table 6.3 Cochran and Cox data Squares Square 1 Square 2 Square 3 Square4 Square 5 Subjects within blocks S1 S2 S1 S2 SI S2 Sl S2 S1 S2 First (P1) calculation A B A B A B A B A B 30 21 22 13 29 13 12 7 23 24 Second (Pz) calculation B A B A B A B A B A 14 21 5 22 17 18 14 16 8 23

Multivariate analysis of variance of repeated measurements

67

Table 6.4 Within subject analysis


Pl P2 B1 A2 B2

Subjects
S1 S2 S3 S4 S5

A1

30 22 29 12 23

21 13 13 7 24

21 22 18 16 23

14 5 t7 t4 8

F o r m i n g the m a t r i c e s

C= 1

and

,, [

---2

i]

(6.8)

so t h a t A 'A = 1, the M S h a n d M S e m a t r i c e s are: M S h = ( 64.80 144.00 M e e = ( 30.425 9.625 (Sym) / 320.00 ] (Sym)) 1 1.625

(6.9)

with v h = 1 a n d v e = 4 . Since A ' A = I a n d the R ( A ) for e a c h test is 1, the a p p r o p r i a t e F - r a t i o s f r o m the a n a l y s i s are o b t a i n e d f r o m t h e d i a g o n a l s of the m a t r i c e s in (6.9) a n d A N O V A "Fable 6.5 results. T o a n a l y z e the effects of squares, the d a t a are o r g a n i z e d as in T a b l e 6.6. U s i n g C = ( 1 4 : - 1) a n d A ' =(0.707107, 0.707107), the F - r a t i o for the test for differences in" squares is F = 2 1 8 . 3 / 4 = 54.5750 - - 1 . 9 5 6 1 ~ F ( 4 , 5). 139.5/5 27.9000

Table 6.5 Latin Square series within subject analysis Source Period Treatment Error period Error treatment df t 1 4 4 MS 64.80 320.00 30.425 11.625 F 2.1298 27.5269 p-values 0.2183 0.0064

68 Table 6.6 Block analysis Squares Square 1

Neil H. Timm

Subjects Within Blocks


S1

Pl 30 21 22 13 29 13 12 7 23 24

P2 14 21 5 22 17 18 14 16 8 23

S2 Square 2 Sl S2 S1 S2 St S2 S1 S2

Square 3

Square 4

Square 5

Table 6.7 ANOVA for Cochran and Cox data Source Between Squares Subjects within squares Within Periods Treatments Error periods Error treatment "Total" SS 218.30 139.50 64.80 320.00 121.70 46.50 910.80 df 4 5 1 1 4 4 19 MS 54.575 F 1.9561 p-value 0.2397

64.800 320.000 30.425 11.625

2.1298 27.5269

0.2183 0.0064

U s i n g the multivariate split design a p p r o a c h because of the i n c o m p l e t e vector observations, we c o m b i n e the results into one table, T a b l e 6.7. As usual, subjects are r a n d o m a n d squares, periods, a n d t r e a t m e n t s are fixed. While it is possible to recover the analysis of u n i v a r i a t e designs with i n c o m p l e t e within subject vector d a t a f r o m a m u l t i v a r i a t e p o i n t of view, the variations in the r e o r g a n i z a t i o n of the data for the m u l t i v a r i a t e analysis are c o m p l e x because the m u l t i v a r i a t e a p p r o a c h requires complete vectors.

7. Multivariate repeated measurements I n the preceding designs, each subject was observed at several experim e n t a l t r e a t m e n t c o n d i t i o n s a n d o n e variate was measured. I n m a n y e x p e r i m e n t a l situations, data o n several variates are o b s e r v e d repeatedly

Multivariate analysis of variance of repeated measurements


Treatment subject

69
tq [ v (q)

tl

"

Ai

si

Yijl ~

~lj2]
vO)/
.

I "j' I
I
J~ijq ~

tv<!)| t-up j

/ y<2) t L '~PJ

?/
.

IJ(q) I

l-U~' J

v(q)

Fig. 7.1. p-variate observations over q conditions.

over several experimental conditions. Designs with multivariate observations on p variates over q conditions are called multivariate or multi-response repeated measurement designs since the multivariate observations are not commensurable at each treatment condition, but are commensurable over conditions a variable at a time (Fig. 7.1). Since each of the p-variates are observed over q conditions, it is convenient to rearrange the data in Fig. 7.1 by variates for a multivariate repeated measures analysis so that each variate is observed over q periods (Fig. 7.2). The data matrix Y for the analysis is of order N p q , where the first q columns correspond to variable one, the next q to variable 2, the next q to variable 3 and so on up to the pth variable. Alternatively, using the data as arranged in Fig. 7.2, a multivariate mixed model analysis of variance procedure may be used to analyze multi-response repeated measures data. This would be done by simply extending the univariate sum of squares to sum of squares and products matrices a n d calculating multivariate criteria to test hypotheses. However, for such an analysis we must not only assume a restrictive structure on the variance-covariance matrix associated with each variable over q conditions but that the structure on each variance-covariance matrix between variables across conditions is constant. This is even more restrictive than the univariate assumptions and for this reason is not usually recommended. Instead, a multivariate approach should be used.
Treatment subject

2
y(') ]

lp

y(2) |
Ai si Yij2 ~ .

YUp=

iy(. [ oq) l j| Fig. 7.2.

Iv(q)/

l" iJ: J

ty ,q,j

Data layout for multivariate repeated measures design.

70

Neil H.

Timm

Table 7.1 Means for multivariate repeated measurements data 1 Conditions Variables 2 3

CI

C2

C3

C1

C2

C3

C1

C2

C3

Treatments
A2

AI

/11 /12 /13 /2l /22 /23

/14 /15 /16 /24 /25 /26

/17 /18 ~19 /27 /28 /29

To analyze profile data for multivariate measurements arranged as in Fig. 7.2, the general linear model is again used. Letting p--q---3, for the arrangement of population parameters shown in Table 7.1, we consider some hypotheses which might be of interest for multi-response data. The first hypothesis of interest for profile data is whether the profiles for each variable are parallel. That is, is there an interaction between conditions and treatment? The hypothesis may be stated as

H(AC)*: (/11 --/12,/12-/13 . . . . . 118 --/19)


= (/21 --/22,/22 --/23 . . . . . /28 -- /29)" (7.1)

The matrices C and A to test H(ac). are

C(AC).'=(I--I)

and

9A6 = o

-1
0 -

(7.2)

To test for differences in treatments, HA., where H A, is


HA*: ~1 = [0~2,

(7.3)

the matrices

CA,=(1--1 ) and

A=I

(7.4)

are constructed. For differences in conditions,

[/H l
/~21 I
/14 ]

/,2]
/22 / /15 [

/24 /
/17 [ /27 J

/25
/18 /

/28]

/13 ] ~23 / /16[ /26 / /19 / /29 J

(7.5)

Multivariate

analysis ofvariance of repeated measurements

71

the test matrices are


Cc, = 12

and

A = D
96

[1 !1
- i
0

(7.6)

--

Given parallelism, tests for differences between the two treatments and among conditions are written as
3

Z ~v/3
j=l
6

A: . v l 3

j=4
9
j=7

Y~ .v13
and

=lj~=9 4p'2j/3 [ ~" 1*2j/3


LJ =7 2

t
2 2

j!l ~12j/3
(7.7)

IZil/2
i=1

Z "i2/2
i=1 i=1 2

t*,3/2
(7.8)
i=I 2

C:

~. IZi412 = i=1
2

~. ,i512
i=1

i=l

/*,7/2

Z I~i8/2
i=1

Y .,9/2
i=l

respectively. Hypothesis test matrices to test hypotheses A and C become

1
C c = (,),

9X3

oIil
3

(7.9) A =D
9x6

0 -1

Provided the post matrix A, for hypotheses stated as CBA = 0, is normalized so that A'A = I, multivariate mixed model multivariate criteria are immediately obtained from the multivariate approach for the hypotheses A, C, and (A C)*. This is not the case for the hypotheses A* and C*. EXAMPLE 7.1. To illustrate the tests of several multivariate hypotheses and to show how one may recover mixed model results from a multivariate

72 Table 7.2

Neil 1-1. Timm

Individual measurements utilized to assess the changes in the vertical position of the mandible at three time points of activator treatment Subject Group Number 1 2 3 4 5 6 7 8 9 Means 1 2 3 4 5 6 7 8 9 Means
1 117.0 109.0 117.0 122.0 116.0 123.0 130.5 126.5 113.0

SOr-Me (ram)
2 117.5 110.5 120.0 126.0 118.5 126.0 132.0 128.5 116.5 3 118.5 111.0 120.5 127.0 119.5 127.0 134.5 130.5 118.0 1 59.0 60.0 60.0 67.5 61.5 65.5 68.5 69.0 58.0

ANS-Me (mm)
2 59.0 61.5 61.5 70.5 62.5 61.5 69.5 71.0 59.0 3 60.0 61.5 62.0 71.5 63.5 67.5 71.0 73.0 60.5

Pal-MP angle (degrees)


1 10.5 30.5 23.5 33.0 24.5 22.0 33.0 20.0 25.0 2 16.5 30.5 23.5 32.0 24.5 22.0 32.5 20.0 25.0 3 16.5 30.5 23.5 32.5 24.5 22.0 32.0 20.0 24.5

T~

119.33 128.0 116.5 121.5 109.5 133.0 120.0 129.5 122.0 125.0 122.78

121.72 129.0 120.0 125.5 112.0 136.0 124.5 133.5 124.0 127.0 125.72

122.94 131.5 121.5 127.0 114.0 137.5 126.0 134.5 125.5 128.0 127.28

63.22 67.0 63.5 64.5 54.0 72.0 62.5 65.0 64.5 65.5 64.28

64.00 67.5 65.0 67.5 55.5 73.5 65.0 68.0 65.5 66.5 66.00

65.61 69.0 66.0 69.0 57.0 75.5 66.0 69.0 66.0 67.0 67.17

24.67 24.0 28.5 26.5 18.0 34.5 26.0 18.5 18.5 21.5 24.00

25.17 24.0 29.5 27.0 18.5 34.5 26.0 18.5 18.5 21.5 24.22

25.11 24.0 29.5 27.0 19.0 34.5 26.0 18.5 18.5 21.6 24.29

analysis, the data provided by Dr. Tom Zullo in the School of Dental Medicine at the University of Pittsburgh and displayed in Table 7.2 are used. Tests of differences between groups (7.3), differences among conditions (7.5) and the interactions between groups and conditions (7.1) are the primary hypotheses of interest. Alternatively, given that the interaction hypothesis is tenable, tests for differences between groups and differences
Table 7.3 Multivariate profile analysis of Zullo's data Hypotheses Wilks' A 0.583 0.422 0.026 0.884 0.034 DF (6,1,16) (9, 1,16) (6, 2,16) (3,1,16) (6,1,16) p-value 0.3292 0.3965 < 0.0001 0.6176 < 0.0001

(AC)* A* C* A C

Multivariate analysis of variance of repeated measurements

73

among conditions, as defined in (7.7) and (7.8), may be tested. Using Wilks' A-criterion to test the hypotheses, the results are displayed in Table 7.3. As mentioned previously, mixed model multivariate tests are obtained from the appropriately normalized multivariate hypotheses in Table 7.3. To see this, consider the hypothesis and error mean square and product matrices for testing C; the matrices were obtained by normalizing the 9 6 post matrix A given in (7.9) so that A'A = 1 for the hypothesis given in (7.8) when written in the form CBA = O. (Sym) 4.89~ MSP c = 2'927 " - . ~ - 0.53._~) " 7.493 ~ = 0:0~58

1.363.) - 4 . 8 7 5 ~ , - 0.148.) - ~ 0 . 3 7 9 ~

(Sym) 1 MSP e = -0.425 ~_0.161


"----

023 0 li6)

01049)

Averaging the "circled" diagonal dements of the above matrices, the MSP c and MSP e matrices for the multivariate mixed model test of C are obtained: [ 76.463 MSPc=/47.894 [. 7.373 0.472 0.273 -0,271 (Sym) ] 31.366 4.280 0.795 j | (Sym) ] 0,852 -0.164 0.773

MSPe =

The degrees of freedom associated with t~e multivariate mixed model matrices are vff and v~*,obtained from the foJ~ula vff---Vh.R(A)/p -" 1.6/3 --2 and v*=b~.R(A)/p:~16.6/3= 32, where p denotes t h e n u m b e r of variables, R(A) is the ~ank of the post mafrix A, and r h and Pe are the hypothesis and error degrees of ffe~edom assoeiated with Wiiks' A-eriterion for testing C a~ shown in Table 7.3. Wiiks' A-criterion for testing C using

74

Neil H. Timm

170.05 the multivariate mixed model is A = 0.0605 which is compared to ,~ (3,2,32)--0.663. The p-value for the test is less than 0.0001. Analyzing the data a variable at a time using three univariate mixed model split-plot designs, the univariate F-ratios for testing C are irnmediately obtained from the multivariate mixed model analysis, q-"ne univariate F-ratios are:

Variables SOr ANS Pal

/;'-value 76.463/0.472 = 162.1 31.366/0.852=36.82 0.795/0.773 = 1.03

p-value < 0.0001 < 0.0001 0.3694

8.

Growth curve analysis

While the standard M A N O V A model (SMM) is applicable in many experimental situations, the model has several limitations if an experimenter wants to analyze and fit growth curves to the average growth of a population over time. To analyze data obtained from a growth curve experiment, Potthoff and Roy (1964) developed the growth curve model (GCM) which is a simple extension of the standard MANOVA model. The model considered by Potthoff and Roy is given by

E(11o)= XBP,
V(Yo) = I.Xo (8.1) where Y0 (n q) is a data matrix, X (n m) is a known design matrix, B (rep) is a matrix of unknown nonrandom parameters, P (p q) is a known matrix of full rank p < q, ~0 (q q) is positive definite and the rows of Y0 are independently normally distributed. Comparing the G C M with the SMM, we see that only the post matrix P has been added to the model. This implies that each response variate can be expressed as a linear regression model of the form
E(y,) =

where y~ (qX 1) is the observation vector for the ith subject and /3~ is a vector of unknown parameters. To analyze (8.1), Potthoff and Roy suggested the transformation
y= I1o G - i e , ( p G -

1p,)- l

(8.2)

where G (q q) is any symmetric positive definite weight matrix either

Multivariate analysis of variance of repeated measurements

75

nonstochastic or independent of Y0 such that P G - JP' is of full rank. Employing the transformation in 8.2, the matrix Y (n p ) will be distributed mutually independently normal with unknown p.d. variance-covariance matrix

(p p)

le,(pc,

and mean E ( Y ) = X B . Hence, by using (8.2) we have reduced the G C M to the SMM with minor limitations on the selection of G. Motivation for the selection of the transformation in (8.2) by Potthoff and Roy is contained in Appendix B of their (1964) paper; they show that the BLUE of an estimable linear parametric function qJ = e ' B a (where the estimability conditions are that e belongs to the space spanned by X ' X and a belongs to the space spanned by the columns of P ) is given by

~=e'/3a.
/} = (X'X)

X' r0Zo

1P'(PEo-1p,)-1

(8.3)

Since (8.1) reduces to (8.2) under (8.2) we see that


= (X'X)S ' Yo G - I P ' ( P G - I P ' ) - 1 ,

with G replacing X0 in (8.3),/~ is very close to the BLUE. To test hypotheses of the form H0: C B A = F (8.4)

under (8.1), we merely have to substitute Y defined in (8.2) into the expression for S h and S e in (2.11) and (2.12). The degrees of freedom for the hypotheses is P h = R ( C ) and the degrees of freedom for error is
ve = n R(X)= n-r.

Setting F = 0 in (8.4) and letting Y be defined as in (8.2), the hypotheses and error sum of square and products matrices take the following form.
Sh = A' Y'X(X'X)C'[ C(X'X)C ' ] - I C ( X ' X ) - X ' YA,

so = A" r'[

- X(X'X)

x ' ] VA,

(8.5)

where uh = R ( C ) and ve = n - r. Under the SMM, no criteria is uniformly most powerful (Schatzoff, 1966; Olson, 1974). This is also the case for the GCM; however, in the G C M we have the additional problem of selecting the weight matrix G

76

Neit H. Timm

when p < q. If p = q, the transformation in (8.2) reduces to


y = 11oP - ~,

or if P is an orthogonal matrix so that P - I = p,,


Y = Yo P '

and there is no need to choose G. This was the approach taken by Bock (!963a) and the one used in the development of the M U L T I V A R I A N C E package (Bock, 1963b, 1975; Finn, 1972 [ M U L T I V A R I A N C E VI incorporates the P o t t h o f f - R o y analysis using Se to estimate Z]). I f p < q the choice of G is important since it affects the variance of q~ which increases as G -1 departs f r o m Zff ~, the power of the tests and the widths of confidence bands. A simple choice of G is to set G = I. Then

r= toe'(Pc')-'.
Such a choice of G will certainly simplify one's calculations; however, it is not the best choice in terms of power since information is lost by reducing Y0 to Y unless G is set equal to S 0. The estimator of q~=c'Ba, when it is estimable and G is set equal to I, is the B L U E of ~p assuming Z 0 = azI. To try to avoid the arbitrary choice of the matrix G in Potthoff and Roy's model and its effect on estimates and tests, Rao (1965, 1966, 1967, 1972) and Khatri (1966) independently developed an alternative reduction of model (8.1) to a conditional model.
E ( Y I Z ) .---X B + Z F

(8.6)

where Y (n p) is a data matrix, X (n m) is a known design matrix, B (m p ) i s a matrix of unknown nonrandom parameters, Z (n h) is a matrix of covariates and F (h p ) is a matrix of unknown regression coefficients. T o reduce (8.1) to (8.6) a q q nonsingular matrix H = ( H 1 H 2 ) is constructed so that the columns of H~ form a basis for the vector space spanned by the rows of P, PH~ = I and P H 2 = 0. When the rank of P is p, H I and H2 can be selected as
Hi = G - 1 P ' ( P G - t p , ) - 1, 1_12==1 - H1p,

where G is an arbitrary positive definite matrix. Such a matrix H is not unique; however, estimates and tests are invariant for all choices of H

Multivariate analysis of variance of repeated measurements

77

satisfying the specified conditions (see Khatri, 1966). Hence, G in the expression for H~ does not affect estimates or tests under (8.6). By setting

y = YoHl = yo G -1p,(p G -,p,) -1, Z = YoH2,

(8.7)

E ( Y ) = X B and E ( Z ) = 0 ; thus, the expected value of g given Z is seen to be of the form specified in (8.5) (Khatri, 1966; Grizzle and Allen, 1969). Using (8.6), the information contained in the covariates Z = YoH2, which is ignored in the P o t t h o f f - R o y reduction, is utilized. Both Rao and Khatri argued that the B L U E under the conditional model of ~ = e'Ba is more efficient than that obtained by Potthoff and Roy since their estimator includes information in Z ignored by Potthoff and Roy. This is not the case. As shown by Lee (1974) and Timm (1975) employing the standard multivariate analysis of covariance (MANCOVA) model,

g=(x'x) X'roS le'(l"S-'e')

(8.8)

where S = Y6[I- X ( X ' X ) X'] Yo. Khatri (1966) u~ng the maximum likelihood procedure obtained the same result for B. Thus, if p <q, Rao's procedure using q - p covariates, Khatri using maximum likelihood methods and Potthoff and Roy's method weighting by G - X = S - 1 are identical. Setting G = I in the Potthoff and Roy method is equivalent to not including any covariates in the R a o - K h a t r i reduction. When p = q, t72 does not exist. Testing the hypothesis /4o: CBA = F where F--0, is not the same under the Potthoff and Roy and R a o - K h a t r i reductions. Employing the standard M A N C O V A model,

S h =A' Y ' X ( X ' X ) - C ' ( C R C ' ) - I c ( x ' x ) Se=A'(PS - ' P ' ) - ' A ,
where

X' YA,
(8.9)

R=(x'x) +(x'x) X'ro[S-'-s-'e'(es 'e')-'] roX(X'X)-,


y=yo S-IP,(P S vh = R ( C ), lp,) l
and

ve=n-r-h

h=q-p.

78

Neil H. Timm

Although Potthoff and Roy's approach does not allow G to be stochastic unless it is independent of Yo, it is interesting to compare (8.6) and (8.9) if G = S. Then

Se=A'r[ I-X(X'X
=A'(PS-IP')-Ips

--X']rA
IYo[1-X(X+X)--X' ] YoS 1P+(ps-Ip')-IA

= A'(es--'e')-'
=A'(PS-lp') 1A,

which, except for the degrees of freedom for error, is identical to S~ obtained under the R a o - K h a t r i reduction. The sum of squares and products matrix Sh, however, is not the same. The development of the G C M by Potthoff and Roy and the subsequent R a o - K h a t r i reduction has caused a great deal of confusion among experimenters trying to use the model in growth curve studies. A paper which helped to clarify and unify the methodologies was by Grizzle and Allen (1969). They also developed a procedure for selecting only a subset of the q - p covariates. Potthoff and Roy's analysis of the G C M was developed by introducing the transformation

Y = YoG -JP'(PG

1p,) -I

to reduce the G C M to the SMM. T o avoid having a test procedure that was dependent on an arbitrary positive definite matrix G, Rao (1965) and Khatri (1966) proposed an alternative reduction to the standard MANCOVA model which did not depend on G. Their procedure, as discussed by Grizzle and Allen (1969), depends on selecting the "best" set of q - p covariates. In addition, one may question the use of covariates that are part of the transformed variables of the dependent variables being analyzed To avoid these problems, Tubbs, Lewis and Duran (1975) developed a test procedure to test H0: CBA = F, employing maximum likelihood methods directly under the GCM. Under the GCM, the maximum likelihood estimator of B is

= (X'X)-X'YoS -IP'(PS -'P')-'

(8.10)

Multivariate analysis of variance of repeated measurements

79

and u n d e r / t 0 : CBA = F

(x'x) -c'[
[A'(PS-'P)-iA]-'A'(PS
,p,)- 1

(8.11)

Using the likelihood ratio criterion due to Wilks,

sh =

r)' L c ( x ' x ) - c' l

r),
(8.12)

Se=A'(PS -IP')-IA~

where uh = R(C) and ve = n - ro Comparing this result with that proposed by Rao and Khatri, we see that each Sh is different, but have the same degrees of freedom and that S~ is identical for both procedures, but have different degrees of freedom. However, as pointed out by Kleinbaum (1973), both procedures are asymptotically equivalent since they have the same asymptotic Wishart distributions. N o information is available about the two procedures for small samples or about the relative power of each procedure. In the analysis of growth curve data, observations at some time points may be missing either by chance or design so that each dependent variate is not measured on each subject. In addition, the design matrix X may not be the same for each dependent variate. While these problems have been discussed in the literature by Trawinski and Bargmann (1964), Srivastava and Roy (1965) and Srivastava (1966, 1967, 1968), extending the theory of the SMM, Kleinbaum (1973) developed a generalized growth curve model ( G G C M ) for estimating and testing hypotheses when observations are missing either by chance or design with different design matrices corresponding to different response variates. As discussed by Srivastava (1967) and more generally by Kleinbaum (1970), to obtain B L U E of every parametric function q~= e'Ba in complex multivariate linear models (linear models with design matrices that are not the same for each dependent variate) that are independent of the unknown elements of the variance-covariance matrix, requires additional restrictive conditions on the model (see, e.g., Kleinbaum, 1970, p. 58). This led Kleinbaum (1973) to consider Best Asymptotically Normal (BAN) estimators for the G G C M which use consistent estimators of Z 0 and generally yield nonlinear estimators with variances that are in large samples the minimum that could be achieved by linear estimators if Z 0 were known.

80

Neil H. Timm

To test hypothesis of the form H0: CBA = 0 assuming a G C M withp <q, three frequentist analyses have been suggested to applied researchers over the past decade. Potthoff and Roy: Using the transformation Y = YoG - I P ' ( P G -- Ip,)-- 1 and forming the estimator B = ( X ' X ) - X ' Y , the hypothesis and error matrices are formed:
S~=A'Y'X'(X'X) C'[C(X'X) X ' ] YA, C']-1C(X'X)=X'rA~

Se = A ' Y ' [ I - X ( X ' X ) -

(8.13)

where eh = R ( C ) , v~ = n - r and G is any symmetric positive definite weight matrix either non-stochastic or independent of Yo such that P G - lp, is of full rank. Tubbs, Lewis and Duran: Using maximum likelihood procedures, which is equivalent to setting G = S in the Potthoff and Roy model, they obtain

= ( x ' x ) - x ' roS - 1 e ' ( e s - l e ' ) - 1 = ( x ' x ) - x '


Sh =A' YX(X'X)C'[ C(X'X)C']-~C(X'X)-X'

Y,
YA,

(8.14)

S e = A ' Y ' [ *-- X ( X ' X ) - X'] r a = A ' ( e S - 1 P ' ) - ' A ,


where Ph = R ( C ) , ve = n - r , S= Yg[I- X(X'X)-X']Yo yo S - 1 p , ( p s - i p , ) - 1. Rao-Khatri." Using a conditional model with
= (X'X)X ' Yo S - ' P ' ( P S - i p , ) - 1 = ( X ' X ) - X " Y,

and

Y=

y = yo S - l e , ( e S - 1 p , ) - 1,

s= r6[I-x(x'x)the matrices

x']ro,

sh = ~' r ' x ( x ' x ) - c'( c n c ' ) - ' c ( x ' x ) - x ' rA, Se=~'(eS-'e')-'A,
(8.15)

g = ( X ' X ) - + ( X ' X ) - X ' r0[ S - ' - S - l e ' ( e s - ,e,)- ' e s -'] r o X ( X ' X ) - ,
are formed where Ph = R ( C ) and Pe = n - - r - - q + p . While the procedure of Rao and Khatri has been "accepted" as the usual procedure employed in growth curve studies over the years and is asymptotically equivalent to the

Multivariate analysis of variance of repeated measurements

81

procedure proposed by Tubbs, Lewis and Duran, we do not know which of the procedures are best in small samples. Perhaps the determination cannot be answered on the basis of power, but on whether in assessing growth the notion of conditional versus unconditional inference is being raised (Bock, 1975). While the work of Kleinbaum has begun to address the data problems we have in analyzing data in the behavioral sciences, his procedure may lead to spurious test statistics since it depends on the method used to estimate ~20 in the construction of the BAN estimator. In this section, we have reviewed the classical frequentist analysis of the growth curve model for modeling the average growth curve for a population. Geisser (1979) reviews the basic sampl!ng distribution theory for determining confidence bounds for growth curves and more importantly considers Bayesian procedures which may be used to analyze individual growth curves. EXAMPLE 8.1. To illustrate the application of the GCM for a set of data, the data given in Table 7.2 are utilized. From the mean plots of the data in Table 7.2 for each group and variable (Figure 8.1) it appears that the growth curves for the three
SOr 128 70 ANS 30 28 / ~ 2 6
J

Pal

'26 24

68 66~

60 1 I 2 , 3 Fig. 8.1.

20 2 3 1 2
3

M e a n plots for d a t a in T a b l e 7.2.

82

Neil H. 77mrn

variables are at least linear. Some other questions of interest for the data include: (1) Are the growth curves for the two groups parallel for one or more variables? (2) If we have parallel growth curves, for some variables, are they coincident? (3) W h a t are the confidence band(s) for the expected growth curve(s)? Depending on whether we take p = q = 3 when analyzing the data in Table 7.2, the procedure used to answer questions (1), (2), and (3) will differ. For illustrative purposes, we will demonstrate both techniques using a program developed at the Educational Testing Service called A C O V S M (J6reskog, van Thillo, and Gruvaeus, 1971). Assuming that p = q = 3, the matrix B for the data in Table 7.2 is

B=[BIo
t B20
with P defined as

3H

312 01o 011 012 410 411 ~12~,,


022 420 421 422 ]

B21 /~22 020 021

P=

A1

and

AI=

0
B is estimated by

A1

1 1

2 4

3 , 9

~=
115.778 4.139 - 0 . 5 8 3 118.444 5.028 - 0 . 6 9 4 To test for parallelism, 63.278 --0.472 0.417 23.611 1.333 - 0 . 2 7 8 ] 62.000 2.555 - 0 . 2 7 8 23.622 0.456 -0.0781"

Hp: (fill

/~12 011

012 411 ~12)=~(/~2l

/322 021

022 421 422)

simultaneously for all variables, the matrices

C=(1-1),

A=

0 0

A1 0

0 A1

where

AI=

are used. Witks' A-criterion for the test is A = 0.583 and comparing A with = 0.426, the parallelism test is not rejected. The p-value for the test is % -- 0.3292.

U~05,16)

Multivariate analysis of variance of repeated measurements

83

Given parallelism, we next test for coincidence again assuming p = q. For this test, C = ( 1 - 1 ) and A =/9- Computing Wilks' A-criterion, A = 0.422. Since tables for the U distribution are not available for U ~ = U~[~6~, we may compute either Rao's multivariate F-statistic, F = 1 . 2 1 6 with 9 and 8 degrees of freedom, or Bartlett's Chi-square statistic, X2= 9.915 with 9 degrees of freedom; both are approximations of the general U-distribution (Rao, 1973, p. 556). The p-values for the two criteria are % = 0 . 3 9 6 5 and %=0.3575, respectively, indicating that we would not reject the coincidence hypothesis. Treating the data in Table 7.2 as data obtained from a single group, the c o m m o n regression function for all variables is B=(ll7.111 4.583 - 0 . 6 3 9 62.639 i.041 0.069 23.617 0.894 -0.178)o

Instead of analyzing Zullo's data with p = q, suppose that we decided a priori or through a statistical test that the regression model for each variable was linear. Then, p < q and

B=(

J?lo

fill

010

820

/~21

020

011 021

~10 ~20 ~21 "

~11)

Using the R a o - K h a t r i model with G = S, we test the coincidence hypothesis using the matrices C = ( 1 - 1 ) and A = / 6 . For this test, A = 0 . 4 4 0 and comparing A with U(6, 0.o5 _ x, 13~-0.271, we conclude that the growth curves for each group are coincident for all variables. The p-value for the test is % =0.2300. However, with p < q and G = S, the models fit to each variable take the following form Ysor = 121.210-t-1.820t YANS= 63.285 + 1.196t

YWl = 25.045 -- 0.023 t


which, as expected, do not agree with the models arrived at by taking G = 1 since p < q. Comparing the three regression models which m a y have been obtained using Zullo's data, the observed and predicted values for the models are displayed in Table 8.2. Using Wilks' A-criterion, we m a y construct ( 1 - a)% simultaneous confidence bands for each variable and each model (Geisser, 1979).

84 Table 8.2 Regressign models

Neil 1-1. ]Tram

Predicted means

Observed means 121.056 123.722 125.111 63.750 65.000 66.389 24.333 24.694 24.700

Quadratic (p = q) 121.355 123.721 125.109 63.750 64.999 66.386 24.333 24.693 24.697

Linear (p <q, G = 1) 121.268 123.296 125.324 62.408 63.727 65.048 24.210 24.393 24.576

Linear (p <q, G = S) 121.210 123.030 124.850 63.825 65.021 66.217 25.045 25.022 24.999

9.

Summary

I n this c h a p t e r w e h a v e i l l u s t r a t e d t h r o u g h s e v e r a l e x a m p l e s the a n a l y s i s of r e p e a t e d m e a s u r e m e n t d a t a e m p l o y i n g m u l t i v a r i a t e m e t h o d s u s i n g s e v e r a l s t a n d a r d designs. W e h o p e t h a t t h r o u g h t h e e x a m p l e s s e l e c t e d t h a t the M A N O V A m o d e l will r e p l a c e t h e s t a n d a r d u n i v a r i a t e t e c h n i q u e s t h a t o c c u r in p r a c t i c e ( F e d e r e r , 1975, 1977) w h e n u n i v a r i a t e m i x e d m o d e l a s s u m p t i o n s are n o t satisfied.

References Bartlett, M. S. (1939). A note on tests of significance in multivariate analysis. Proc. Cambridge Phil. Soc. 35, 180-185. Bartlett, M. S. (1950). Tests of significance in factor analysis. Brit. J. Psychol. (Statist. Section) 3, 77-85. Book, R. D. (1963a). Programming univariate and multivariate analysis of variance. Technometrics 5, 95-117. Bock, R. D. (1963b). Multivariate analysis of variance of repeated measurements. In: C. W. Harris, ed., Problems of Measuring Change, University of Wisconsin Press, Madison, 85-103. Bock, R. D. (1975). Multivariate Statistical Methods in Behavioral Research, McGraw-Hill, New York. Bowker, A. H. (1960). A representation of Hotelling's T 2 and Anderson's classification statistic. In: I. Olkin et al., eds., Contributions to Probability and Statistics, Stanford University Press, Stanford, 142-149. Box, G. E. P. (1950). Problems in the analysis of growth and wear curves. Biometrics 6, 362-389.

Multivariate analysis of variance of repeated measurements

85

Cochran, W. G. and Cox, G. M. (1957). Experimental Designs (2nd ed.), Wiley, New York. Cox, D. R. (1958). Planning of Experiments, Wiley, New York. Federer, W. T. (1955). Experimental Design-theory and Application, Macmillan, New York. Federer, W. T. and Balaam, L. N. (1972). Bibliography on Experiment and Treatment Design Pre-1968, Oliver and Boyd, Edinburgh. Federer, W. T. (1975). The misunderstood split plot. In: R. P. Gupta, ed., Applied Statistics, North-Holland, Amsterdam, Oxford, 9-39. Federer, W. T. (1977). Applications and concepts of repeated measures designs when residual effects are present. Technical Report BU-603-M, Cornell University. Finn, J. (1972). M U L T I V A R I A N C E : Univariate and Multivariate Analysis of Variance, Covariance and Regression, Version 5, Release 3 (1976). National Educational Resources, Chicago. Finny, D. J. (1960). An Introduction to the Theory of Experimental Design. The University of Chicago Press, Chicago. Gabriel, K. R. (1968). Simultaneous test procedures in multivariate analysis of variance. Biometrika 55, 489-504. Geisser, S. (1979). Growth curve analysis. Handbook of Statistics Vol. I., North-Holland, New York. Greenhouse, S. W. arid Geisser, S. (1959). On methods in the analysis of profile data. Psychometrika 24(a), 95-112. Grizzle, J. and Allen, D. M. (1969). Analysis of growth and dose response curves. Biometrics 25, 357-381. Hedayat, A. and Afsarinejad, K. (1975). Repeated measurements design I. In: J. N. Srivastava ed., A Survey of Statistical Designs: I. North-Holland, Amsterdam, Oxford, 229-242. Hotelling, H. (1931). The generalization of student's ratio. Ann. Math. Statist. 2, 360-378. Hotelling, H. (1951). A generalized T-test and measure of multivariate dispersion. In: Proceedings of the Second Berkeley Symposium on Mathematics and Statistics, University of California Press, Berkeley, 23-41. Huynh, H. and Feldt, L. S. (1970). Conditions under which mean square ratios in repeated measurement designs have exact F-distributions. J. Amer. Statist. Assoc. 65, 1582-1589. Huynh, H. and Feldt, L. S. (1976). Estimation of the Box correction for degrees of freedom from sample data in randomized block and split-plot designs. J. Educational Statist. 1 (1), 69-82. John, P. W. M. (1971). Statistical Design and Analysis of Experiments. Macmillan, New York. Joreskog, K., van Thillo, M. and Gruvaeus, G. T. (1971). A general computer program for analysis of covariance structures including generalized MANOVA. Research Bulletin RB-71-1. Princeton, New Jersey: Educational Testing Service. Kempthorne, O. (1952). The Design and Ana(ysis of Experiments. Wiley, New York. Khatri, C. G. (1966). A note on a MANOVA model applied to problems in growth curve. Ann. lnst. Statist. Math. 18, 75-86. Kirk, R. E. (1968). Experimental Design--Procedures for the Behavioral Sciences. Brooks and Cole, Monterey, CA. Kleinbanm, D. G. (1970). Estimation and hypothesis testing for generalized multivariate linear models. Mimeo Series No. 696, Institute of Statistics, University of North Carolina, Chapel Hill, NC. Kleinbanm, D. G. (1973). A generalization of the growth curve model which allows missing data. Journal of Multivariate Analysis, 3, 117--124. Krishnaiah, P. R. (1965). Multiple comparison procedures in multi-response experiments. Sankhya A 27, 3t-36.

86

Neil H. Timm

Krishnaiah, P. R. (1969). Simultaneous test procedures under general MANOVA models. In: P. R. Krishnaiah, ed., Multivariate Analysis H, Academic Press, New York, 121-142. Krishnaiah, P. R. (1978). Some recent developments on real multivariate distributions. In: P. R. Krishnaiah, ed., Developments in Statistics Vol. 1, Academic Press, New York., 135-169. Krishnaiah, P. R; and Waikar, V. B. (1971). Simultaneous tests for equality of latent roots against certain alternatives I. Ann. Inst. Statist. Math. 23, 451-468. Lawley, D. N. (1938). A generalization of Fisher's z-test. Biometrika 30, 180-187. Lee, J. C., Krishnaiah, P. R. and Chang, T. C. (1976). On the distribution of the l~etihood ratio test statistic for compound symmetry. South African Statistical J. 10, 49-62. Lee, Y. H. K. (1974). A note on Rao's reduction of Potthoff and Roy's generalized linear model. Biornetrika 61, 349-351. Lindquist, E. F. (1953). Design and Analysis of Experiments in Psychology and Education. Houghton-Mifflin, Boston. Myers, J. L. (1966). Fundamentals of Experimental Design. Allyn and Bacon, Boston. Nanda, D. N. (1950).i Distribution of the sum of roots of a determinantal equation under a certain condition. Ann. Math. Statist. 21, 432-439. Olson, C. L. (1974). Comparative robustness of six tests in multivariate analysis of variance. J. Amer. Statist. Assoc. 69, 894-908. Pillai, K. C. S. (1960). Statistical Tables for Tests of Multivariate Hypotheses. Statistical Center, University of the Philippines, Manila. Potthoff, R. F. and Roy, S. N. (1964). A generalized multivariate analysis of variance model useful especially for growth curve problems. Biometrika 51, 313-326. Quenouille, M. H. (1953). The Design and Analysis of Experiments.. Griffin, London. Rao, C. R. (1965). The theory of least squares when the parameters are stochastic and its application to the analysis of growth curves. Biometrika 52, 447-458. Rao, C. R. (1966). Covariance adjustment and related problems in multivariate analysis. In: P. R. Krishrlaiah, ed., Multivariate Analysis. Academic Press, New York, 87-103. Rao, C. R. (1967). Least square theol2 using an estimated dispersion matrix and its application to measurement of signals. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics, University of California Press, Berkeley, CA, 355-372. Rao, C. R. (1972). Recent trends of research work in multivariate analysis. Biometrics 28, 3-22. Rao, C. R. (1973). Linear Statistical Inference and its Applications (2rid ed.). Wiley, New York. Roy, J. (1958). Step-down procedure in multivariate analysis. Ann. Math. Statist. 29,
1177-1187.

Roy, S. N. (1957). Some Aspects of Multivariate Analysis. Wiley, New York. Schatzoff, M. (1966). Sensitivity comparisons among tests of the general linear hypothesis, d. Amer. Statist. Assoc. 61, 415-435. Srivastava, J. N. and Roy, S. N. (1965). Hierarchal and p-block multiresponse designs and their analysis. In: C. R. Rao, ed., Mahalanobis Dedicatory Volume. Pergamon Press, New York. Srivastava, J. N. (1966). Some generalizations of multivariate analysis of variance. In: P. R. Krishnaiah ed., Multivariate Analysis. Academic Press, New York, 129-145. Srivastava, J. N. (1967). On the extension of Ganss-Markov theorem to complex multivariate linear models. Ann. IrL~t. Statist. Math. 19, 417-437. Srivastava, J. N. (1968). On a general class of designs for multiresponse experiments. Ann. Math. Statist. 39, 1825-1843. Timm, N. H. (1975). Multivariate Analysis with Applications in Education and Psychology. Brooks/Cole, Monterey, CA.

Multivariate analysis of variance of repeated meaz'urements

87

Trawinski, I. M. and Bargmaim, R. E. (1964). Maximum likelihood estimation with incomplete multivariate data. Ann. Math. Statist. 35, 647-658. Tubbs, J. D., Lewis, T. O. and Duran, B. S. (1975). A note on the analysis of the MANOVA model and its application to growth curves. Comm. Statist. 4, 643-653. Wilks, S. S. (1932). Certain generalizations in the analysis of variance. Biometrika 24, 471-494. Winer, B. J. (1971). Statistical Principles in Experimental Design (2nd ed.). McGTaw-Hill, New York.

P. R. Krishnaiah, ed., H a n d b o o k o f Statistics, Vol. 1 North-Holland Publishing C o m p a n y (1980) 89-1 t5

"2

Growth Curve Analysis


Seymour Geisser ~

1. Repeated measm'ements----the profile background


Multiple observations are often m a d e on units or individuals who have been sampled f r o m one or m o r e populations or groups. T h e observations themselves m a y be m a d e over differing conditions, tests or periods of time, but all responses are in a c o m p a r a b l e metric. T h e data m a y be represented by the r a n d o m vector X'y=(XI~,j . . . . . Xp~j), j = 1..... N,~, a= 1..... q where Xi,,j = xi~j is the observed response 2 on t h e j th individual in the a th group to the i th stimulus, treatment or to a single treatment at time ti (we shall subsume all of these possibilities by the term "variables"). A s s u m e that X~y is a multivariate n o r m a l r a n d o m vector with E(X~i ) = p.~ = (/~1 . . . . . . p.p~) a n d Cov(X~y) = Y.. A p r o b l e m which is often of interest is whether the q groups have parallel profiles; i.e., whether there is a consistency of shape to the m e a n vectors #~. W h e n Z = o2I, the test of the hypothesis of parallel profiles is equivalent to the test of no interaction of groups by variables which derives f r o m an analysis of variance table. Even for a Z such that the variance of the difference between any pair of variables is constant, the usual distribution of the analysis of variance F ratio of groups by variables to individuals by variables within groups will still obtain. A special case of this latter condition is the uniform covariance matrix structure Z = o 2 [ ( 1 - p)I + pee'], where e is a p-dimensional vector whose c o m p o n e n t s are all unity. The requisite sums of squares are given as

Q1= N ~] (~,..- y...)2, Q2 =P X N~(2,.- 2...)2, Q3 =P ~] X (2.~j-- ~'.a.)2,


i=1 a=l a=l i=1

Nk

q Q4=E
a=l

p E
i=l

N~ p

Na(Kia.--~i..-x.w+x...) 2, Qs=~,, X
tx=lj=l

i=l

(xi,~j-Y,,~.-x.,,j+2.,~.) 2,

Q 6 = E E E(xi~,j-y...) 2, Yi,~.=N~'~,xi,~j, x.,~j=p-lExi,~j


a i j j i

1This work was supported in part by N I G M S - G M 25271. 2The standard notation for the realization of a r a n d o m variable is too cumbersome to

maintain throughout this paper and shall be relaxed in subsequent sections. The meaning, however, will be clear from the context even though rigorous notation is being abused. 89

90

S e y m o u r Geisser

and

N~

~...=N~-I ~
j=l

Y,.,

Ki.-=q -~ ~ ~i~.,
a=l

X . . . = N - ' Exi,~j,
i a j

N=ZN,r
a

The analysis of variance table is then given in Table 1. Under this restriction on Z, F1, F 2 and F 3 are distributed as F ( p - l, (p - l ) ( N - q)),F(q- 1 , N - q), and F[(p - 1 ) ( q - 1),(p - 1)(N-- q)] respectively. The test rejects the hypothesis of parallel profiles at level fl if" F 3 >F~[(p - 1 ) ( q - l ) , ( p - 1 ) ( N - q)], where Pr[F 3 >F~] = ft. If the covariance matrix Y. is arbitrary, F 3 was shown to be approximately distributed as F under the hypothesis of no interaction, but with reduced degrees of freedom (Geisser and Greenhouse, 1958, 1959). Here F 3 is approximately F [ ( p - 1 ) ( q - 1 ) e , ( p - 1 ) ( N - q ) e ] where

e=(p - l ) - l [ T r ( Z - p - l e e ' Z ) ]2/Tr[~ --p-lee'Z]2>/(p -- 1) - l

(1.1)

and e is a p-dimensional vector of ones. An estimate of e is obtained by replacing Z with an estimator )2 = ( N - q ) - lxq = lY,N__ " l(X~j -- )7.)(X.j -- )7.)' in the original vectorial representation. The size and power of this test procedure using ~ has been studied by Collier et al. (1967) and Wilson (1975). Since e > ( p - 1)-l independently of the form of Z one m a y use F ~ ( q - 1 , N - q ) as a conservative value for F 3. This m a y be of value in particular when N - q <p (where multivariate procedures, to be subsequently discussed, cannot be applied) or when Z is not the same for each group.
Table i Analysis of variance Source Variables Groups Individuals (within groups) Groupvariables Indiv. variables (within groups) Total p- 1 q- 1 N- q (p - 1)(q- 1) (p - 1 ) ( N - q)
Np - 1

d.f.

s.s. Q1 Q2 Q3 Q4 Q5
Q6

F F 1= ( N - q) F2 o.

( N - q) Q2
( q _ 1)Q3

F3

( N - q) Q4
( q _ I)Q5

Growth curve arcdysis

91

Actually when X is assumed arbitrary an exact multivariate test can be made. This is accomplished by eliminating the level of the vector by transforming Y~s= CX~s so that E(Y~i) -- ~ -- C/z~ where C is any (p - 1) p matrix of rank p - 1 such that Ce = 0 and e is a p-dimensional vector all of whose components are unity. Hence the new p - 1 dimensional vectors ~h..... % are all the same if and only if the parallel profile hypothesis is true. Hence the test of H o : ~ ~. . . . . ~p is a one-way multivariate analysis of variance test on the transformed random vectors Y~j. The usual test statistic for H o, for A-~.a=l~j_"_l(Y~j_ q N Y,,)(Y,~j- Y,O' and B=Y.,,N,~ ( Y,~ - Y)( Y,, - Y)', Y = U - ~Y~N~, Y~ = N~- 'I~j Y~j, is

[I + A-1B] = Up_I,q_I,N_ q
m r

(1.2)

where Ur,,,t-II~= 1Xj for Xj independently distributed as beta variates with parameters (t + 1 - j ) / 2 and s / 2 where a beta density is given as

r(a+b) xa_l(l__x)b_l. f(xla,b)= V(a)F(b)


Exact percentage points of the statistic U have been tabled by Schatzoff (1966) and by Lee (1972). A worked example is presented by Greenhouse and Geisser (1959) employing both techniques and comparing them. Other worked examples appear in Danford, Hughes and McNee (1960), Cole and Grizzle (1966) and the method is reviewed in detail with examples by T i m m (1979). Sometimes a simultaneous region for these parallel profile differentials is of interest. Assume t h a t / ~ = / h + 0~e, where 01 = 0; i.e., the parallel profile hypothesis is true. A Bayesian solution to the problem of a simultaneous region for 0 ' = ( 0 2 ..... Oq) is given by Geisser (1965a) and Geisser and Kappenman (1971). They assume the convenient prior density g(/q, 0, 51.- 1) CC ]~[(p+l)/2. This yield's the posterior probability statement

P (Q(O) <<. F(q - 1 , N - q)) = 1 - fl

(1.3)

where F(a,b) represents the F distribution with a with b degrees of freedom and

Q(o) = (q - 1 ) - ' ( N - q)(e'A -'e)[ 0 - (e'A - l e ) - 1 Z ' A -1el' [~?-' + Z ' A - ' Z - ( e ' A - ' e ) - ' ( Z ' A - ' e e ' A - ' Z ) ] - ' [O-(e'A-'e)-'Z'A-'e]
(1.4)

92

Seymour Geisser

where A = x ~ q_ ~ j :N 1xoj - x ~ ) ( x . j (Z 2..... Z q ) , Z , ~ = X , ~ - X , , a = 2 . . . . . q;


N2(N -

- xo),

q N=7~,~=1N,~,

= ~ -lv~ o x aj' ~'a ~"aj=l

Z =

N2)

-- N 2 N 3 N3(N -

. . . . .

N2N q N3N q

N3)

- N3N 4

,q=N

-1

-Nq_,Nq Nq(N-Nq)
a symmetric matrix. This can easily be extended to natural conjugate prior densities; see Geisser (1965a) where a complete analysis was made for q = 2 . There does not appear to be a confidence region of comparable simplicity, e.g. Halperin (1961).

2.

Growth curve models

Originally, the parallel profile problem was subsumed under the general rubric of growth curves by Box (1950). Later Potthof and Roy (1964), Rao (1959, 1965, 1966, 1969) defined the growth curve problem as one in which the components o f / ~ were known linear combinations of some subset of unknown parameters. In general then, the column vectors of the p N random matrix say, X = ( X 1. . . . . X N ) are assumed to be independently and normally distributed with c o m m o n covariance matrix X and E ( X ) = Wpxm"gmxqZqx N where W is known and of rank m <p, Z is known and of rank q < N , and ~- is unknown. A set of problems involves the estimation and testing of and known linear functions of the elements of . Although this model was proposed by Potthoff and Roy (1964), their analysis turned out to be inadequate as Rao (1966) demonstrated. This model, however, turned out to be rather fruitful in that it provided a general format for a variety of growth curve situations. In particular, polynomial curves in time as models for growth curves are an important example. This comes about in the following way: Let
1 1 tl
t2 .. l~ n-1

W=

t2

t2 2

t~n - 1
'

(2.1)

t~

t~

= (~,, ~ ..... ~),

Growth curve analysis where %' =(~'la, T2. . . . . . Tma), a = 1. . . . . q.

93

Z =

[
.

e~

0[,

"--

Oq]

0'1

e~,

0'3.

O'q/
" l' (2.2)
eq

..

e~ is a N~ 1 vector all of whose c o m p o n e n t s are unity a n d 0~ is the null vector of size N~. This yields

(2.3) e.g., a linear model results f r o m m = 2 and E ( X:j) = (Tla --1-T2atl, Tla -~ "/'2at2 . . . . . q'la + T2atp)" (2.4)

Further a variety of hypotheses concerning the elements of ~" are easily formulated as C~D = 0 where D is a q x d matrix of rank d < q a n d C is a c m matrix of rank c < m. F o r example, in the previously discussed linear case, one m a y be only interested in testing H0: ~'zl =r22 = rE3 . . . . . ~'2q; i.e., that all the groups " g r e w " at an equal rate. Hence, 0 = CcD=(O,

1"~[1.'VII
I~"r21

712
"rz2

"'"
"'"

"Flq .~D $2q]

(2.5)

where D is any q (q - 1) matrix of rank q - 1 such that the columns of D sum to zero. Some formulations involve special structure on E. O t h e r formulations depend o n hierarchical models.

3.

Classical multivariate model--frequentist analysis

F o r Z arbitrary, R a o (1966) d e m o n s t r a t e d that the appropriate least squares estimator of ~- was

= (W'A - l W ) - ' W'A -IXZ'(ZZ')-',


where

(3.1)

A = X(I- Z'(ZZ') -1Z)X'.


Khatri (1966) s h o w e d that 4 was also the m a x i m u m likelihood estimator.

94

Seymour Geisser

In the series of papers by Rao (1959, 1965, 1966, 1967) and Khatri (1966), the basic sampling distribution theory was presented. A 1 - fl confidence region on ~" is found from

Pr[ Q(~) ~ u~ ]= 1- B,
where

Q() = I1+ W'A --IW(:c- "c)G(~ - )'1- ',


a -' = (zz') -' + (zz')- 'zx'

(3.2)

)< [A --1__m - I w ( W t A - 1 W ) - I w t A -

l ] X Z t ( Z Z t ) - 19
(3.3)

and U~ is the fl th percentage point such that Pr[ Um, q , N - q + m _ p >/ Ufl ] = l - i~. The null hypothesis that ~'= % is rejected at level fl if Q ( % ) < U~. Confidence regions for a variety of linear combinations of the elements of ~" can be obtained by noting that

Q(c,o)=lI+[ C (W'A -'w )-lc' ]

-'

(C;rD- C D ) ( D ' G - ' D ) - ~ ( C ? D -

C~-D)'I-'

(3.4)

is distributed as Uc,d,N_q p+m" M a n y useful null hypotheses, as indicated before, can be expressed as CD = O, for appropriate C and D. In the simplest case where we are dealing with one group i.e., q = 1 and Z = (1 ..... 1). R('r)= N(4-z)'W'A-1W($-~')

m(N--p)

1F(m,N-p)
(3.5)

I + NT~( U'A U)- i T2

and U is any p x (p - m) matrix of rank p - m such that U' W = 0 and T2= U ' X Z ' ( Z Z ' ) - ' . Hence a 1 - f l hyperellipsoidal confidence region for the m dimensional vector ~"is obtained from F B ( m , N - p ) the/3 th percentage point so that all satisfying P r [ R(~') < F~(m, N--p)] = 1 - fl are included in the region.

Growth curve analysis

95

Before this type of analysis was introduced it was well known that the statistic

T~ = B X Z ' ( Z Z ' ) -

',

(3.6)

for B = ( W' W ) - 1W', was an unbiased estimator of ~"and that a confidence region for ~- could be obtained from

Q(z) = tI + ( B A B ' ) - - ~( T l - ~') Z Z ' ( T, - "r)' I- ~

(3.7)

which is distributed as U,,,,q,u_ q. The form for C~-D, analogous to (3.4), is

Q( C'rD ) = [I+ ( C B A B ' C ' ) - ~( CT, D - C'rD ) (D'(ZZ')-'D)-'(CT, D - C.cD)'1-1


(3.8)

which is distributed as Uc,a,N_ q. F r o m 3.7 q = 1 and Z = ( I , . . . , 1) we obtain

R(.r) = N ( T 1 - "r)'( B A B ' ) - ' ( T 1 - " c ) ~ m ( N - m ) - ' F ( m , N Further one can write 4= r, - 8AU(U'AU)-'T2,

m).

(3.9)

with T2 and U as previously defined. This displays the fact that 4 is a covariance adjusted estimator. Since both E ( T O = E(4)--~-, then comparisons of their covariance matrices would be instructive as to which would be a more desirable estimator. It turns out that T~ is preferable when BZU=0 (3.10)

and possibly when this matrix is close to the null matrix, otherwise 4 is tpparently preferable. For Y. = a2I (3.10) certainly holds. M o r e generally it rill hold for Z = W F W ' + UOU' +a2I. (3.11)

n fact, R a o (1967, 1968) shows that if and only if (3.11) holds then T 1 is ae least squares estimator of ~-. A likelihood ratio test for

Ho:Z=WFW'+UOU'+a2I

vs.

H~:Z:WFW'+UOU'+a2I

96

Seymour Geisser

is easily obtained. The test statistic

x = [ W'A

~WBAB'I

for testing H o vs. H l is distributed as Um,p-,~,N-e-l+m u n d e r H 0, c.f. Lee and Geisser (1972). Other models for Z that have been studied are the factor analytic m o d e l of R a o (1967)

E = C F C ' + 021
and the serial correlation m o d e l

(3.12)

E= (oij)=

o2pli-Jl),

i , j = l ..... p

(3.13)

but optimal results for estimation are difficult to achieve. In some instances a confidence region either on a particular point of the growth curve or on the entire growth curve itself is of interest. Suppose W is of the form (2.1) i.e. the growth curve is polynomial and Z is arbitrary. T h e n let C = a ' = ( 1 , t , t 2..... t m - l ) a n d D = I so that for a given value of t

C~rD = (1, t ..... t m- l)('r 1..... ~rq)= (a',c 1..... a%q).


One then applies (3.4) which reduces to

Q(a',r) = II-t- [ a'( W ' A - 1 W ) - la ] - ' ( a ' 4 - a"r)G(a';r distributed


as

a'T)'l --~

Ul,q, N q-p+m" But since m)-')F(q,N-q-p+m) F(q,Nq - p + m),


(3.14)

UI,q,N q p + m = ( l + q ( N - q - p - ~
then

( a ' 4 - a"r)G( a' ? - a"r)' a,(W,A-1W)-la N-q-p+m

which provides a joint confidence region on the q polynomials at a given value of t. If q = 1 so that only one group a n d one polynomial is involved then ~-= ~'1 and

U(a'4 - a%)'(a';c- a'r) ( a'( W ' A - ' W ) - ' a ) ( 1


+ NT~( U'A U ) - ' T z) (3.15)

~ ( U - l - p + m ) - l F ( 1 , N - 1 - p + m).

Growth curve analysis

97

Note that N - 1+ T~( U'A U) IT2 = G - 1 can be given without computing U by applying (3.3). A simultaneous confidence region for the entire growth cmwe i.e. for all t, is obtained by noting that

N ( N -- 1 --p + m)(a'4 - a"r) 2


Pr m ( I + N T 2 ( U ' A U ) - I T 2 ) a ' ( W ' A - 1 W )

<F~(m,N-p) la
(3.16)

When Z is of the form (3.13) then similar results are obtainable so that analogous to (3.15) we obtain

N ( ~ T 1 a'r)'(a' T 1 .. aT) a'BAB'a


-

(N-- 1) I F ( I , N - 1 )

(3.17)

for a single value of t. For a simultaneous region for the entire growth curve we can use the fact that Pr[ N ( N - l (a'Tl-a''r)2ma,-;-B--~ ! a

<FB(m,N-1)])I-

ft.

(3.18)

Suppose a tolerance region is required on K future p-dimensional independent multivariate normal vectors with c o m m o n covariance matrix Z. Denote this set of variables by Ve K and assume E( Vp K ) = W~'F where F is a q K known design matrix. It can easily be shown that for

H=(Z,F) U, = 11+ (I - F'( H H ' ) - 1 F ) ( V-- WT, F)' W( W'A W) - 1 W ' ( V - WT1F)[-1
(3.19) is distributed a s Um, K,N_ q irrespective of the form of Z (Geisser, 1970). For K = F = q = 1 and Z = (1 ..... 1) i.e. a single vector observation to be predicted from a single group of observations,

y~= ~ ( V -

WTO'W(W'AW)-Iw'(v

- WTI)

(3.20)

is distributed as m( N - m ) - 1 F ( m , N - m). It is clear that using (3.20) or (3.19) for a tolerance region would be unsatisfactory since W ( W ' A W ) - I W ' is singular. However, we also note that, independently of U 1,

U2 = II+ ( V - WT, F)' U( U ' X X ' U) -' U'( V -

WT, F)I-I

(3.21)

98

Seymour Geisser

is distributed as Up__m,K,N. F o r the case corresponding to (3.20),

Y2 = ( V - WT1F )' U( U'XX' U) -1U'( V - WTIF )

(3.22)

is distributed as (p - m)(N + m + 1 - p ) - 1F(p - m,N + m + 1 -p). Hence for general V, a tolerance region can be obtained from the distribution of U 1+ U 2 or for the special case Y =Yl + Y 2 = ( v -

WT1)'[N(N+ 1)-1W( W'AW)--' W' + U( U'XX' U)-I U'] ( V-- WT,)


(3.23)

is distributed as a linear sum of two independent F variates. N o w the matrix of the quadratic form is positive definite with probability one so that a 1 - t3 hyperellipsoidal tolerance region emerges from the observed X which includes all V such that y < y~ where y~ is the t3 th percentage point of the linear sum of independent F variables. N o w if E is of the specialized form (3.11) where T 1 is the optimal estimator of ,, the above tolerance region would undoubtedly enjoy its "best" coverage properties. However, if X is not of this f o r m the above tolerance region need no longer exhibit as good coverage properties as some other one. For an arbitrary E, it can be shown that

U1= II+ G I ( V - w~rr)'A - I w ( W ' A - 1 W ) - 1 W'A -1( V - w~rr)l-l


where
Um,K,N_q + m--p,

(3.24)

G(-1= ( I - F ' ( H H ' ) - 1 F ) - 1 + ( V - X Z ' ( Z Z ' ) - I F ) '

v( u'A

u'( v - x z ' ( z z ' ) - ' F )

and E=(Z,F). For K = F = q = 1 and Z = ( I ..... 1) and ) ~ t h e sample mean vector,

N( N + l ) - l ( V - W~r)'A - Iw( W'A - l W ) - 1 W ' A - - l ( v - - W'TF)


yl =

1 + N ( N + 1 ) - I ( V - X ) ' U ( U ' A U ) -' U ' ( V - X )


(3.25)

~ m ( N - p ) - 1F( m, N - p ) .
Now independent of U 1,

U 2 = [ I + ( V - W~-F)'U(U'XX'U)-1U'(V - W~F)[ -1

(3.26)

Growth curve ana~s~

99

is distributed as Up m,K,No For tile case corresponding to (3.25)

Y2 = ( V - W~F)' U( U'XX' U ) - ' U'( V ~ W~rF)


N+m+l-p

(p-m)

r(p-m,U+m+l--p),

(3.27)

Therefore, as before, a tolerance region for V can, in theory, be obtained from the distribution of U 1+ U2 or for the special case y =YI +Y2. How ever, the tolerance region apparently will include V in discolmected regions due to the f o r m of y l. What properties this type of coverage will possess is not clear at present. Another problem of great interest brought to the fore by Lee and Geisser (1972, 1975) is the problem of conditional predictive or tolerance regions. Suppose V can be partitioned into an observed portion V O) which is Pl X K and an unobserved portion V (2) which is 1o2 X K such that
V

( V(1)1
\ V (2) ]

andpl +P2 P" A tolerance region for V (2) is required for the observed V O)
and X. It would appear that the results previously obtained from (3.19) and (3.20) can be used, as before, except that now values are also inserted for V ) and the region consists of all V (2) satisfying U 1+ U 2 ) c o n s t . We shall illustrate this with the case that usually occurs in such problems, K=F=q.---1, Z = ( 1 . . . . . 1). Let

N ( N + 1 ) - 1 B ' ( B A B ' ) - I B + U( U'XX' U)-1 U'=

=p~__( P11
P2~
and

P12] = ( 011
P22] I Q21

el2] -I
Q221

W (2) ]
where Pij and Qij are pixpj and W (0 is p i x m, for i,j= 1,2. Then from (3.23) the 1 - f l tolerance region consists of all V (z) satisfying

y = [ V (2)- W(2)T,-

Q21Ql~l(v(')- W(')TI)] '


(3.28)

e22 [ V ~2)- W <~)r, - Q~, Q ill (v~l) _ W") rl) ] + (v")w") rl)'Q,~'(v~')- x o ) r d ~<y,

100

Seymour Geisser

As is sometimes the case for the confidence region setup, a particular


1 - fl region can exhibit peculiar properties, here we note that if

Yfl ~" ( V(I)-- m(l)Tl) ' Qll-1( V(|)_ W(I) ~,1)


the region for V ~2) is empty. Clearly to avoid such a difficulty it would be much more sensible for the tolerance interval to be obtained from a statistic conditional on the observed V 0). However exact conditional tolerance regions do not seem to be feasible either for this model or for the model which results in ? as an optimal estimator for ~-. We shall take this up again in the latter part of the next section when discussing a Bayesian approach to conditional prediction of V (2) given V (1). Portions of the material in this section are also presented in detail with worked examples by Timm (1979).

4.

A Bayesian approach--estimation

The Bayesian approach presented in this section was given in a series of papers (Geisser, 1970; Lee and Geisser, 1972, 1975). Consider the previous model with arbitrary covariance matrix Z and a convenient prior density (Geisser and Cornfield, 1963; Geisser 1965b)

From (4.1) we obtain the posterior marginal density of +

P('O ~: I( W ' A - ' W )

'+ ('~ -)G('~ - )'1-u/2

(4.2)

(Geisser, 1970). For a posterior region for , Q ( T ) = I I + W'A-~W(~---~)G(T--+)'I-'~-U,,,q,N_q and for C+D (4.3)

Q(C,rD)=II+[C(W'A-1W)

1C']

'
(4.4)

X (C~rD - C~rD)(D'G-~D)-'(C~rD - C'rD')'[-'

is distributed as Uc,ZN_ q which differs from its frequency distribution Uc,d,N_q_p+ m given by (3.4). Hence Bayesian regions at a given point t of the polynomial a'+ and simultaneously for all t can easily be given in a fashion analogous to (3.14-3.18).

Growth curve analysis


For the case analogous to (3.5) we obtain
R ( r ) = N(~r - ~)' W-'A - ' W ( 4 -- r)
1 + NT;( U'A U) - ' T2

|01

/9't N_mF(m,N--m).

(4.5)

Since
( N - m) - I F 3 ( m , N -- m) < ( N - p ) 'Fl~(m,N---p)

(4.6)

the region given by (4.5) will always be included in (3.5). The posterior distribution of r is the general determinantal density (c.f. Geisser, 1966), denoted by D ( . [ 4 , G , ( W ' A - 1 W ) - I , N ) , where say Y, the random matrix, is distributed as D(. [&A, E, N) if

d(Y) =

Cm'flx-qm/21~"l~/2lAIm/2 Cm, n[~'~-(Y-m) A ( Y -m) , lN / 2

(4.7)
'

where g is m x m and p.d. A is q q and p.d., Y and zX are m q and v=N-q>m> l and

Crn-l=I~m(m

m ')/4t HI'=(P'k-21--i).

(4.8)

Hence E ( r ) = 4 , when it exists and ~: is also the mode. The posterior expectation of Y. is given as E(Z) = ( N - p -- 1)- '{ ( X - W~rZ)(X - W~rZ)'-I - ( N - m - q - 1 ) - '
XW(W'A-1W)-Iw'[trG-'ZZ']}.

(4.9)

Further it can be shown that E(Z) - Z is always negative definite where the m.l.e. = N -I(X- W'rZ)(XWZ)'.

(4.10)

From this Bayesian viewpoint i.e. 5? arbitrary, it is also possible to obtain T 1 as the posterior expectation. Let the model be given as

E(x) = (w, v ) ( ; ) z =
with prior density for E-1, r,

+ u,z

(4.11)

g ( y - ' , r, n) o~ I~l<p + o/2

(4.12)

102

Seymour Geisser

Then the marginal density of ~ is

p(z)oclBAB'+(~- T,)ZZ'(T--1',)'l-(N-'+m)/2,

(4.13)

so that the center of this distribution is T v Hence for q = 1, Z = ( 1 ..... 1)

N(,c- TI),(BAB')--'(,c- T 1 ) ~

m-- g ( m , U - p ) . N-p

(4.14)

This is to be contrasted with the previous case (4.5) which is essentially conditional on ~7--0. If we consider the special covariance model Z = WF W'+ UOU'+ 42I, we note that o 2 [ W ( W ' W ) - I W ' + U ( U ' U ) - I U ' ] = o 2 I , so that from the frequency point of view, ~ = W [ F + o Z ( w ' W ) - I ] W ' + U[O+ o2(U ' U)-1] U' where the brackets may be relabeled F and 0 respectively, but both now p . d . - - n o essential difference ensues. F r o m the Bayesian point of view the newly labeled F and 0 are now dependent on W which some Bayesians find undesirable but we shall regard it as a convenience to do so since the ensuing results are greatly simplified by this device. Choosing the simple structure model Z = W F W ' + UOU', and the convenient prior density (4.15)

g(F-l,O-l,.r)oclF[(m+l)/2g(O

1),

(4.16)

where g(O -1) is arbitrary, the posterior density of ~- is obtained as

p(.c) cc [BAB' + (~c- Zl)ZZ'('r

TI)' I-N/2,

(4.17)

yielding T, as the center of the posterior distribution of ~-. Hence a posteriori 1I+ (BAB')-I(T -- T1)ZZ'('~-- Z l ) ' [ ~ Um,q.N_ q (4.18)

precisely as its sampling counterpart, (3.7). Regions for C~'D are readily obtainable in a form equivalent to (3.8). Further, if

g(O - l) ec I01(p-'~ + 1)/2,


then E(F) = ( N - q - m - 1)-IBAB', (4.19)

E(O)=(U-p+m-

1 ) - I ( U ' U ) - ' U ' X X ' U ( U ' U ) -l.

Growth curve analysis

103

5. Bayesian prediction--simple structure


If we wish to predict V when the simple structure (4.15) for X obtains, the predictive density of V is given as

f ( V ) oc ]U'(XX' .4-( V-- WT, F)( V - WTiF)') U I -(N+ K)/2 [I + ( I , F ' ( H H ' ) - ' F ) ( V-- WTiF)'B'
(,~A a ') - I B( V - W r , F)I - {N + , , - q>/,

(5.1)

A maximal mode of the density of V is clearly WT~F which is also its predictive expectation. It can be easily shown that a posteriori

U, = II+ ( I - F ' ( H H ' ) - I F ) ( V - WT, F)' X W ( W ' A W ) -x W'( V - WTIF)[--'


(5.2)

is distributed Um, K , N _ q exactly as its frequency distribution given in (3.19). Further a posteriori v2=tl+(vWr, F)'V(V'XX'V)-'V'(V-

wr, e)1-1

(5.3)

is distributed a s Up_re, K, N independently of U ajust as its sampling distribution. Hence the sampling theory given in the previous section, e.g. the region generated by the statistic (3.23), has a Bayesian interpretation. Sometimes for the sake of comparisons, linear combinations of the individual p-dimensional future vectors of V= ( V 1..... Vk) are of interest. Previously the comparisons were m a d e parametrically via regions on CrD. It is often the case that a more informative comparison is vested in predicting a function of the vectors of V (Geisser, 1971). F o r example, if one had 2 groups i.e. q = 2 then one could obtain the predictive distribution of V ~ - V2 as a comparison rather than, say, the posterior distlibution of % - r 2. At any rate if l is a k 1 arbitrary real non-zero vector then v.=Vl=Zki=lliVi is a linear combination of V I , V 2. . . . . V k where l ' =

(ll ..... lk).


The predictive density of Vl = v is obtained as

f ( v ) oc [U'(XX' + (l'l)-tvv') U]-(N+ i)/2


X Ir(I- F ' ( H H ' ) - i F ) - ' l + (v - WT1FI )"

X W ( W ' A W ) - I W ' ( v - WT, F1)I {U+,-q)/2,

(5.4)

104
or more simply f(v)cc(l+

SeymourGeisser

v' U( U'XX' U)-I U' v ) -v+

I
1+

(1) -- WTIF/)' W( W*A W) -1 Wt(1) _ WT1F/) -(N+ l- 0/2

t'(l-e'(uu')-'F)-'l

Regions similar to (3.23) can now be generated for v. Marginal densities for a subset of the p components of v can also easily be obtained. For conditional prediction of V (2) when V ) is observed we note that f(V(2) I VO))c~f(V). However, this still does not resolve the problem of a convenient region on V ~2) given V 0). It was shown by Lee and Geisser (1972) that for K = 1 the predictive distribution of V (2) given V 0) can be reasonably well approximated as F( V(2)[ V('))~St.(-;/~2.1, b ( 2 N + 2 - q - p z ) J ~ 1 , 2 N + 2 - q), (5.5)

where Y ~ S t . ( . , # , Z , N ) is a multivariate student distribution with density

f ( y)cc [1 +( Y- Iz)'(vE)-I( Y- I*)]-N/2,


a special case of the general determinantal density (4.7); where

(5.6)

t'-:-

N+ l - p + m N-l-p+m'
W (1)

tt~2.1= W (:) T1F- J ~ 1,/21( V O) b--l + t ( N + l - q - m ) ( N +

T,F),

(5.7)

l - p + m ) -1

+ ( V(I)- W(')TIF)'J11.2(V(I) - W(1)TI F),

t(N+ 1 - q - m) (I+ F ' ( Z Z ' ) - ' r ) - ' W( W'A W)--' W" J= (N+l-p+m)
+ u(u'xx'u) -~ u '

=/Jll J121.
J21 J22 ] Hence an approximate predictive region for V (2) given V 0) is obtained

Growth curve analysis

105

from (2N

+ 2 - q-P2) ( v(z)_ ~s2.1)'[ b- 1J22] ( V(2)-/Ls/.,)


P2

~ F(p2,2N + 2 - q-P2).

(5.8)

Note for q=l=F, Z = ( 1 ..... 1) some simplification occurs in that I + F'(ZZ')-IF=(N+I). At any rate a sensible region results. Numerical procedures are given for K = 1 by Lee and Geisser (1972) for obtaining the predictive mode of V (2) given V ) and an exact solution for the particularly interesting case P2 = 1.

6. Bayesian prediction-arbitrary covariance case


For the arbitrary covariance case the predictive distribution of V was obtained by Geisser (1970) as f ( v ) o c I U'(XX' + ( V - WF)(V-- WF)')UI -uv+K-')/2

x IG,Im/ZlI+ G,( V - W;cF)'A -1W(W'A


X WtA

- IW)-

-1(

V-

m'rr)! -(N+K-q)/2.

(6.1)

While E ( V ) = X~F is easily derived, it is clear from the form of (6.1) that X;rF is not the mode of V. Numerical procedures for calculating the predictive mode when K = 1 are given by Lee and Geisser (1972). For Q = BXZ'(ZZ')- IF+ BA U( U ' A U)- lut( V - XZ/(ZZ ')-'F),

U, = ll + ( W'A - ' W ) ( B V - Q)G~(BV-is Um, K,N_ q and is independent of

Q)'I

(6.2)

u2= II+ v'u( u'xx'u)-'

(6.3)

Hence the predictive distributions of U 1 and U2 differ from their sampling distributions given by (3.24) and (3.26). Again U l + U2 is the sum of two independent U variates so that conceptually a predictive region can be obtained. For r = K = F = 1,y 1, as defined by (3.25), is distributed as m ( N - m ) - l F ( m , N - m ) and is independent of Y2 as defined by (3.27), which is distributed as ( p - m ) ( N + 1 - p ) - l F ( p - m , N + 1-p). As noted before these predictive distributions differ slightly from their sampling

106

Seymour Geisser

distributional counterparts. Similarly a disconnected region for V can be obtained from y =Yl +Y2, which may not only be difficult to compute but also not very appealing. For very large samples multivariate normal theory can be applied as a convenient approximation. For v = Vl, similar remarks hold. For predicting V (2~ given V O) we again note that f(V(2)[ V (1))cx:f(V) but the only situations where this is easily utilized is K = 1 and P2 is low dimensional, 1 or 2 perhaps; thus permitting the density for V (z) given V (1) to be easily plotted. Otherwise there doesn't seem to be any optimal way to utilize the results in an exact manner other than rather crude normal approximations. For K = 1, numerical procedures are given by Lee and Geisser (1972) for calculating tile conditional predictive mode of V (2). Lee and Geisser (1975) also examined and compared a wide variety of conditional prediction procedures on two sets of growth curve data. The conclusions reached for those data sets were that the form of Z was important in reducing predictive error (in particular serial correlation models for E) and that growth curves are often highly individual so that past data on completed growth curves may be relatively unimportant for predicting an individual's future growth when compared to his own past data. This means that other models emphasizing individual curves may be useful in many situations. The predictive distribution of V can also be used for ascertaining which of possibly q growth curve models is most appropriate for g. Synthesizing the Bayesian notions developed for classification (Geisser, 1964), and growth curves (Geisser, 1970), Lee (1977) provides solutions for this problem. Details are provided for a variety of cases involving various degrees of knowledge about % and E~ the parameters of the a th group. The results are then couched as the predictive probability that V belongs to a particular family of growth curves based on prior probabilities of this event. Calculations are m a d e both for the arbitrary case a n d w h e n Z~ is of simple structure. This work has been further extended by Nagel and De Waal (1979).

7.

Individual growth curves

Suppose now each vector X~ has its own growth curve so that the model is
E(X.)= W%, a = l . . . . . q.

We now make certain simplifying assumptions (Fearn, 1975), namely that

Growth

curve analysis

107

the model is a two stage hierarchical one so that

X~]'r~,o2~N(W%,o2I),
(7.1)
Hence the marginal distribution of X~ is easily obtained, to wit:

X , ~ N ( W ~ , WFW' + o21).

(7.2)

This model was considered by Rao (t965, 1967, 1972, 1975) Lee and Geisser (1975) and Fearn (1975). We shall present the analysis of Fearn which is Bayesian and inspired by Lindley and Smith (1972) and Smith (1973). The posterior distribution of % given X, o 2, F is normal with mean and covariance matrix
q

~'~*= Eb'~) = a ~ + U - a ) q - 1 ~ %
k=l

(7.3)

D.* = Cov(~o) = (a + q - ' ( i - a) ) o2( w' w ) - ',


respectively where

a=(o-2w' w+r-')-lo-2(w,w), o=(w,w)-lw,x~.

(7.4)

Further for a uniform prior distribution on ~=given o 2 and Y, the posterior distribution of ~ is normal with mean and covariance matrix given by
q "r*~-E(~)--~-q-I E 'ra,

(7.5)
D* -- Cov(~*) = q-I{o2(W'W) -1 + I ' } . It is now further assumed that the prior density of a is g(a)oco -1 and that F-~ has a Wishart distribution with O degrees of freedom and matrix R

g(F-l[o,R)eclFI - ( P -2m - l)exp{

--

12trY- IR }.

(7.6)

Fearn suggests p = m as appropriate when knowledge about the precision of I7 is vague. But a value for R (perhaps diagonal in certain cases) gleaned

108

S e y m o u r Geisser

from some prior knowledge is required. Even so the integrations necessary to obtain the appropriate marginal densities are rather difficult. Hence, the following estimates are used as approximations for a 2 and F respectively.
q

~2=(q(p_ m)+2)-I

(X,-

W~G)'(X~, - Wz~,),
(7.7)

These are then inserted in (7.3) and (7.5) to obtain approximate regions for z~ and ~ based on normal approximations. The nominal 1 - f l probability for such a region will undoubtedly overestimate the actual value. At any rate an approximate 1 - f l probability region on an individual ~is given by
* '

(D2)

'

(7.8)

where x~(r) represents the/3 th percentage point of a chi-squared random variable with r degrees of freedom. Similarly

(;i- ~*)'D* l(~_ ~,) < x~(m )

(7.9)

yields an approximate 1 - fl probability region for q. If one wants to estimate a polynomial growth curve e.g. a'~', where a' = (1, t, t 2..... t m- 1) at a particular value t, then an approximate 1 - f l probability region is found from

( a "G - a "r d) < t . a D,~a

(7.1o)

For a simultaneous region on the entire individual growth curve whose probability is approximately at least as large as 1 - fl we obtain
(atTa _

aTd) ma'D*a

~ ,x2

< x~(m)"

(7.11)

Similar results are obtained for the group mean growth curve i.e. at a single point t a 1 - fl probability region is obtained from ( a ' ~ - a'~*) 2

a'D*a

< X~(1)

(7.12)

Growthcurve analysis

109

and for all t a region of probability approximately at least as great as 1 - fl is ( a ' ~ - a'?*) 2 ma'D*a < x~(m). (7.13)

In all cases (7.9-7.13), it is expected that the probability for the given regions is somewhat less than the stipulated 1-/3, due to the approximations involved, unless p and q are quite large relative to m. For some work on tolerance regions based on a frequency approach, the reader is referred to Bowden and Steinhorst (1973). Consider now predicting a new vector V which is distributed as

V]'cq + ,, o 2 ~ N ( ~Zq'q+1,o21)
and

~q+,l~,o2,r--N(~, w r

w ' + o21).

Now as before the posterior distribution of ~, when the prior distribution is uniform given o 2 and F, is as

~]o2, I ' ~ N ( ~ * , D * ) .
This permits the computation of the predictive distribution of V given X 1..... Xq and o 2 and F which is V 102, F, X ~ N( W?*, W(F + D ') W ' + o21).

(7.14)

Hence (7.14) can be used to generate an approximate predictive distribution with the insertion of estimates for o 2 and F, as before. For conditional prediction of V (z) given V ) again standard normal theory is applied to (7.14) so that approximately V(2)IX, V(O~N(W(2)zr * +A21AI~I(v(1)where A ----( AllA21 A22]A12] -- W ( F + D * ) W ' + o z I . Hence [ V(2)- W(2)~*_ A 2 1 A ~ I ( V 0 ) - Wo),~*) ]'A22.11

W(1)~*),A22.1),

(7.15)

[ v (2) -

w(2)

* - A2 A

v(')-

*) ]

< X/](P2) provides an approximate predictive region for V (2) given V ).

(7.16)

110

SeymourGeisser

One notes that this model and procedure makes many stringent assumptions and produces only approximate results--hence the model should be carefully checked and the results applied with caution. However, if this model is really appropriate it will produce the best results.

8o A sample reuse approach for conditional prediction We now describe a data analytic method called Predictive Sample Reuse (Geisser, 1974, 1975), which can be applied to conditional growth curve prediction, and does not require distributional assumptions. For convenience we make one small change in notation; instead of V being the future vector observations we shall relabel

V=Xq+|=I q + l [ (2) " [Xq+lJ


Suppose from the first q vectors, ^ X p . . . , X _ ~/ each at the s a m e p points we generate a predictor of Xq(~l, say X((qZ) ). Further suppose another predictor of X q(2) (27 + l is obtained, say 2 q + l ~ which depends only on the observed X q(1) +|" Finally we combine the two independently calculated predictors into a new predictor

f X o) )

2("7 =J I/2(:) a) q+l-. (q), 2(:7,; q+


for f~ E ~, ~ being a specified class of matrices. An interesting case is

(8.1)

(8.2) where f~ is p2XP2 matrix such that ~2 and I - f ~ are both non-negative definite. Define

x.(:) = a2((q~)_ ,.)+ (I - a)2.(:),

(8.3)

where a = 1..... q and 2((q2)_,,~)is the predictor for X~ (2) based on


X 1. . . . . X~_I,X~+ 1. . . . . Xq and of the same functional form as 2((q2~and 2 . (2) is the predictor of X~ z) based only on X~ (1) and of the same functional form

~?(2)1. a s lXq+

Further define a discrepancy measure

D(a)= ~ d(2~2),X~ 2))


a=l

(8.4)

Growth curve analysis

111

which is then minimized with respect to ~ within its given domain of definition. If ~ is the unique solution then the final predictor is given as

Y q~+ l = ~2/g~ + ( I - fi~2 ~+ l ] q


If
q

(8.5)
(8.6)

d(.,.)= E

( x Y ~ -- x ~

) ' (X~ " ( ~ -- X~ ~ )

and m <Pl and p2 = l, i.e. ~ is 1 l, a solution for combining predictors, based on simple least squares predictors, appears in Geisser (1975) where 2((q2~ ) = W(2)T, and' )(q(Z2, = W ( 2 ) ( W (l)t W(I)) - lw(1)'y(l),. ~Xq+l, though there only illustrated for m = 2. A general solution m a y easily be obtained for other forms of .~((q2~and )~q(2+) when m <Pl and P2 is arbitrary a s

E (x~ 2~-(2~/~(2~ _~2~), a 1~, ( q - l , , ~ )


ot=l

a=l

~2 ~2~ I , a ) -~2~2~"~ I (q a 1~, (q

l,a)

-xy))'

(8.7)

provided it exists and satisfies the constraints. The simplest way to achieve the solution (8.7) is to use the matrix differentiation technique described in De Waal a n d N e l (1978). In summary we have described a low structure data crunching device which simulates the predictive process as best it can, given a complete lack of distributional assumptions. The method has its roots in cross-validation analysis.

9o Group growth carve comparisons


For standard distributional assumptions, we have already indicated in Section 2 how certain traditional tests are executed for hypothese s on C.rD. In particular for polynomial growth curves these tests not only require distributional assumptions but, also that each individual vector be measured at the sam e p points. Both of these assumptions can be relaxed if we admit randomization or permutation tests. Zerbe and Walker (1977) present a method of implementing such a permutation test to ascertain whether several group mean growth curves can be considered to be essentially one group or not, i.e. each group is identifiable by some label and one tests for the relevance of the group label. The test can also be specified for a particular subinterval of time.

112

Seymour Geisser

Suppose N individuals comprise q groups so that t h e j th individual in the a th group is X,'j =(XI~j, X2~j . . . . . Xp~j). N o w the data for each individual is fitted by least squares to a polynomial of degree m ~ j - 1 wherep,j >m~j. Let m be the m a x i m u m of the m~j. Further m~j is also a polynomial of degree m by virtue of augmenting the residual m -- m~j terms of the polynomial with zero coefficients. Let the polynomial fitted to Xj be represented as x,j(t) and be considered to have a population average of T~(t). Further assume that we wish to test whether the hypothesis that ~-,(t)= ~'(t) a = 1. . . . . q for all t over some interesting interval of time t~ <~t<-<t 2. Hence the authors propose the following analysis: Let

x~.(t) = N -1 x~j(t),
j

x..(t) = N -' ~_, ~ x~i(t )


a j

(9.1)

and
/2

f ( x l ( t ) - xa(t))Zdt
tl

(9.2)

be defined as a measure of the squared distance between two curves and arrange the data in Table 2.
Table 2
ANOVA Source Groups Within groups SS DF q- 1 N- q

B = 1~, N ~ f tt](x,.(t ) - x..(t)) 2 d t E = Y,~Nj f ~(x,j(t) - x~.(t)) 2 d t

It would appear that an F ratio statistic, say, F0= [ ( N - q ) B J / [ E ( q 1)1 (9.3)

should be sensitive to alternatives in relation to the degree that


t2

~. N~ f ('ca(t) - ~'(t))2 dt
o~ tl

(9.4)

departs from zero. The F ratio is computed for every permutation of the N individual curves such that there are N~ in the 0t TM group. The significance

Growth curve analysis

113

level fl is the number of such F ratios that are at least as large as the observed permutation given by F 0. Since the n u m b e r of such ratios m a y be prohibitive to compute it is suggested that/3 be estimated by choosing at r a n d o m d of these permutations and noting the n u m b e r f which are at least as large as F 0 and using the binomial distribution to place confidence limits about ft.

10 Concluding remarks
Growth curve analysis as presented by most statisticians has, until rather recently, generally stressed testing and estimation of the set of parameters ~-. We have attempted to shift the focus so as to emphasize more strongly, prediction, and yet discuss the traditional concerns. There are several reasons for the predictivistic point of view. The first is that an investigator is often more concerned with prediction than testing and estimation even if the growth curve model could, by some stretch of the imagination, be assumed a true representation of the physical process underlying the responses. Secondly,-it is quite clear that growth curve models do not in this sense provide such an exact physical specification. T h e y are basically statistical paradigms that are particularly convenient and useful for vastly complex p h e n o m e n a about which knowledge is often incomplete, fuzzy and generally lacking in the fundamental relationships. Lastly, prediction involves the entity that investigators actually m e a s u r e - - t h e response itself. Thus predictions can be, to a degree, validated by further investigation which is not the case with those hypothetical c o n s t r u c t s - - t h e parameters of the model, unobservable as they are. Hence testing, e.g. two groups are the same or differ in regard to their growth curve parameters, is really a selection problem in that we should choose the alternative that enhances prediction from one or both of the groups. The conventional 0.05 level for rejecting a null hypothesis is probably not very good for this purpose. That the estimation of ~- can be of some interest, e.g. estimating the differential growth rate of two groups whose growth curves are both approximately linear, we do not deny. But again this is, in a sense, associated with, and finally subordinate to prediction because the differential growth rate is really useful for describing where a future response from a r a n d o m unit from one group will be, compared to one f r o m another group. This of course, can be ascertained without recourse to estimating T. This work was supported in part by a grant f r o m the National Institute of General Medical Sciences.

114 References

Seymour Geisser

Bowden, D. C. and Steirthorst, R. K. (1973). Tolerance bands for growth curves. Biometrics 29 (2) 36t-371. Bo, G. E. P. (1950). Problem in the analysis of growth and wear curves. Biometrics 6, 362-389. Cole, J. W. L. and Grizzle, 3. E. (1966). Applications of multivariate analysis of variance to repeated measurements experiments. Biometrics 22 (4) 810-828. Collier, R. O., Baker, F. B., Mandeville, G. K. and Hayes, T. F. (1967). Estimates of test size for several test procedures based on conventional variance ratios in the repeated measures design. Psychometrika 32, 339-353. Danford, M. B., Hughes, H. M. and McNee, R. C. (1960). On the analysis oN repeated measurements experiments. Biometrics 16, 547-565. De Waal, D. J. and Nel, D. G. (1978). Parametric Multivariate Analysis. University of the Orange Free State, Bloemfontein. Fearn, T. (1975). A Bayesian approach to growth curves. Biometrika 62 (1) 89-100. Geisser, S. (1964). Posterior odds for multivariate normal classification. J. Roy. Statist. Soc. Ser. B 26, 69-76. Geisser, S. (1965a). A Bayes approach for combining correlated estimates, J. Amer. Statist. Assoc. 60, 602-607. Geisser, S. (1965b). Bayesian estimation in multivariate analysis. Ann. Math. Statist. 36, 150-159. Geisser, S. (1966). Predictive discrimination. In: P. Krishnaiah, ed., Multivariate Analysis, Academic Press, New York, 149-163. Geisser, S. (1970). Bayesian analysis of growth curves, Sankhyd Ser. A 32, 53-64. Geisser, S. (1971). The inferential use of predictive distributions. In: V. Godambe and D. Sprott, eds., Foundations of Statistical Inference, Rinehart and Winston, 456-469. Geisser, S. (1974). A predictive approach to the random effect model. Biometrika 61, 101-10'7. Geisser, S. (1975). The predictive sample reuse method with applications. J. Amer. Statist. Assoc. 70, 320-328, 350. Geisser, S. and Cornfield, J. (1963). Posterior distributions for multivariate normal parameters. J. Roy. Stat. Soc. Ser. B 25, 368-376. Geisser, S. and Greenhouse, S. W. (1958). An extension of Box's results on the use of the F distribution in multivariate analysis, Ann. Math. Statist. 29, 885-891~ Geisser, S. and Kappenman, R. F. (1971). A posterior region for parallel profile differentials. Psychometrika 36 (1) 71-78. Greenhouse, S. W. and Geisser, S. (1959). On methods in the analysis of profile data. Psychometrika 24, 95-112. Grizzle, J. E. and Allen, D. M. (1969). Analysis of growth and dose response curves. Biometrics 357-381. Halperin, M. (1961). Almost linearly-optimum combination of unbiased estimates. J. Amer. Statist. Assoc. 56, 36-43. Khatri, C. G. (1966). A note on a MANOVA model applied to problems in growth curve. Ann. Inst. Statist. Math. 18, 75--86. Lee, J. C. (1977). Bayesian classification of data from growth curves. South African Statist. J. 11, 155-166. Lee, J. C. and Geisser, S. (1972). Growth curve prediction. Sankhyd Ser. A 34, 393-412. Lee, J. C. and Geisser, S. (1975). Applications of growth curve prediction. Sankhy8 Ser. A 37, 239-256.

Growth curve analysis

115

Lee, Y. S. (1972). Some results on the distribution of Wikks Likelihood-ratio criterion. Biometrika 95 (3) 649-664. Lindley, D. V. and Smith, A. F. M. (1972). Bayes estimates for the linear model (with discussion). J. Roy. Statist. Soc. Ser. B 34, 1-41. Nagel, P. and De Waal, D. (1978). Bayesian classification and estimation of growth curves South African Statist J. (in press). Potthoff, R. F. and Roy, S. N. (1964). A generalized multivariate analysis of variance model useful especially for growth curve problems. Biometrika 51, 313-326. Rao, C. R. (1959). Some problems involving linear hypotheses in multivariate analysis. Biometrika 46, 49-58. Rao, C. R. (1965). The theory of least squares when the parameters are stochastic and its application to the analysis of growth curves. Biometrika 52, 447-458. Rao, C. R. (1966). Covariance adjustment and related problems in multivariate analysis. In: P. Krishnaiah, ed., Multivariate Analysis, Academic Press, New York, 87-103. Rao, C. R. (1967). Least squares theory using an estimated dispersion matrix and its applications to measurement of signals. Proe. 5th Berkeley Syrup. 1, 355-372. Rao, C. R. (1968). A note on a previous lemma in the theory of least squares and some further results. Sankhy6 Ser. A 30, 245-252. Rao, C. R. (1969). A decomposition theorem for vector variables with a linear structure. Ann. Math. Statist. 40, 1845-1849. Rao, C. R. (1972). Recent trends of research work in multivariate analysis. Biometrics 28 (1) 3-22. Rao, C. R. (1975). Simultaneous estimation of parameters in different linear models and applications to biometric problems. Biometrics 31 (2) 545-554. Schatzoff, M. (1966). Exact distribution of Wilks' likelihood ratio criterion. Biometrika 53, 347-358. Smith, A. F. M. (1973). A general Bayesian model. J. R. Stat. Soc. Set. B 35, 67-75. Timm, N. H. (1979). Multivariate analysis of variance of repeated measurements. Handbook of Statistics Vol. I (in press). Wilson, K. (1975). The sampling distribution of conventional, conservative, and corrected F-ratios in repeated measurements designs with heterogeneity of covariance. J. Statist. Comp. and Simulation 3, 201-215. Zerbe, G. O. and Walker, S. H. (1977). A randomazation test for comparison of groups of growth curves with different polynomial design matrices. Biometrics 33 (4) 653-659.

P. R. Krishnaiah, ed., Handbook of Statistics, Vol. 1 North-Holland Publishing Company (1980) 117-132

~_

Bayesian

Inference

in MANOVA

S. J a m e s

Press

1,

Introduction

This article focuses attention on the various models of the analysis of variance from the Bayesian inference point of view. In particular, we consider the multivariate analysis of variance (MANOVA). Let yj denote an N 1 vector of observations on the jth dependent variable (yield of the jth crop for N farms, score on the jth type of examination for N subjects, etc.). Let X : N x q denote the (design) matrix of observations of q independent variables (including a constant term) that are related to the jth independent variable (q characteristics of each of the N farms, q characteristics of each of the subjects who took the jth type of examination, etc.). Then, assuming linearity of the relationships,
Y l [ X -= X 91 -k- U1 (Nq) (ql) (NI)

(Nxl) yzlX=

92

u2

yplX=

9p +

represents a system of N observations on p dependent variables and q independent variables; flj : q 1 denotes t h e f h unknown coefficient vector, and ~ : N 1 denotes the vector of errors corresponding to the jth dependent variable, for each of the N cases. The error vectors may of course include omitted variables, misspecification structure of the model (nonlinearities, etc.), and errors in observing the dependent variable. We write the above system of equations in the more convenient compact form (U p)

fix=

(N q) (q p)

(N p) 117

(1)

118

S. JamesPress

where Y = ( Y l . . . . . Yp), U = ( u 1..... Up), a=~(/~ 1. . . . . tip). Also let U'=--(vi . . . . . vN), so that vi' denotes the i th row of U, and the prime denotes transpose. The Bayesian model of M A N O V A will be developed in Section 4 as a special case of the Bayesian multivariate regression model. We assume the fixed effects model. We first develop the Bayesian multivariate regression model.

2.

Assumptions of the model

To effect inferences about the unknown parameters of the model it is necessary to make some assumptions. Violations of these assumptions will lead to new models and modifications that must be m a d e to the basic model. 1. p + q < N; this assumption permits all parameters to be estimated. 2. rank of X(Nq)=q; if rank of X < q estimators of B will not be unique. 3. E(vi)=O; v a r ( v i ) = - E ( v i v j ) = X , for all i. Z is assumed to be an arbitrary, positive definite, symmetric matrix of order p. Thus, we are guarano teed that v i has a density in p-dimensional space. The fact that Y, does not depend upon i means that the error vectors all have the same covariance matrix. This assumption is the multivariate version of homoscedasticity; it helps to provide estimators of the elements of B that have m i n i m u m variance. 4. T h e v i vectors are mutually independent. Then, E ( v i v j ) = O for iv~j; that is, the vi vectors are mutually uncorrelated. Assumptions 3 a n d 4 together provide the basis for m i n i m u m variance estimators. The implication of the assumption is that the N farm yields (or test scores of different subjects) are independent. 5. E (vi) = N(0, X), for all i = 1..... N; that is, the probability law of the v~ vectors is normal with mean zero and covariance matrix Z. Thus, yields of the p crops or scores on the p types of examinations are correlated. 6. X is fixed and known; this implies that we need not m a k e any assumptions about the distribution of X.

3.

Estimation

The unknown parameters in the model are (B, X). qqaey will be estimated by combining subjective (prior) information with the observed data, via Bayes theorem.

Bayesian inference in MANOVA

119

It is well known (and straightforward to show) that the maximum likelihood estimators of B, and 31. are given by

= ( p , .....
(q Xp).

x ) ix'r,

2
(PP)

= ~(Y-XB)'(Y--XB).

Thus, if the matrix of residuals (estimated errors) is given by

~1= Y - X B ,

and

V------U'U,

an unbiased estimator of Z is given by

2=

i t)'O. N-q

It will be useful to rely upon an orthogonality property of MLE's, namely

(Y-XB)'(Y-XB)= X(B

V + ( B - B ) ' ( X ' X ) ( B - B).

(2)

This property is readily demonstrated by writing ( Y - XB) as [( Y - X/~) B)], expanding the sum of squares and noting that the cross product terms vanish.
-

3.1. Likelihood function


Since the vi vectors are independent, the probability density I of the error matrix is
N

p(UIX) =p(v~ ..... VNIX)= ]-[ p(ViIX).


j=l

Since
1 -

1/2(vjX-'~),

p(vjlE)= (2~)P/21~11/---~2e P(UI~")oclZ[-N/2exp

{(

--

) E (vjE-tvj) ,
zj=l

where cc denotes proportionality. Since U' U= Y][vjvj, the likelihood becomes p(UlY~)cclXl-U/2exp{( - )tr U'UE-~}, where tr(.) denotes the
tWe will be using p(.) to denote densities in a generic way and will not change letters to distinguish one density from another; that will be accomplished by changing arguments.

120

S. James Press

trace operation. Now change variables to find the density of Y. Since U= Y - X B , the Jacobian of the transformation is unity and the new density becomes

p(YIX,B, y~)~ I~1

N/2 exp { ( - ) tr( Y-- X B ) ' ( Y - X B ) Z - 1).

Use of eq. (2) gives the likelihood function p( Y IX, B, ~ ) oc I~ [- N/Z e x p ( ( -- ) tr[ V + (B - B ) ' ( X ' X ) ( B - g ) ] --1}.

(3)
3.2. Prior to posterior analysis To make posterior inferences (those made after observing the sample data) about (B, Z), we must first assess a prior distribution (one based only upon information available prior to observing the sample data) for (B, E). This prior distribution is very personal, in that it is the distribution representing the subjective beliefs of, and prior information available to, the decision maker, z The posterior distribution will of course be relative to the same individual. Thus, the entire M A N O V A analysis yields changes in subjective beliefs, predictions, or decisions all based upon the assessed prior of a given individual. We will consider two classes of priors, non-informative, and informative. Case 1. Non-informative prior For this case we adopt the prior density (see Geisser and Cornfield, 1963)

where

p(B) cc constant,
That is,

p ( ~ ) ~ I~.I-(P+l~/2

(4) This assessment assumes B and Z are independent, a priori; the elements of B are all independent, a priori, and they are all uniformally distributed over the entire real line. Thus, the prior distribution of B is improper. Eq. (4) also implies an improper prior distribution for E. In one dimension 2The term "decision maker" will be used throughout this article to refer to the individual for whom the MANOVA analysis is being carried out, whether or not he will actually make a decision at this time.

Bayesian inference in MANO VA

121

(p= 1), this assumption would imply that logZ is uniformly distributed over the entire real line; so we are adopting ap-dimensional generalization of that idea. The distribution we have adopted, though improper, will lead to proper posteriors, and meaningful statistical inferences. We propose this prior distribution for many reasons. A principal reason for using the non-informative prior is that in many respects the distribution corresponds to our intuitive notion of prior ignorance about (B, E); that is, the idea of our subjective feelings about (B, Z), a priori, being very vague, and not well defined; we are inclined to feel that all possible values are equally likely; or we wish to adopt a posture of trying our best to let the data "speak for themselves", objectively, without "contaminating" the inferences with individual, subjective beliefs. The posterior inferences made on the basis of this non-informative prior distribution will often be the same inferences that would be made by the frequentist statistician. For this reason, the non-informative prior provides sort of a benchmark, or point of departure, for Bayesian inferences. That is, inferences based upon the non-informative prior are those dictated by what we would believe in the "absence of prior information", and those based upon the informative prior are those based upon what we would believe when we try to combine our substantive subjective prior beliefs with observation. This non-informative prior distribution can be shown to correspond to minimal information in a Shannon information sense (Shannon, 1948). It also has the property that it is invariant under parameter transformation groups (Jeffreys, 1961). Thus, probability statements made about observ.~ able random variables should remain invariant under changes in the parameterizations of the problem. In proposing fiducial inference R. A. Fisher implied a non-informative distribution on the parameters (for p = 1), as did Fraser (1968) in his development of structural inference. A summary of these developments is given in Press (1972, Chapter 3). Bayes theorem implies that the posterior density is proportional to the product of the prior density and the likelihood function. In the multivariate regression problem, we multiply eqs. (3) and (4) to yield the joint posterior density p(B,~-,lX, Y) cc [Z,I-(~+p+~)/2
exp{ ( - 1) t r Z - ' [ V +(B-

B)'(X'X)(B-/~) ] ).

(5)
Inferences about B, without regard to Z, may be made from the marginal posterior density of B. The marginal posterior density of B is

122

5;, James Press

given by

p(BIX, Y)=

Z>O

p(B, EIX, Y)d.

(6)

We will actually want to make inferences about O(pq)~-B'. Let t~=:/~ '. The integration in (6) is readily carried out using the properties of the Wishart distribution (see, e.g. Press, 1972, p. 229). The result is

p(OlX, Y) o: [V+ ( 0 - O)(X'X)(O- 0)' I-N/2.


Tile expression on the right is readily recognized to be the kernel of a matrix T-distribution (Kshirsagar, 1960). The complete density is given, for - oe < 0 < + ~ , by
p(OlX,

Y) =

k[ v I ( N - q ) / 2 I x t x I P / 2 [

V"[- ( 0 -- O ) t ( X ' X ) ( O

- 0)]-N/2

(7)
where N > p + q - l, V>O,X'X >0, and

k=

Fq(N/2)

This result is identical with that found by Geisser (1965, eq. (4.8)). The notation V > 0, for any matrix V, means that V is positive definite symmetric. The notation Fq(t) denotes the q-dimensional gamma function, defined as Fp(t)=

f
X>O

IXlt-(P+l)/2e-tr(x)dX
p

~ q/.p(p -1)/4 H j=l

F(t\

The result in eq. (7) implies, in part, that the rows and columns of O, a posteriori, follow multivariate Student t-distributions, and the individual elements of 0 follow univariate Student t-distributions. It also follows that E(O[X,Y)=O, and if O ( p q ) ~ ( ( O 1 ) ( p l ) . . . . . (Oq)(pl)), a n d O~lpq) ~-(0; ..... 0q), var(0 IX, Y)= ( 1 / ( N - p - q - 1 ) ) V ( X ' X ) - 1 . Other properties of the distribution in (7) have been given by Dickey (1967) and Geisser (1965).

Bayesian inference in MANOVA

123

From an operational viewpoint, it is useful for computing confidence regions on 0 to note that (see Geisser, 1965)

u-

Ivw Iv+ ( o - O ) ' ( x ' x ) ( o -

0)i

has a n gp, q, N q distribution, as defined by Anderson (1958); i.e., it is distributed as the product of independent beta variates. Thus, the posterior distribution of U (where 0 is the random variable) is the same as the sampling distribution of U (for fixed 0). So a posterior region for 0 is found from the relation

e ( u ( o ) <. uo,p, ,N_q} = 1


where Uot,p,q,N_ q is the a th percentage point. Therefore the Bayesian region on 0 is equivalent to the confidence region. Inferences about 22, without regard to B, may be made from the marginal posterior distribution of 22. The posterior density of E is found by integrating eq. (5) with respect to B, and is given by p (NIX, Y)cc l y . I - ( N +p- q+ 1)/2 exp{ ( - )tr 22-1V }. The expression on the right is readily recognized as the kernel of an inverted Wishart distribution (see e.g. Press, 1972, p. 109). The complete density is given, for V > 0, by
P(~"IX, Y) =

i vI(N-q)/2 C122{(N+p--q+ 1)/2 exp{ (-- )tr Z - 1 V },

(8)

where

It follows from (8) that


E(Z[X,Y)U-p-q-l

'

where N - p - q - l > O . Variances and covariances of (Y.[X,Y) may be found, e.g., in Press (1972, p. 112). Posterior inferences regarding the diagonal elements of 22 (or blocks of diagonal elements) may be made from the marginal densities of the distribution in (8). The marginals of the

124

S. James Press

diagonal elements of JE follow inverted gamma distributions while the marginals of the block diagonal elements of N also follow inverted Wishart distributions (see, e.g., Press, 1972, p. 11 I).

Case 2. Informative prior We now treat the case in which the analyst has some specific subjective prior information he would like to interpose in this problem. The mechanism we propose for introducing this information involves the so-called (generalized) natural conjugate (this class was introduced by Raiffa and Schtaifer (1961)) family of distributions. The approach we suggest is to represent the prior distribution of (B,N) by a parametric family of distributions whose members are indexed by certain fixed, but as yet undetermined, parameters (often called hyperparameters to distinguish them from the parameters that index the sampling distribution). The hyperparameters are then assessed for the decision maker on the basis of his specific prior information. For example, the decision maker" might not know the value of a regression coefficient Ou, but he might feel 0u is most likely equal to about 0~ although it could be greater or less than 0* with probabilities that get steadily smaller as we depart from 0~ in either direction, symmetrically. That is, 0,7 is assumed to follow some unimodal, symmetric distribution (such as normal) centered at 0". The decision maker could be "pressed" further and he might conjecture that in his view it is unlikely that the value of 00. would lie outside some stated range. These assertions could be used to assess some of the hyperparameters by taking the roughly stated coefficient value to be the mean of the corresponding prior distribution; the stated range could be used as the value of three standard deviations of the corresponding prior distribution. Extending these ideas to many parameters will yield a complete assessment of the hyperparameters. The assessment problem is not simple and must be carried out carefully. It involves forcing the decision maker to introspect about the problem and to draw upon both his past experiences, and any theory he believes about the phenomenon at issue. There are now a number of computer programs available for assisting the analyst to assess prior information from the decision maker (see Press, 1979). Such problems greatly facilitate the problem of assessment in a multiparameter problem such as M A N O V A . The regression coefficients will be assumed to be jointly normal, a priori, while the covariances will be assumed, a priori, to follow an inverted Wishart distribution. The regression coefficients are first expressed as a long concatenated vector. Recall that B(qp)~ ((/31)(qX 1)..... ( flP)(q 1))" Now define fl(;q 1)E(fl~ ..... fl;); similarly for/3. Next note the convenient

Bayesian inference in MANOVA

125

identity relationship: ( fi - / } ) ' [ Z - 1 ( X ' X ) ] ( fl -/~) = tr Z-1 [ (B-- B)'(X'X)(B - B)I"

(9)
This identity is established readily by recalling the definition of direct product and examining the general element of both sides of the identity. Now assume that a priori,/3 and Z are independent and follow densities p( fl I,,F) cc exp ( (-- )(/3 - q))'F- '(/3 - q)) ),

p(Y.IG,m)o:lY.l-m/2exp{ ( -

) tr[ G Z - l] },

for G > 0 , m >2p, so that the joint prior density is given by

p(B,Y.[q),F, G,m)oclzl-m/2exp{ ( - ) tr[ GZ -l + ( f l - ~ ) ' r - ' ( fl-q~) ]}. (10)


Note that (~,F, G,m) are hyperparameters of the prior distribution that must be assessed. We now go forward in the analysis assuming (0, F, G, m) are known for a given decision maker. Bayes theorem yields the joint posterior distribution of (B,Z), for the case of the non-informative prior, by multiplying the likelihood function in (3) by the prior in (10). The result is (B, NIX, Y,O,F, G,m)eclZ]-(m+N)/:exp((--()[ (B-O)'F-'( fl-O) +trZ-'[(V+ G) + ( B - / } ) '
~)J]). (11)

x(x'x)(s-

Integrating eq. (11) with respect to Z, in order to obtain the marginal posterior density of B, gives, for all - 0c < B < + o%

p(BIX, y,O,F,G,m)

e x p { ( - )( B - o ) ' F - I ( f l - o )

I( v + G) + ( B - ~)'( X ' X ) ( B - ~)1 (N+ m-p-,)/z" (12)


This density being, the product of multivariate normal and matrix T-densities, is very complicated. It is therefore very difficult to use it to make

126

S. JamesPress

posterior inferences, except numerically. It is straightforward to develop a large sample normal approximation, however. 3 The result is that for large N, it is asymptotically true that

e( ~3IX, Y, ~,F, G,m) -~N( /3o,J


where

-1),

(13)

/30== - [ F - ' +( V + G)-'(X'X) ]-' ( F - Iq~+ [( V+ G)-I(X'X) ] ~ ),


and

J-~F-I+(V+G)

'@(X'X).

Thus, in large samples, posterior inferences about the elements of/3 may be made from (13), without regard to Z. The marginal posterior density of X is readily found by using the identity in (9) in eq. (11), completing the square in/3, and integrating the resulting normal density with respect to/3. The resulting density is

p(Y~IX, Y, dp,F, G, m) c~I 1-(m+u)/2lF - l + y - , ( X , X ) I-1/2


exp(--1) tr[Y=-I(v+ G ) + / 3 ' ( ~

IX'X)fi

-- [ F-'q)+(E-'X'X)t~]'[ F - l + Y.-i(X'X) ]-' [ F-'q~+ (E-I X'X)I~] }.


4.

(14)

MANOVA models

4.1. One way classification


Adopt the p-dimensional, one way layout (classification), fixed effects model. Specifically, assume
z (t) = + (15)

where a = 1..... q; t = 1..... 7 ; and alently, assume

z~(t) is a p 1 response vector. Equiv-

~[z,~(t)]=N(O,~,~.),

Z>O.

3The approximation is found by expressing the T-density portion of eq. (11) as an exponential, and then letting T become large.

Bayesian inferencein MANO VA

127

That is, there are observations on q populations, each p-dimensional, with c o m m o n covariance matrix, and we want to compare the mean vectors, and linear functions of their components. Accordingly, define
Y* ~ - [ Z I ( I ) . . . . . zl(Zl);...;Zq(l),...,Zq(Tq)]; (pXN) (pXN)

U'-~

[ v,(1) . . . . . DI(T,); . .. ;l)q(l ) . . . . .

q(Zq)]'~

B ' ~ [ ( 0 1 ) ( p x l ) . . . . . (Oq)(pxl) ] ~ ( p x 0q")~ (pq)

r 1

X (NXq )

ii i
B' G
(rXp) (pXq) (qX1)

Tq

Note that N--=YYT~. With these definitions the M A N O V A model in eq. (15) becomes the regression model ( Y = X B + U). Suppose we are interested in a set of r comparisons of one dimensional means. Then define

Lp = C,
(rl)

=-C,0C2,

(16)

where C 1 and C 2 are constant, preassigned matrices. The components of ~b are linear combinations of the elements of 0. We can m a k e posterior inferences about ~b, or the elements of qJ, from the posterior distribution of ~.

4.1.1. Non-informative prior


In the case of a non-informative prior distribution on (0, E), the marginal posterior density of 0 is the matrix T-density given in eq. (7). The linear transformation in (16) yields for the posterior density of ~, the multivariate

128

S. games Press

Student t-density

p(g~lx, Y) ~ { G ( x ' x ) - ' c 2


where ~us,

+ (q~ - ~)'(c, vc;)-l(lp

-- I~)}--(v+r)/2

(17)

t~-CIOC2, v=N-(p+q)+ 1. E[tPIX, Y]=(~,


var[~lX, Y] = [

C2(X'X)-'C2] P -- 2 ( C 1 VC[),

As an illustration of tile use of eq. (17) suppose we would like to make posterior inferences about simple contrasts, i.e., simple differences in mean vectors, or in their components. Take r = p , so that C 1 is the identity matrix of order p, I. Take C 2 to be the q-vector given by C~ =[1, - 1,0 ..... 0]. Then (pxl)
i,
(p X 1)

= CIOC 2 = 0 1 - 0 2 ,

= ~, - ~ = e, - e~,
1 ro

where
_

zo---

Ta t = l

E z~(t).

In this case, eq. (17) gives the posterior density for the difference in the mean vectors of populations 1 and 2. As a second example, take r = 1, (C1)(,p)=(l,0 ..... 0), and (C~)oq)~ ( 1 , - 1,0 ..... 0). Then, ~ ( 0 1 1 - 0 1 2 ). That is, ~ denotes the difference in the first components of the mean vectors of populations 1 and 2. The posterior density of such a simple contrast is found from eq. (17) as

P(q'lX, Y) oc{ (a,~- 2a~2+ a2E)+ ( v-~ )( q'- ~)2) --(N-p-q+2)/2


(18)
where

(X'X)-I=--A = (ao.), and V=--(vo. ). That is, for ~==011- 012, (~-~) [ Vll(au- 2a,2+ a22 ]

Bayesian inferencein MANO VA

129

follows a standard univariate Student t-distribution with p ~ - N - - p - q + 1 degrees of freedom. Note that while in the classical sampling theory case, only confidence interval statements can be made about q~, and they must be made for what might happen if many samples were to be collected, in the case of Bayesian inference, the entire distribution (posterior) of ~/, is available for inferences, and all assertions are made conditional on the single sample actually observed. Joint inferences about several simple contrasts can be made from the higher dimensional marginal densities of (17) (higher than univariate).

4.1.2.

Informative prior

The marginal posterior distribution of B was given in eq. (12). It is very complicated, as would be inferences based upon it. For this reason we examine, instead, the large sample approximation case. The asymptotic posterior distribution of B is normal, and is given in eq. (13). Since all elements of B are jointly normally distributed in large samples, a posteriori, so are all linear comparisons of means. Thus, in large samples, the posterior distribution of q~=---GB'C 2 is also multivariate normal and all contrasts may be readily evaluated.

4.2. 4.2.1.

Two-way (and higher) classifications No interactions

Adopt the p-dimensional, complete, two-way layout with fixed effects, no interaction between effects, and K observations per cell, K > 1. Then, the response vector is conventionally written as (19) where i = 1..... I; j = I ..... J; k = 1..... K. Here, of course, /x denotes the overall mean, a i the main effect due to the first factor operating at level i, 6j denotes the main effect due to the second factor operating at level j, and v~k denotes an error term. Note that all terms in eq. (19) are p-dimensional column vectors. Since the errors have been assumed to be normally distributed, it follows that

~(zij~[O,Z)= N(Oij, E),

E>O.

130

S. James Press

N o w we place the M A N O V A model in (19) in regression format. A c c o r d ingly, define


y
l

(pN)

(pxN)

U'

- [ (v111)~ 1~,,~1~ ..... v~l,,;... ;~,.. . . . . . (v,.,,,)~,, ,~],


(p q)

o-

g'-[(O,O~l~,...,O,,;O,~
<--K-~ ~--K-->
.... ,

.....

O,~;...;O~,...,O,~],

X t

(pXN)

<--K--~

i ..... 1

1 .....

where

q=-H,
Now

N = IlK.

Y = X B + U.
For this situation, for LP(rX0 = (C1)(rp)O(pq)(C2)(q 1), if r =p, and C 1= I, so that tp = OCz, a n d a n o n - i n f o r m a t i v e prior, the posterior density of ~p is given by the multivariate Student t-density

p(@lX, y) oc { C ~ ( X ' X ) - I C2 + (q~ - ~)' V-'(tp - ~) } -(p +p)/Z,


where v - - I J ( K - 1 ) - ( p - 1). As an example, take the case in which (C~) 0 x q)~[1, - 1 , 0 . . . . . 0], so that

(px i) But 011 - 021 - cq - o~2, and 011 - 021 ~-" ~1 - (~2. By choosing C 2 appropriately, we obtain the posterior distribution of any linear c o m b i n a t i o n of main effect vectors, and by choosing C 14=1, we obtain linear combina~ tions of their c o m p o n e n t s . F o r the case of an informative prior, normally distributed posteriors result for all contrasts, as in the case of a one w a y classification. Inferences about contrasts for higher w a y layouts for both n o n - i n f o r m a tive and informative priors are obtained in the same w a y as we have carried out the analyses for the one a n d two way layouts.

Bayesian inJerence in M A N O VA

131

4.2.2. M A N O V A with interaction For the simplest case of a fixed effects model with interaction we examine the complete two w a y layout with K observations per cell, K > 1. Then,

(z,j~)~. 1) = ~ + ~, + ~++ T0 + v 0 k - (o0.)<~ 1)+ v0~,


where all terms are p-dimensional; "to denotes the effect of an interaction between the first factor operating at level i, a n d the second factor operat: ing at level j ; i = 1. . . . . 1; j = 1. . . . . J ; a n d k = 1. . . . . K. It n o w follows that E(Zo.k[O,Z)=N(Oo, E ), ~>0.

N o t e that (X, Y , B , U ) m a y all be defined exactly as they were above for the case of a two w a y layout without interaction (the difference between the two models b e c o m e s a p p a r e n t only in the definition of 0). Thus, as before, if ~(px ~)= O(pxq)(Cz)(qx1), and ~ = 0C 2, where q=--IJ, the posterior density of ~ is given b y

p(~lX, r)~:{ c~(x'x)

~c~+(~-4)'v-'(~-~)}

~+~)/~,

for p = - - I J ( K - 1 ) - ( p - 1). N o w recall that to ensure estimability (identifiability) of all of the parameters, it is c u s t o m a r y to impose the constraints a + = O, ~ + ~-~O, Yi+ = 0 for all i, a n d y +j = 0 for all j ,

where a plus denotes an averaging over the subscript. Thus a+ =--1- lY~llaio It now follows that TO = O0 -- Oi+ -- 0 +j + 0 + +o So every 70 is just a linear function of the 0o (as are the a~ and 6fl. Thus, posterior inferences a b o u t the m a i n a n d interaction effects m a y be m a d e by judicious selection of C 2. Inferences in higher w a y layouts with interaction effects are m a d e in a completely analogous way.

References
Anderson, T. W. (1958). A n Introduction to Multivariate Statistical Analysis. Wiley, New York. Dickey, J. M. (1967). Matric-variate generalizations of the multivariate t-distribution and the inverted multivariate t-distribution. Ann. Math. Statist. 38, 511-518.

132

S. James Press

Fraser, D. A. S. (1968). The Structure of Inference. Wiley, New York. Geisser, S. (1965). Bayesian estimation in multivariate analysis. Ann. Math. Statist. 36 (1) 150-159. Geisser, S. and Cornfield, J. (1963). Posterior distributions for multivariate normal parameters. J. Roy. Statist. Soc. B 25, 368-376. Jeffreys, H. (1961). Theory of Probability (third edition). Clarendon Press, Oxford. Kshirsagar, A. M. (1960). Some extensions of the multivariate t-distribution and the multivariate generalization of the distribution of the regression coefficient. Proc. Cambridge Phil. Soc. 57, 80-85. Press, S. J. (1972). Applied Multivariate Analysis. Holt, Rinehart and Winston, New York. Press, S. J. (1979). Bayesian Computer Programs, In: A. Zelhaer, ed., Stuch'es in Bayesian Econometrics and Statistics in Honor of HaroM Jeffreys. North-Holland, Amsterdam. Raiffa, H. and Schlaifer, R. (1961). Applied Statistical Decision Theory. Harvard University Press, Boston. Shannon, C. E. (1948). The mathematical theory of communication. Bell System Tech. J. 27, 379-423, 623-656.

P. R. Krishnaiah, ed., Handbook of Statistics, Vol. 1 North-Holland Publishing Company (1980) 133-177

Graphical Methods for Internal Comparisons in ANOVA and MANOVA


R. Gnanadesikan

1o Introduction

The analysis of variance, in addition to its concerns with formal statistical procedures such as estimation and tests of hypotheses as discussed in the other chapters of this volume, has important value as a data analytic tool for summarizing patterns of variability and underlying structure in data. Typically, in analysis of variance situations, one wishes to use the summaries (such as contrasts or mean squares in ANOVA and dispersion matrices in MANOVA) to answer m a n y rather than one or a few questions regarding the structure of the data. For example, even from the formal viewpoint of tests of significance there is not only interest in an overall F-test but also in simultaneous tests of several hypotheses and the so-called multiple comparisons problem (see Chapter 21). Given the important objective of assessing the relationship amongst, and the relative importance of, the various experimental factors as they affect the observed one or more response variables, one needs ways of focusing at least on the relative magnitudes of relevant summaries that arise in analyses of variance. F r o m a data analytic viewpoint, one would like to have procedures which use some sort of statistical model for aiding the process of making comparisons amongst the members of collections of comparable quantities, while at the same time not requiring a commitment on any narrow specification of objectives, including the unquestioned acceptance of all the assumptions made in the model. The aim should be the facilitation of detecting both anticipated and unanticipated phenomena in the data. Thus the techniques should have value not only for identifying possibly real effects but also for indicating the presence of outliers, heteroscedasticity and other peculiarities which are often assumed to be non-existent by the formal model. Examples of collections of comparable quantities include a set of singledegree-of-freedom contrasts, a collection of A N O V A mean squares or 133

134

R. Gnanadesikan

MANOVA dispersion matrices, and a group of residuals. Procedures for making comparisons of the sort described above among such comparable quantities are called internal comparisons techniques and this chapter discusses several probability plotting techniques for such internal comparisons in ANOVA and MANOVA. Exhibit 1 shows a two-way categorization of orthogonal analysis of variance situations. One way of the categorization pertains to the dimensionality, p, of the response variables in the analysis, viz. ANOVA (p = 1) or MANOVA (p > 1). The second factor in the categorization pertains to the orthogonal decomposition of n-space (n = n u m b e r of observations) as specified, for instance, by the design of the experiment. The experimental design specifies an orthogonal decomposition of n-dimensional space into subspaces each of which is associated with a meaningful facet of the experiment. The orthogonality of the decomposition is conceptually ap~ pealing and convenient in thinking of the different experimental facets as being uncorrelated, and for simplicity of discussion here such orthogonality will be assumed. Even with such a decomposition, however, one can distinguish three types of circumstances as indicated by the rows of Exhibit 1--viz. the all single-degree-of-freedom case (e.g. all main and interaction effects in a two-level factorial experiment), the multiple but nevertheless equal degree-of-freedom case, and lastly the general decomposition situation wherein the different subspaces may have differing dimensionalities or degrees of freedom. Section 3.1 below discusses the probability plotting techniques relevant to the univariate ANOVA cells numbered (I) and (II) in Exhibit 1. While there is a technique that has been proposed by Gnanadesikan and Wilk (1970) for cell (III), the conceptual and computational aspects are far from simple and straightforward so that, for purposes of this handbook, this material is not included. Section 3.2 is addressed to the probability plotting methods relevant to cells (IV) and (V). The blank cell is one for which no method has been proposed to date. Section 2, which follows, consists both of a general introduction to the basics of probability plotting and of specific discussion of particular probability plots that will be used in Section 3.
Response structure Deg.-of-freedom decomposition All single d.f. Multiple equal (v) d.f. Mixed d.f. Univariate (I) (II) (III) Multivariate (IV) (V)

Exhibit 1. Catego1~,zationof orthogonal ANOVA/MANOVA situations.

Graphical method~" for internal comparisons in ANOVA and MANOVA

135

2.

Quantile-quantile ( Q - Q ) probability plots

The essential ideas underlying a Q-Q probability plot can be described in terms of comparing two cumulative distribution functions, /~x(') and @(-), as shown in Exhibit 2. For a specified value of the cumulative probability, p, one can obtain the corresponding percentage points or quantiles, qx(P) from one distribution and qy(p) from the other. A Q-Q probability plot is a plot of the points {qx(P),qy(P)} for a range of chosen
values of p between 0 and 1,

If the two cumulative distribution functions are identical, clearly qy(p)= qx(P) for all p and the configuration of the Q-Q plot will be a straight line with zero intercept and unit slope. If the two distributions being compared differ only in location a n d / o r scale, i.e. Y=I~+aX ( - ~ < / * < + c c , 0 < o < ~ ) , then because such a relationship is also satisfied by the quantiles (i.e. qy(P)-- t~+ oqx(P) for allp) it follows that the configuration of the Q-Q plot in this case will still be linear but with an intercept of/z and a slope o. This linear invariance property of Q-Q plots makes it a particularly attractive practical tool in analyzing data. In the preceding description and discussion the two cumulative distribution functions involved have been theoretical ones drawn as smooth curves in Exhibit 2. In the practical context of using Q-Q plots, however, either or both of the functions can be empirical cumulative distribution functions+

[Exhibit 2 sketches two smooth cumulative distribution functions, F_x and G_y, with cumulative probability on the vertical axis and quantiles on the horizontal axis; for a given p the corresponding quantiles q_y(p) and q_x(p) are marked on the quantile axis.]

Exhibit 2. Definition of Q-Q plots.


An empirical cumulative distribution function for a sample of size n is a step function with steps of height 1/n at each of the ordered values in the sample. Indeed, in the case when both the distribution functions are empirical ones, the corresponding Q-Q plot (sometimes referred to as an empirical Q-Q plot) is a tool for assessing the similarity of the distributions of two samples. The use of an empirical Q-Q plot instead of the more familiar two-sample tests for location and/or scale differences can be more revealing. While location and scale differences will be reflected in the intercept and slope of an empirical Q-Q plot, the presence of more subtle shape differences will be reflected by departures from linearity. In the special case when the two samples are of equal size, the empirical Q-Q plot is merely a plot of the corresponding order statistics, i.e. a plot of the smallest value in the first sample against the smallest value in the second sample, the second smallest in the first sample versus the second smallest in the second, and so on. This is because the ordered values can themselves be considered as the empirical quantiles corresponding to cumulative proportions such as (i − 1/2)/n or i/(n + 1) for i = 1, ..., n, where n is the common sample size. If the two samples are of unequal size (say m and n with m < n), then a convenient convention for making an empirical Q-Q plot is to plot the ordered values of the smaller sample (size m) against the sample quantiles extracted from the larger sample, using the cumulative proportions (i − 1/2)/m or i/(m + 1), i = 1, ..., m, for obtaining these quantiles.

Exhibit 3 is an empirical Q-Q plot of the daily maximum ozone levels observed at two sites, Chester and Bayonne, in New Jersey. The former is a more rural, upwind site while the latter is downwind and much closer in to the New York metropolitan region. To facilitate interpretation the straight line of zero intercept and unit slope is drawn in, and a comparison of the configuration of the points with this line shows that the levels at the more rural site are at least as high as they are in the more urban location! For the most part, the configuration is quite linear and conforms quite closely to the 45° line, so that the two sets of data are not strikingly different with respect to location and scale. There is a very slight indication that the upper tail values at Chester are somewhat larger than the corresponding ones at Bayonne. Also, the presence of several "steps" in the plot suggests a more pronounced quantization effect in the Chester measurements than in the Bayonne ones.
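The construction of an empirical Q-Q plot is easily mechanized. The following sketch (in Python, with numpy and matplotlib assumed available; the function name and arguments are hypothetical, offered purely as an illustration of the convention just described) plots the ordered values of the smaller sample against quantiles interpolated from the larger sample at the cumulative proportions (i − 1/2)/m:

    import numpy as np
    import matplotlib.pyplot as plt

    def empirical_qq(x, y):
        """Empirical Q-Q plot of two samples; x is taken to be the smaller.

        For equal sample sizes this reduces to plotting corresponding order
        statistics; otherwise the quantiles of the larger sample are
        interpolated at the cumulative proportions (i - 1/2)/m defined by
        the smaller sample (of size m).
        """
        x, y = np.sort(np.asarray(x)), np.sort(np.asarray(y))
        m = len(x)
        p = (np.arange(1, m + 1) - 0.5) / m      # plotting positions
        qy = np.quantile(y, p)                   # quantiles of larger sample
        plt.plot(qy, x, "o")
        lo, hi = min(x[0], qy[0]), max(x[-1], qy[-1])
        plt.plot([lo, hi], [lo, hi])             # zero intercept, unit slope
        plt.xlabel("quantiles of larger sample")
        plt.ylabel("ordered values of smaller sample")
        plt.show()

Location and scale differences between the two samples would show up as a shift and tilt of the point configuration relative to the reference line, exactly as discussed above.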


[Exhibit 3 is a scatter plot of the ordered Chester ozone daily maxima against the corresponding quantiles of the Bayonne Trailer ozone daily maxima (ppm), May-September 1974-75, with the line of zero intercept and unit slope drawn in.]

Exhibit 3. Empirical Q-Q plot of daily maximum ozone levels at two New Jersey sites.

In its most widely used form, a Q-Q plot involves comparing an empirical cumulative distribution function (viz. a step function) with a theoretical or specified cumulative distribution function (viz. a smooth curve). For example, one may wish to compare the empirical cumulative distribution function of a data set against the cumulative distribution function, Φ(·), of a standard normal distribution. A so-called normal Q-Q plot of the data can be made for this purpose and consists of plotting the ordered observations against the corresponding quantiles of the standard normal distribution. Specifically, if y_(1) ≤ y_(2) ≤ ··· ≤ y_(n) denote the ordered observations, a normal Q-Q plot is a plot of the n points {q_x(p_i), y_(i)}, i = 1, ..., n, where q_x(p_i) = Φ^{-1}(p_i) and p_i is a cumulative proportion associated with the ith ordered observation. Commonly used values of p_i are (i − 1/2)/n or i/(n + 1), but one can define p_i = (i − a)/(n − 2a + 1) and use any of a variety of specifications for a (e.g. a = 0, which leads to p_i = i/(n + 1); a = 1/2, which yields p_i = (i − 1/2)/n; or a = 1/3, with p_i = (i − 1/3)/(n + 1/3); etc.). For moderately large values of n the different choices for a should not lead to noticeably different configurations of the Q-Q plot. The inverse function of the standard normal, Φ^{-1}, which is needed for obtaining the q_x(p_i), may be computed by using a procedure due to Hastings (1955, p. 192). (See also Appendix I.)


In general, the Q-Q plot of a set of data against the quantiles of any theoretical or specified distribution with distribution function F(·) involves plotting the ith ordered value, y_(i), in the data set against the corresponding quantile q_x(p_i) = F^{-1}(p_i), i = 1, ..., n, with p_i defined in one of the above ways. All that is needed is an algorithm for computing the inverse function, F^{-1}(·), and numerical procedures and computer programs for such a calculation are available for a wide range of distributions including the half-normal, the gamma, the Weibull and the beta distributions. In Section 3 the Q-Q plots used in the contexts of ANOVA and MANOVA will be those employing the quantiles of the normal, half-normal, chi-squared and gamma distributions.

For a family of distributions, such as the normal distribution, which involves only location (or origin) and/or scale parameters, because of the linear invariance property mentioned earlier, all one needs is to be able to compute the quantiles of the standardized member (i.e. location or origin parameter = 0, scale parameter = 1) of the family. On the other hand, if the family involves a shape parameter, as does the gamma or beta family of distributions, then one has to be able to compute the quantiles for each member of the family with a specified shape parameter value. Once again, however, one can disregard any location (or origin) and scale parameters that may be present by taking their values to be, respectively, 0 and 1.

A final comment on Q-Q plots may be in order before proceeding. A somewhat different suggestion for Q-Q plots is to plot the ordered observations against the expected values of order statistics from the specified distribution instead of its quantiles. The main drawback of this version is the computational effort involved in obtaining the required expected values. Moreover, in large samples the difference between the quantiles and the expected values of the order statistics would be small anyway. The computational convenience in using quantiles tends therefore to favor their use in most applications of Q-Q plots.
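Since all of the plots used below share this one recipe (sort the data, compute cumulative proportions, apply an inverse distribution function), a single sketch covers them all. In the sketch below, scipy's ppf routines stand in for the Hastings approximation and the algorithms of Appendix I (Python; the function name and the default a = 1/2 are choices made here, not prescriptions from the text):

    import numpy as np
    from scipy import stats
    import matplotlib.pyplot as plt

    def theoretical_qq(data, dist=stats.norm, a=0.5):
        """Plot ordered data against quantiles F^{-1}(p_i) of `dist`.

        p_i = (i - a)/(n - 2a + 1): a = 1/2 gives (i - 1/2)/n and a = 0
        gives i/(n + 1).  `dist` may be any scipy.stats distribution
        object, e.g. stats.norm, stats.halfnorm, stats.chi2(1),
        stats.gamma(0.5).
        """
        y = np.sort(np.asarray(data))
        n = len(y)
        i = np.arange(1, n + 1)
        p = (i - a) / (n - 2 * a + 1)    # cumulative proportions
        q = dist.ppf(p)                  # quantiles via the inverse cdf
        plt.plot(q, y, "o")
        plt.xlabel("theoretical quantiles")
        plt.ylabel("ordered observations")
        plt.show()

With dist = stats.norm this is the normal Q-Q plot just described; the half-normal, chi-squared and gamma plots of Section 3 are obtained by substituting the corresponding distributions.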

3. Specific Q-Q plots for ANOVA and MANOVA

In this section some specific Q-Q plots will be described that are useful for assessing the relative magnitudes of such things as the treatment effects and the residuals in both uniresponse and multiresponse circumstances. The first subsection treats the uniresponse methods while the second pertains to the multiresponse procedures.

3.1. Univariate (ANOVA) situations

The simplest classical situation of ANOVA is that of the two-sample comparison, and the empirical Q-Q plot described and illustrated in Section 2 is the appropriate technique for this case. The plotting of the quantiles in one sample versus the corresponding quantiles of the second sample enables an assessment of the similarity of the distributions of the two samples without assuming any specific distributional form for the data, and thus the method is distribution free. For the more complex ANOVA situations considered in this section, however, some distributional assumptions will be made in generating the relevant Q-Q plots, but the configurations of the plots will themselves provide checks on the appropriateness of the assumed distributional models.

3.1.1. Q-Q plots for Cell I of Exhibit 1

The prototype of the ANOVA situation in Cell I of Exhibit 1 is the two-level factorial experiment involving a single response variable, Y, and Daniel (1959) was the first to propose a Q-Q plot for this case. In a 2^N (= n) factorial experiment all (n − 1) of the main effects and interactions are single-degree-of-freedom contrasts, and the decomposition of n-space into these mutually orthogonal contrasts is standard, meaningful and not arbitrary. For this reason the description of the techniques in this section will be in the context of such factorial experiments. The methods can, however, be used to assess any meaningfully defined set of mutually orthogonal single-degree-of-freedom contrasts. For example, a set of ANOVA residuals (viz. observation − fitted value as defined in Chapter 25) may be considered as approximately single-degree-of-freedom quantities and analyzed by the Q-Q plotting techniques to be described in this section. The full set of residuals are only approximately single-degree-of-freedom quantities because they are not independent, in that subsets of them might add to zero, but for situations involving factors each with a moderate number of levels this may not cause any serious difficulties.

At any rate, if y' = (y_1, ..., y_n) denotes the n (= 2^N) observations on the single response variable, Y, in a 2^N factorial experiment, the ANOVA basically involves a linear transformation to a set of n experimentally meaningful quantities, x = R·y, where R is an n × n matrix whose first row consists of all 1's and whose remaining rows consist half of +1's and half of −1's, thus defining the usual single-degree-of-freedom contrasts associated with the main effects and interactions of the N treatment factors (see Chapter 25). To make the transformation an orthogonal one, we can utilize R_o = (1/√n)R. Also, to obtain the usual definitions of the different effects we could multiply the first row of R by 1/n, thus obtaining the first element of x as the mean of the observations, and multiply the remaining (n − 1) rows of R by 2/n, thus obtaining the main effects and the different interaction effects due to the treatment factors. The matrix, R_E, of this transformation is thus related to R by R_E = D·R, where D is an n × n diagonal matrix whose first diagonal element is 1/n and whose remaining diagonal elements are 2/n. A useful conceptualization of what the last (n − 1) rows of R_E accomplish is that each of these rows partitions the n observations into two sets of n/2 observations, and then determines a specific treatment effect as the distance between the centroids (arithmetic means) of the two sets. For example, a row of R_E that corresponds to the main effect of a certain factor would partition the observations into two equal sets, with one set containing the n/2 observations which have the particular factor at its lower level and the other set consisting of the remaining n/2 observations with that factor at its higher level; and the main effect itself would just be the difference between the means of the latter and former sets. For present purposes, it is sufficient to notice that whether R, R_o, or R_E is used for the transformation, the last (n − 1) elements of x are mutually orthogonal single-degree-of-freedom contrasts. For simplicity and definiteness, the orthogonal version is utilized and the transformation is rewritten as
    x = (m, x_1, x_2, ..., x_{n-1})' = R_o·y,                (1)

where m = (√n)ȳ and x_1, ..., x_{n-1} are constant multiples (viz. √n/2) of the usual main and interaction effects. Internal comparisons amongst x_1, ..., x_{n-1}, or some subset of them, are of prime interest.

As null assumptions for generating methodology, if y_1, ..., y_n are assumed to be uncorrelated and normally distributed with a common variance σ², then because of the orthogonality of R_o it follows that m, x_1, ..., x_{n-1} would also be uncorrelated and distributed normally with the same variance σ². In fact, because of the "averaging" involved in the transformation, even if the initial observations are non-normal one would expect that m, x_1, ..., x_{n-1} would be more normally distributed. At any rate, with an assumption of normality for x_1, ..., x_{n-1} and an added null assumption of there being no real treatment effects at all in the experiment, one can consider x_1, ..., x_{n-1} as a random sample from a normal distribution with mean 0 and variance σ². The null assumptions are intended as straw men, merely to provide a backdrop against which departures from the null model can be detected and studied.

Specifically, one can make a normal Q-Q plot of the x_i. That is, one can order the single-degree-of-freedom contrasts to obtain x_(1) ≤ x_(2) ≤ ··· ≤ x_(n-1) and plot the (n − 1) points whose coordinates are {Φ^{-1}(p_i), x_(i)}, where p_i is a cumulative probability value (e.g. (i − 1/2)/(n − 1)) and Φ^{-1} is the inverse cumulative distribution function of the standard normal distribution. If the interest in internal comparisons is confined to a subset of L (< n − 1) of the treatment effects, then the above steps are applied to the corresponding L contrasts, using L in place of (n − 1) in the various definitions. If the data conform to the null model, the configuration will be linear with zero intercept and a slope which would yield an estimate of the unknown σ, and except for this last feature one would have an uninteresting experimental data set! More realistically, perhaps some of the treatments would have real effects while others would not. In this case, the points in the lower left corner of the plot or in the upper right corner of the plot will tend to correspond to the real effects and will depart from a linear configuration indicated by the points in the middle, which would all tend to correspond to effects that do not depart from null assumptions. Thus, the appropriate error configuration (the middle of the normal Q-Q plot) has been generated from the data itself, and the departures from null conditions emerge as departures from such an error configuration.

Although a normal Q-Q plot of the (signed) contrasts is very useful and revealing, if one were interested primarily in assessing the relative magnitudes of the treatment effects it would be natural to consider either the absolute or the squared values of the contrasts. Indeed, if |x_1|, ..., |x_{n-1}| denote the absolute contrast values, Daniel (1959) suggested ordering these to obtain |x|_(1) ≤ |x|_(2) ≤ ··· ≤ |x|_(n-1) and plotting them against the corresponding quantiles of a half-normal distribution, i.e., a standard normal folded about its mean, so that the density in question would be

    h(u) = √(2/π) e^{-u²/2},    u ≥ 0.                (2)

A half-normal plot of the (n − 1) absolute contrasts is thus a plot of the (n − 1) points whose coordinates are {H^{-1}(p_i), |x|_(i)}, where p_i is a cumulative probability such as (i − 1/2)/(n − 1) and H^{-1} is the inverse of the cumulative distribution function for the half-normal distribution defined by (2). Again, if the interest is in internal comparisons of magnitudes of a subset of L (< n − 1) of the effects, the above steps are used with the absolute values of the L contrasts of interest, and L replaces (n − 1) in the descriptions.

For internal comparisons of relative magnitudes, two procedures entirely equivalent to the half-normal Q-Q plot of absolute contrasts would be Q-Q probability plots of the squared contrasts against the quantiles of either a chi-squared distribution with 1 degree of freedom or a standard gamma distribution with shape parameter 1/2. That is, the ordered squared contrasts, x²_(1) ≤ ··· ≤ x²_(n-1), may be plotted against the quantiles of the distribution whose density is either

    c(u) = (1/√(2π)) u^{-1/2} e^{-u/2},    u ≥ 0,        (3)

or

    g(z) = (1/Γ(1/2)) z^{-1/2} e^{-z},    z ≥ 0.         (4)
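To make the mechanics concrete, the following sketch computes the n − 1 orthogonal contrasts of eq. (1) for an unreplicated two-level factorial and produces Daniel's half-normal plot (Python; building the ±1 matrix R by a Kronecker-product recursion is one convenient construction assumed here, and the observations must be supplied in the matching standard order):

    import numpy as np
    from scipy import stats
    import matplotlib.pyplot as plt

    def half_normal_plot(y):
        """Daniel's half-normal plot for an unreplicated 2^N factorial.

        y: the n = 2^N observations in standard (Yates) order; n must be
        a power of two.  R_o = R/sqrt(n) is orthogonal, and the first
        element of R_o y is sqrt(n) times the mean.
        """
        y = np.asarray(y, dtype=float)
        n = len(y)
        R = np.array([[1.0]])
        while R.shape[0] < n:                            # Kronecker build-up of
            R = np.kron(np.array([[1, 1], [1, -1]]), R)  # the +/-1 contrast matrix
        x = (R / np.sqrt(n)) @ y
        contrasts = np.sort(np.abs(x[1:]))               # n-1 absolute contrasts
        i = np.arange(1, n)
        q = stats.halfnorm.ppf((i - 0.5) / (n - 1))      # H^{-1}((i - 1/2)/(n-1))
        plt.plot(q, contrasts, "o")
        plt.xlabel("half-normal quantiles")
        plt.ylabel("ordered absolute contrasts")
        plt.show()

The equivalent chi-squared and gamma versions of the plot are obtained by replacing the last few lines with the squared contrasts plotted against stats.chi2(1).ppf or stats.gamma(0.5).ppf of the same cumulative proportions.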

Since interpreting configurations on Q-Q plots is the essence of their usefulness as data analytic tools, it would be appropriate to describe a few canonical patterns and their interpretations. Exhibit 4 shows a series of schematic plots representing different possible patterns of configurations on a half-normal Q-Q plot of absolute contrasts. If all contrasts conform to null assumptions (an unlikely and uninteresting situation!), the configuration would be linear as in panel (a) of Exhibit 4. The intercept would be zero and the slope would be an estimate of the error standard deviation σ. (Note: On a χ²_(1) Q-Q plot of the squared contrasts the slope would be an estimate of the error variance σ², and on a gamma Q-Q plot of these the slope would estimate 2σ².) Next, if a few of the contrasts correspond to real treatment effects, then the configuration would depart from linearity in the style of panel (b) in Exhibit 4. The two points in the top right corner of this picture are clearly larger than what they would be expected to be from an extrapolation of the linear configuration in the lower left part of the picture. The interpretation would be that the two treatment effects that correspond to these points are real. Labelling the points (especially in the top right corner) by the treatment effects can be a quick way of pinpointing real effects. In the schematic representation, for instance, the two top deviant points are labelled (arbitrarily in this case) as the main effects of factors B and D. The linear configuration in the lower left corner of panel (b) is appropriately regarded as an "error" configuration: its intercept is zero and the slope is an estimate of the error standard deviation.


[Exhibit 4 consists of six schematic half-normal Q-Q plots: (a) a null linear configuration; (b) real effects, with two deviant points labelled B and D; (c) heteroscedasticity, with two intersecting linear pieces; (d) one outlier, giving a non-zero intercept; (e) two outliers, giving two shifted linear pieces; and (f) non-normality, giving curvature throughout.]

Exhibit 4. Schematic plots of patterns on half-normal Q-Q plots.

A standard assumption in ANOVA, which has also been employed as part of the null model discussed above, is the one of homoscedasticity or common variance (σ²) for all the observations. This assumption often goes unchecked in many formal uses of ANOVA. What if it is not valid for a particular body of data? As long as the observations are uncorrelated, the contrasts (the x_i) defined by eq. (1) will, in fact, have the same variance even if the initial observations have different variances, and a study of the residuals in a variety of ways (including simple plots of them against the fitted values as well as normal probability plots) would be helpful for identifying such heteroscedasticity. One type of heteroscedasticity that is particularly important in analysis of variance situations is the possible presence of more than one underlying error variance. If this were the case in a two-level factorial experiment, the contrasts would not all have the same variance, and this would affect the configuration on a half-normal Q-Q plot of their absolute values. For example, if the configuration is suggestive of two intersecting straight lines as in panel (c) of Exhibit 4, a reasonable interpretation would be that the contrasts belonging to the same linear piece have an underlying common variance, but those that belong to the two different pieces do not share a common variance. The essential idea is that the "slope" in a Q-Q plot is a reflection of the scale (or standard deviation) of the things plotted. However, one may wonder why the pattern of panel (c) isn't interpreted as an extension of the one in panel (b), viz. that all the effects corresponding to the points on the line of steeper slope are real effects. This could, of course, be an explanation too. However, since real effects would show up as shifts that are unlikely to conform smoothly to a straight line configuration, one would expect the pattern of departures in the top right part of the picture to be more ragged than the smooth type of pattern exhibited in panel (c). Nevertheless, in a real world situation, further study of the membership of the linear pieces and thinking about the experimental context of the data would be the appropriate courses of action in deciding which of the interpretations is more reasonable. The main achievement of the Q-Q plot is that it has pointed up a deviation and helps develop some insights into unanticipated peculiarities in the data.

Real data are often likely to contain at least a few outliers. With unstructured data, especially in small samples, such outliers may be easy to detect as the ones that "stick out" at either end of the sorted data. With structured data, such as those involved in ANOVA, the identification of outliers becomes more difficult, while their influence on the usual analyses can be unduly large. It would, therefore, be natural to wonder what the effect of an outlier from a two-level factorial experiment would be on a half-normal Q-Q plot of the absolute contrasts. Panel (d) shows the configuration that would result in the presence of a single extreme observation. The main feature to focus on in this schematic picture is the non-zero intercept. Remembering from the definition of the single-degree-of-freedom contrasts that each observation appears in every contrast with a coefficient of either +1/√n or −1/√n, and that the outlying observation may be considered as introducing a major bias or shift, the effect of the outlier would be to bias every contrast, moving all of them away from zero. Since one is looking at the absolute values of such contrasts in a half-normal plot, the result would be the positive bias of even the smallest absolute contrasts, thus inducing a positive intercept as in panel (d).

What if there were two initial observations that are outliers of about the same magnitude? Such an outlier pattern may not be a common occurrence, but it is still interesting to raise the question. Going back to the definition of the contrasts again, it is easy to figure out that the outliers will appear with the same sign in half the contrasts (thus biasing these considerably) and will appear in the remaining half of the contrasts with opposite signs (thus cancelling out each other's bias). The half-normal plot would then appear as in panel (e) of Exhibit 4, with the essentially "unbiased" contrasts defining the lower linear piece and the remaining half of them conforming to the shifted upper piece.

The final panel in Exhibit 4 demonstrates, again in a schematic way, the effect of bad non-normality in the data. As stated earlier, enhanced normality would be expected for the contrasts and, from a practical viewpoint, it is only when the initial observations are so badly non-normal that even the contrasts are still quite non-normally distributed that one would need to be concerned about corrective action. When the contrasts are non-normal, or equivalently the absolute contrasts are not half-normal, the half-normal Q-Q plot will not be linear even at the lower end but will exhibit curvature throughout. The type of curvature in panel (f), for instance, might suggest that the observations are lognormally distributed and that a transformation may be appropriate.

The canonical patterns in Exhibit 4 have all been discussed in terms of half-normal Q-Q plots, but the same discussion applies to the χ²_(1) and gamma (with shape parameter 1/2) Q-Q plots of the squared contrasts, and also to other Q-Q plotting techniques discussed in the later sections of this article. To simplify the discussion of how to interpret Q-Q plots in ANOVA and MANOVA situations, the schematic patterns in Exhibit 4 were highly stylized and the different sources of non-null patterns were isolated for individual exposition. In practice the departures from null conditions may occur due not to single but to multiple causes. The above discussion of the typical patterns can be of value in understanding and disentangling the patterns in such complex real situations as well.

The next example, taken from Daniel (1959), illustrates the use of half-normal Q-Q plots. The data are from a 2^5 factorial on penicillin production described by Davies (1956), and the second column of Exhibit 5 shows the ordered values of the 31 absolute contrasts as given by Daniel (1959). (Note: The values are constant multiples of the contrasts as defined by eq. (1) and they have been rounded off.) The labels of the corresponding treatment effects are shown in the first column of Exhibit 5. The third column is a listing of the fractions (i − 1/2)/31, i = 1, ..., 31, while the last column gives the corresponding quantiles of the half-normal distribution using a computerized numerical algorithm for the purpose (see Appendix I). The first number in the last column, for example, satisfies the equation

    √(2/π) ∫_0^{0.020} e^{-u²/2} du = 0.0161.

A half-normal Q-Q plot is obtained by plotting the 31 points whose coordinates are the corresponding values in columns 2 and 4 of Exhibit 5, and Exhibit 6a shows such a plot.

    Original          Ordered        Probability    Theoretical
    identification    observation                   quantile
    ABC                 0.00          0.0161         0.020
    AE                  2.00          0.0484         0.061
    CD                  4.00          0.0806         0.101
    B                   6.00          0.1129         0.142
    BD                  7.00          0.1452         0.183
    D                   9.00          0.1774         0.224
    CDE                12.00          0.2097         0.265
    ABDE               14.00          0.2419         0.308
    BCDE               16.00          0.2742         0.350
    BCD                18.00          0.3065         0.394
    ADE                21.00          0.3387         0.438
    ABE                22.00          0.3710         0.483
    BDE                28.00          0.4032         0.529
    BE                 29.00          0.4355         0.576
    DE                 30.00          0.4677         0.624
    ABCE               31.00          0.5000         0.674
    ACD                33.00          0.5323         0.726
    ABD                34.00          0.5645         0.780
    BCE                39.00          0.5968         0.836
    ACDE               47.00          0.6290         0.895
    BC                 53.00          0.6613         0.957
    AC                 53.00          0.6935         1.023
    AD                 54.00          0.7258         1.093
    ACE                58.00          0.7581         1.170
    ABCD               58.00          0.7903         1.255
    AB                 64.00          0.8226         1.349
    ABCDE              77.00          0.8548         1.457
    CE                 93.00          0.8871         1.586
    C                 153.00          0.9194         1.747
    A                 190.00          0.9516         1.974
    E                 224.00          0.9839         2.406

Exhibit 5. Table of ordered values of 31 contrasts in a 2^5 experiment and quantiles of the half-normal distribution (Davies, 1956; Daniel, 1959).

A reasonable interpretation of this configuration is that the main effects E, A and C are real. In order to study the remaining contrasts more appropriately, it is a good idea to replot them on a re-scaled half-normal Q-Q plot. What is done to re-scale is to use new cumulative probabilities (i − 1/2)/28, i = 1, ..., 28, and to recompute the quantiles of the half-normal distribution corresponding to these fractions. Exhibit 6b is a replot of the remaining 28 absolute contrasts against the thus recomputed quantiles. (Note that the value 93 for CE is plotted against 2.369 rather than 1.586 as in Exhibit 6a.)
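The quantile computations behind Exhibits 5, 6a and 6b are easy to check numerically; the short sketch below recomputes them with scipy's half-normal inverse cdf (small rounding differences from the Hastings (1955) approximation used originally are possible):

    import numpy as np
    from scipy import stats

    # Quantiles of Exhibit 5: H^{-1}((i - 1/2)/31), i = 1, ..., 31.
    i31 = np.arange(1, 32)
    q31 = stats.halfnorm.ppf((i31 - 0.5) / 31)
    print(round(q31[0], 3), round(q31[-1], 3))   # 0.020 and 2.406

    # Re-scaled quantiles for the replot of the remaining 28 contrasts.
    i28 = np.arange(1, 29)
    q28 = stats.halfnorm.ppf((i28 - 0.5) / 28)
    print(round(q28[-1], 3))                     # 2.369, as quoted for CE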

[Exhibit 6a: half-normal probability plot of the 31 ordered absolute contrasts against the theoretical half-normal quantiles; the points labelled E, A and C in the upper right corner break away from the linear configuration, with CE, ABCDE and AB the next largest. 31 points on the plot, 6 identifications on the plot.]

Exhibit 6a. Half-normal Q-Q plot corresponding to Exhibit 5 (Daniel, 1959).


[Exhibit 6b: half-normal probability plot of the remaining 28 ordered absolute contrasts against the recomputed quantiles; the labelled points CE, ABCDE and AB are the largest values but conform reasonably to the linear configuration. 28 points on the plot, 3 identifications on the plot.]

Exhibit 6b. Replot of Exhibit 6a.


The configuration in Exhibit 6b may be interpreted as sufficiently null, and no other departures are uncovered. The idea of replotting just illustrated is a useful procedure and should be used with all of the techniques described in this article. Several other interesting examples of the use of half-normal plots are given by Daniel (1959).
3.1.2. Q-Q plots for Cell II of Exhibit 1

A prototype of this situation is a multiway cross-classification or table with ν + 1 replications within each cell. If there are k cells, then the ANOVA leads to k within-cell error mean squares, s²_1, ..., s²_k, each with ν degrees of freedom. Relative magnitude comparisons amongst these k mean squares would be of interest. Another example of this cell of Exhibit 1 is the internal comparison of the relative magnitudes of all the main effects in an s^m factorial experiment, since the m mean squares corresponding to the main effects would all be based on ν = (s − 1) degrees of freedom. In fact, the methods to be described here may be utilized in any situation which involves the comparison of several ANOVA mean squares, each based on ν degrees of freedom and associated with mutually orthogonal facets of the experiment.

Returning to the prototype mentioned in the first paragraph of this section, the null assumptions for assessing the relative magnitudes of s²_1, ..., s²_k are that these mean squares be considered as a random sample from a central chi-squared distribution with ν degrees of freedom. The appropriate Q-Q plot is thus a plot of the ordered values s²_(1) ≤ s²_(2) ≤ ··· ≤ s²_(k) against the quantiles of either a χ²_(ν) distribution with density

    c_ν(u) = u^{ν/2-1} e^{-u/2} / (2^{ν/2} Γ(ν/2)),    u ≥ 0,        (5)

or a standard gamma distribution with shape parameter ν/2 whose density is

    g_{ν/2}(z) = z^{ν/2-1} e^{-z} / Γ(ν/2),    z ≥ 0.                (6)

Specifically, the chi-squared Q-Q plot will be a plot of the k points whose coordinates are {C_ν^{-1}(p_i), s²_(i)}, i = 1, ..., k, where p_i is the usual cumulative probability and C_ν^{-1} denotes the inverse of the cumulative distribution function of the χ²_(ν) distribution defined by eq. (5). Similarly, the equivalent gamma Q-Q plot is a plot of {G_{ν/2}^{-1}(p_i), s²_(i)}, i = 1, ..., k, where G_{ν/2}^{-1} is the inverse cumulative distribution function associated with eq. (6). Numerical methods for computing the inverse functions, C_ν^{-1} and G_{ν/2}^{-1}, exist (see Goldstein, 1973; Wilk et al., 1962a; and Appendix I).
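A minimal sketch of this chi-squared plot, in the same style as the earlier ones (Python; the function name is hypothetical, and the slope remark in the comment follows from each mean square being distributed as σ²χ²_(ν)/ν under the null model):

    import numpy as np
    from scipy import stats
    import matplotlib.pyplot as plt

    def mean_square_qq(mean_squares, nu):
        """Chi-squared Q-Q plot of k ANOVA mean squares, each on nu d.f.

        Under the null model the points should be linear through the
        origin, with slope proportional to the common error variance.
        """
        s2 = np.sort(np.asarray(mean_squares))
        k = len(s2)
        p = (np.arange(1, k + 1) - 0.5) / k      # (i - 1/2)/k
        q = stats.chi2(df=nu).ppf(p)             # C_nu^{-1}(p_i)
        plt.plot(q, s2, "o")
        plt.xlabel("chi-squared(%d) quantiles" % nu)
        plt.ylabel("ordered mean squares")
        plt.show()

Applied to the 52 variances of Exhibit 7 with nu = 4, this essentially reproduces Exhibit 8, including the aberrant topmost point.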

    Probability    Ordered observation    Theoretical quantile
    0.010                62.0                    0.291
    0.029                81.0                    0.524
    0.048                89.0                    0.695
    0.067               115.0                    0.842
    0.087               131.0                    0.976
    0.106               135.0                    1.100
    0.125               152.0                    1.219
    0.144               173.0                    1.333
    0.163               189.0                    1.444
    0.183               235.0                    1.552
    0.202               239.0                    1.659
    0.221               255.0                    1.765
    0.240               305.0                    1.870
    0.260               341.0                    1.975
    0.279               409.0                    2.079
    0.298               422.0                    2.184
    0.317               426.0                    2.289
    0.337               445.0                    2.395
    0.356               489.0                    2.502
    0.375               491.0                    2.610
    0.394               509.0                    2.720
    0.413               518.0                    2.831
    0.433               525.0                    2.944
    0.452               531.0                    3.059
    0.471               551.0                    3.176
    0.490               593.0                    3.296
    0.510               601.0                    3.418
    0.529               605.0                    3.544
    0.548               615.0                    3.674
    0.567               747.0                    3.807
    0.587               790.0                    3.945
    0.606               831.0                    4.088
    0.625               839.0                    4.236
    0.644               845.0                    4.390
    0.663               845.0                    4.551
    0.683               849.0                    4.719
    0.702               951.0                    4.897
    0.721              1039.0                    5.083
    0.740              1077.0                    5.281
    0.760              1098.0                    5.493
    0.779              1107.0                    5.719
    0.798              1123.0                    5.963
    0.817              1141.0                    6.229
    0.837              1146.0                    6.521
    0.856              1325.0                    6.846
    0.875              1501.0                    7.214
    0.894              1524.0                    7.638
    0.913              1707.0                    8.141
    0.933              1781.0                    8.763
    0.952              1969.0                    9.583
    0.971              2194.0                   10.805
    0.990              3981.0                   13.367

Exhibit 7. Table of 52 ordered variances and quantiles of the χ²_(4) distribution (Hald, 1952).

[Exhibit 8: plot of the 52 ordered variances against the theoretical quantiles of chi-squared on 4.000 degrees of freedom; the topmost point departs markedly from the linear configuration of the remaining 51. 52 points on the plot.]

Exhibit 8. Chi-squared Q-Q plot corresponding to Exhibit 7 (Wilk et al., 1962a).


The next example, taken from Wilk et al. (1962a), utilizes data from Hald (1952) and entails an internal comparison of 52 variances each based on four degrees of freedom. Exhibit 7 lists the cumulative fractions (i − 1/2)/52, the ordered variances and the corresponding quantiles of a chi-squared distribution with 4 degrees of freedom. Exhibit 8 is the chi-squared Q-Q plot obtained by plotting the values in column 2 of Exhibit 7 against the corresponding values in column 3. The point in the top right corner of the plot is seen to depart from the linear configuration of the remaining 51 points, thus suggesting that one of the variances is too large relative to the others. In this example, even without replotting (see discussion in Section 3.1.1) one can conclude that the remaining 51 variances are reasonably homogeneous.

3.2. Multivariate (MANOVA) situations

Clearly, one approach to analyzing observations on p response variables from n experimental units organized according to some experimental design is to ignore the intercorrelations amongst the variables and carry out p separate univariate analyses. This is often a wise and useful strategy and, depending on the decomposition of n-space specified by the experimental design, one can use the appropriate techniques discussed in Section 3.1 with each of the p variables. In this section, however, techniques appropriate to Cells IV and V of Exhibit 1 will be described. These involve simultaneous analysis of more than one response variable and should be viewed as supplements to the separate univariate analyses.

3.2.1. Q-Q plots for Cell IV of Exhibit 1

As in Section 3.1.1, the prototype of the experimental design for this situation will be taken as a 2^N factorial, with the difference that on each of the n (= 2^N) experimental units one has a p-dimensional vector observation. If the rows of the n × p matrix Y' are the p-dimensional observations, then

    Y' = (y_1, y_2, ..., y_n)' = [Y_1, Y_2, ..., Y_p],        (7)

so that Y_j, the jth column of Y', consists of the n observations on the jth variable, j = 1, ..., p. One can define a transformation

    (m, x_1, x_2, ..., x_{n-1})' = R_o·Y' = [(m_1; X_1), (m_2; X_2), ..., (m_p; X_p)],        (8)
where R_o is the same n × n orthogonal matrix that was involved in eq. (1), m' = (√n)ȳ' = (√n)(ȳ_1, ȳ_2, ..., ȳ_p), and the (n − 1) elements of the column vector X_j are just the (n − 1) single-degree-of-freedom contrasts derived from the observations on the jth variable, j = 1, ..., p. Separate univariate analyses of the elements of the X_j may be carried out by the methods of Section 3.1.1. The present focus is on analyzing the collection of vectors x_i, i = 1, ..., n − 1, whose p elements are the single-degree-of-freedom contrasts with respect to the p variables, which presumably are intercorrelated.

The vectors x_i, to be called single-degree-of-freedom contrast vectors, may be conceptualized as (n − 1) vectors emanating from the origin, with each of them being associated with a specific main or interaction effect in the factorial experiment. For example, in a 2^3 factorial experiment whose three factors are labelled A, B and C, if there is a bivariate response (i.e., p = 2) involved, one may have a configuration of seven two-dimensional contrast vectors as shown schematically in Exhibit 9. The vectors in this display are labelled by the treatment effects associated with each of them. The coordinates of each vector are proportional to the univariate treatment effects for a particular main or interaction effect associated with that vector. For example, the abscissa value of the vector labelled A in Exhibit 9 would be proportional to the main effect of factor A with respect to the response variable Y_1, while the ordinate value would be proportional to the main effect of A for Y_2. Recalling the earlier discussion in Section 3.1.1 of the accomplishments of the transformation involved in going from the initial observations to the (n − 1) main and interaction effects, the contrast vectors x_i may be conceptualized as vectors joining the centroids of partitions of the p-dimensional data, there being (n − 1) different partitionings corresponding to the (n − 1) effects.

If one were interested in the relative magnitudes of the effects, one needs a way of measuring the distance between pairs of centroids of certain partitions of the data, or equivalently one needs a measure of size of the vectors x_i, i = 1, ..., n − 1. For example, in Exhibit 9, discarding the orientations of the seven vectors, one looks for a measure of their sizes that will enable a comparison of the relative magnitudes of the seven treatment effects.

[Exhibit 9 is a schematic display of seven contrast vectors emanating from the origin in the (Y_1, Y_2) plane, each labelled by one of the seven treatment effects of a 2^3 experiment.]

Exhibit 9. Schematic display of contrast vectors from a 2^3 experiment involving two response variables.

A key difference between the univariate and multivariate cases is that there is no single or unique measure for specifying the size of the vectors x_i. One useful class of measures of size is the class of positive semi-definite (p.s.d.) quadratic forms d = x'Ax. The p.s.d. matrix A can be used to reflect differing variances of, and intercorrelations among, the p response variables, and this degree of flexibility in a measure of size is extremely important from a data analysis viewpoint. More will be said later about the choice of A but, for the present, given a specific choice, the problem of internal comparisons of the relative magnitudes of the x_i is reduced to the unidimensional problem of assessing the relative magnitudes of the nonnegative quantities d_i = x'_i A x_i, i = 1, ..., n − 1, which are squared distances between centroids of certain partitions of the initial data.


Basically what is desired is a Q-Q plot for the d_i, and this in turn implies a need for a null distributional model against which the d_i may be evaluated. If one were to assume that the initial observations y_i are independent p-variate normal observations with a common unknown covariance matrix Σ, then the x_i will also be p-variate normally distributed with the same covariance matrix Σ. Furthermore, as an extreme null scenario, if one were to assume that none of the treatment effects are real, then the x_i may be considered as a random sample from N_p[0, Σ], where 0 is the null vector. Exactly as in the univariate case of Section 3.1.1, even if the initial observations are non-normal, the contrast vectors would be expected to be more multinormally distributed because of the averaging involved in the transformation R_o of eq. (8).

If x is N_p[0, Σ], then it is well known that the distribution of d = x'Ax is the distribution of the linear combination Σ_{j=1}^{r} λ_j χ²_j(1), where r is the rank of A, the λ_j are the positive eigenvalues of AΣ, and the χ²_j(1) are independent central chi-squared variates with 1 degree of freedom each. Although exact, this is not a very useful result in practice since, among other things, Σ is usually unknown. However, an equally well-known (e.g., Satterthwaite, 1941; Patnaik, 1949; Box, 1954) approximation to the distribution of a linear combination of independent chi-squared variates can be used. The essential result is that the distribution of d may be approximated by a gamma distribution with scale parameter λ and shape parameter η, with both λ and η to be estimated. In the context of the present problem, the required null distributional model therefore is that d_i = x'_i A x_i, i = 1, ..., n − 1, may be considered as a random sample from a distribution whose density is

    g(d; λ, η) = (λ^η / Γ(η)) d^{η-1} e^{-λd},    d ≥ 0,        (9)

where λ and η are both positive and need to be estimated from the d_i. Given appropriate estimates λ̂ and η̂, the Q-Q plot here would involve ordering the d_i to obtain d_(1) ≤ d_(2) ≤ ··· ≤ d_(n-1) and plotting the (n − 1) points whose coordinates are {G^{-1}(p_i; λ̂, η̂), d_(i)}, where p_i is a cumulative fraction such as (i − 1/2)/(n − 1) and G^{-1} denotes the inverse of the cumulative distribution function associated with the density in eq. (9). Since λ̂ is a scale parameter estimate and hence only affects the slope of a Q-Q plot (see the earlier discussion in Section 2), in practice one can plot {G^{-1}(p_i; 1, η̂), d_(i)}, i.e., the ordered d_(i) are plotted against the corresponding quantiles of a standard gamma distribution (viz. λ = 1) with shape parameter η̂. Numerical methods for computing the required quantiles, G^{-1}(p_i; 1, η̂), are available (see Goldstein, 1973; Wilk et al., 1962a; and Appendix I).


Issues in and approaches to estimating λ and η need to be considered next. One approach would be a trial-and-error one of experimenting with different values of λ and η until one obtains a set which essentially straightens out (as much as possible) the lower end of the gamma Q-Q plot of the ordered d_i. With access to good numerical and graphical computing facilities this could indeed be a very effective way of obtaining the appropriate Q-Q plot. On the other hand, one may want to avoid, or at least reduce, the number of iterations involved in such a process by using some systematic procedure for estimating λ and η. One such systematic approach is maximum likelihood estimation based on an order statistics formulation, and the steps involved, as well as a rationale for them, are discussed next.

The null assumptions stated above are once again intended to serve as a backdrop against which interesting departures can be detected. Ideally, therefore, one would like to determine the fitted gamma distribution only from those d_i which conform to the null assumptions, so that the non-null d_i will exhibit themselves as departures. Specifically, if some of the treatment effects are real, so that the contrast vectors x_i associated with these effects cannot be considered as having a null mean vector, the d_i corresponding to these effects would still be distributed as linear combinations of independent chi-squared variates, except that now these will be non-central chi-squared variates. Unfortunately, the gamma distribution approximation mentioned earlier is also applicable to this situation which involves non-central χ²'s! Hence the inclusion of all the d_i, including possibly those that reflect real treatment effects, in estimating λ and η would tend to obscure the presence of the interesting non-null structure in the data. Since the ideal situation of basing the estimates on only those d_i that conform to null assumptions is unachievable in practice, because one cannot identify such d_i with certainty, the best that one can do is to try to obtain some insurance against the disastrous effects of including non-null d_i in the estimation. Intuitively, since relative magnitudes are the focus of interest, one feels that the small d_i are more likely to conform to null assumptions than the large ones. With this in mind, one orders the d_i to obtain

    d_(1) ≤ d_(2) ≤ ··· ≤ d_(n-1)                (10)

and then judgmentally specifies a number K (< n − 1) as the number of d_i which are likely to conform to the null assumptions. Next, as additional insurance, one uses only the M (< K) smallest d_i, considered as the M smallest order statistics in a sample of size K, for determining the estimates.


There are two conflicting statistical aspects that one is trying to keep in balance via this order statistics formulation: if M is much smaller than the "true" number of d_i that conform to null assumptions, the estimates would not be statistically efficient; on the other hand, if M is chosen to be too large, so that the actual values of some non-null d_i end up influencing the estimates, then these would be seriously biased. Paying a small price in efficiency to avoid the major problem of bias seems worthwhile in the present context and, as a general rule of thumb, choosing values of K/M ≥ 1.5 seems reasonable.

Using the order statistics formulation in the preceding paragraph, the method of maximum likelihood is a relatively straightforward way of obtaining the estimates λ̂ and η̂, to be termed mle (maximum likelihood estimates). The mle may be obtained either by using computer programs of the numerical methods described in Wilk et al. (1962b) (see Appendix II) or from tables therein. The mle are functions of K/M, d_(M) and the two summary statistics

    P = [ ∏_{i=1}^{M} d_(i) ]^{1/M} / d_(M)    and    S = [ Σ_{i=1}^{M} d_(i) ] / (M·d_(M)),

where the d_(i), i = 1, ..., M, are the M smallest d_i as denoted in (10). For specified values of K/M, P and S, Wilk et al. (1962b) tabulate values of t̂ (= η̂/λ̂d_(M)) and η̂, and the mle λ̂ is thus obtained by dividing the tabulated value of η̂ by the product of the known value of d_(M) and the tabulated value of t̂.

The technique for Cell IV of Exhibit 1 thus consists of:
(i) computing the single-degree-of-freedom contrast vectors, x_i, i = 1, ..., n − 1;
(ii) choosing a p × p compounding matrix A that is p.s.d. and calculating d_i = x'_i A x_i, i = 1, ..., n − 1;
(iii) ordering the d_i to obtain d_(1) ≤ d_(2) ≤ ··· ≤ d_(n-1) and using the M smallest of these as the M smallest order statistics in a sample of size K to obtain the mle λ̂ and η̂ of the scale and shape parameters, respectively, of a gamma distribution (Note: the choice of M and K is left to judgment); and
(iv) plotting the (n − 1) points whose coordinates are {ẑ_i, d_(i)}, where the quantile ẑ_i of a standard gamma distribution with shape parameter η̂ satisfies the equation

    ∫_0^{ẑ_i} u^{η̂-1} e^{-u} du = Γ(η̂)·p_i,

and p_i is a cumulative fraction such as (i − 1/2)/(n − 1). Once again, if one is interested in internal comparisons amongst a subset of L (< n − 1) of the treatment effects, the only change needed in the preceding description is to replace (n − 1) by L. (A sketch of steps (iii) and (iv) in code follows.)
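As promised above, here is a sketch of steps (iii) and (iv) (Python). In place of the tables of Wilk et al. (1962b), the mle of (λ, η) is obtained here by direct numerical maximization of the likelihood of the M smallest order statistics in a sample of size K (a type-II censored gamma likelihood), which is one way of carrying out the same order statistics formulation; the function names are hypothetical:

    import numpy as np
    from scipy import stats, optimize
    import matplotlib.pyplot as plt

    def censored_gamma_mle(d, M, K):
        """MLE of (lambda, eta) from the M smallest of K observations.

        Maximizes  sum_{i<=M} log g(d_(i); lambda, eta)
                   + (K - M) log[1 - G(d_(M); lambda, eta)],
        the likelihood of a type-II censored gamma sample.
        """
        d_small = np.sort(np.asarray(d))[:M]

        def neg_loglik(theta):
            lam, eta = np.exp(theta)           # keep both parameters positive
            g = stats.gamma(a=eta, scale=1.0 / lam)
            return -(g.logpdf(d_small).sum()
                     + (K - M) * g.logsf(d_small[-1]))

        start = np.log([1.0 / d_small.mean(), 1.0])
        res = optimize.minimize(neg_loglik, start, method="Nelder-Mead")
        lam_hat, eta_hat = np.exp(res.x)
        return lam_hat, eta_hat

    def gamma_qq(d, M, K):
        """Gamma Q-Q plot of the squared distances d_i = x_i' A x_i."""
        d = np.sort(np.asarray(d))
        lam_hat, eta_hat = censored_gamma_mle(d, M, K)
        n1 = len(d)
        p = (np.arange(1, n1 + 1) - 0.5) / n1
        q = stats.gamma(a=eta_hat).ppf(p)      # standard gamma quantiles
        plt.plot(q, d, "o")
        plt.xlabel("gamma quantiles, shape = %.2f" % eta_hat)
        plt.ylabel("ordered squared distances")
        plt.show()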


The choice of the matrix A in step (ii) of the above description has to be discussed. A given choice leads to a specific unidimensional mapping of the x_i in terms of their sizes, and it is important to recognize that no truly multidimensional situation should be expected to be adequately captured by any single unidimensional representation. Different choices of A would yield insights into different aspects of the multivariate data, and the use of several A's in analyzing a given body of data is to be highly recommended. Gnanadesikan (1977, pp. 235-236) lists ten possible choices for A, but at least three general types among these are worth mentioning. The first is to choose A = I, which leads to d_i = x'_i x_i, the squared length of x_i. This choice is not always satisfactory since it does not permit weighting the different response variables differentially to reflect their different scales or variances. A second choice that enables one to handle this issue is A = D, where D is a p × p diagonal matrix whose diagonal elements are reciprocals of the variances of the p variables. The required variances may either be specified or estimated from the contrast vectors themselves. If the contrast vectors are used for estimating the variances then, in many applications, since at least some of them will be non-null, it would be wise to use a robust estimate of variance based on a subset of the contrasts rather than on all of them. While the second choice allows for differing variances of the response variables, it does not allow for any intercorrelations among them. A third type of compounding matrix that scales the variables, so as to allow both for differences in their variances and for varying degrees of correlation between pairs of them, is of the form A = S^{-1}, where S is a covariance matrix of the p variables. Again, S may be either specified or estimated from the contrast vectors, x_i, which are being analyzed. In this situation also it would be sensible in many applications to use a robust estimate instead of the usual covariance matrix based on all (n − 1) contrast vectors. Except for the first choice of A, it will be noted that the choice of A may depend on the data. Despite this data dependence, since once A is chosen it is common to all the d_i whose relative magnitudes are being assessed, in the viewpoint of internal comparisons adopted here it is considered to be a fixed quantity.

To conclude this section, the use of gamma probability plots of the d_i is illustrated by two applications described by Gnanadesikan (1977, Examples 36 and 42). The first utilizes data from a 2^(9-1) fractional factorial experiment which was concerned with the effects of 9 factors on the visual quality of PICTUREPHONE. The factors included such things as the sex of the observer, the sex of the person whose picture was being transmitted, contrast, and the bandwidth. There were eight response variables (p = 8) involved, each of which was a rating of the goodness of the picture on a ten-point scale as judged by human subjects used in the experiment. Of the 256 (= 2^(9-1)) treatment effects, there was particular interest in the 129 that measured the main effects (9 in number), the two-factor interactions (36 in number) and the three-factor interactions (84 in number). Exhibit 10a shows a gamma probability plot of the 129 d_i obtained from the 8-dimensional single-degree-of-freedom contrast vectors associated with these effects. The choices of A, M and K/M are indicated on the picture, as is the estimated η̂. The points seem to cohere to two intersecting straight lines, and the configuration is reminiscent of panel (c) in Exhibit 4. The presence of many higher-order interactions among the labelled points, as well as their conforming relatively smoothly to a straight line, suggests that the 129 contrast vectors seem to have two different covariance matrices, thus casting a doubt on the homoscedasticity of the initial observations. In this example, going back to study the design of the experiment, it became clear that some of the factors (e.g., sex of the observer) had indeed been so-called whole-plot factors while others (e.g., contrast) had been administered as so-called split-plot factors. Whether the split-plot design was deliberate or inadvertent, the whole-plot factors would be expected to have a larger error dispersion than the sub-plot ones. In the present example, the factors labelled A, B, C and D were whole-plot ones and the remaining (E through I) were sub-plot ones, and indeed 14 of the 18 labelled points are for effects that are confined entirely to the whole-plot factors, and they largely account for the line of steeper slope in Exhibit 10a. A more sensible analysis in this situation would be to remove the 14 contrast vectors that account for the main, two- and three-factor interaction effects of A, B, C and D amongst themselves and to look at the remaining 115 contrast vectors as a separate set. A gamma probability plot of the 115 d_i associated with the latter set is shown in Exhibit 10b, which also specifies the choices of A, M and K/M and states the estimated η̂. In this configuration, which is reminiscent of panel (b) in Exhibit 4, one can interpret the treatment effects associated with the top 7 or 8 points as being real. Gnanadesikan (1977, Example 36) also uses this set of data to demonstrate how a multivariate analysis can be more sensitive than separate univariate analyses of each response variable. Specifically, while Exhibit 10b enables one to identify 7 or 8 treatment effects as possibly real ones, pooling the findings from eight separate analyses using the techniques of Section 3.1.1 leads to delineating only 3 or 4 of the effects as possibly systematic.

The next example illustrates the ability of the gamma probability plot method to reveal another kind of peculiarity in the data which can often totally distort a formal test of hypothesis in ANOVA or MANOVA but still go undetected.

[Exhibit 10a: gamma probability plot of the 129 ordered squared distances against gamma quantiles; the labelled points, largely effects of the whole-plot factors (e.g. ACD, ABD, BCD, ABC, AB, together with AG, AE, CG), cohere to a line of steeper slope.]

Exhibit 10a. Gamma Q-Q plot of L = 129 squared distances (d_i) from a 2^(9-1) experiment on PICTUREPHONE; A = I, M = 64, K/M = 2, η̂ = 2.33 (Gnanadesikan, 1977, p. 238).

[Exhibit 10b: gamma probability plot of the 115 ordered squared distances against gamma quantiles; the top points (including G, CG, AG, AF, AE) break away from the linear configuration.]

Exhibit 10b. Gamma Q-Q plot of L = 115 squared distances (d_i) chosen as a subset from Exhibit 10a; A = I, M = 57, K/M = 2, η̂ = 2.40 (Gnanadesikan, 1977, p. 239).


The illustration (see Gnanadesikan, 1977, Example 42) pertains to 10 batches of transistor devices, each batch consisting of 10 devices, and the "data" in question were values of three fitted coefficients (α̂_ij, β̂_ij, γ̂_ij) characterizing the aging behavior of the jth device in batch i, i = 1, ..., 10; j = 1, ..., 10. To look for systematic differences across batches, separate one-way ANOVA of the α̂, the β̂ and the γ̂, as well as a one-way MANOVA of the three-dimensional "data", (α̂, β̂, γ̂), can be carried out. Exhibit 11a shows the univariate ANOVA table obtained by analyzing the γ̂'s. The significance level associated with the observed F-ratio is large enough that the formal inference would be that there are no significant differences amongst the batches. The ANOVA F-tests for the other two coefficients and the usual variety of MANOVA tests (see Chapter 2) also led to the same conclusion in this example. On the other hand, Q-Q plotting methods yielded a great deal of insight into the data, and one such analysis is presented next.

Studying residuals is an integral part of good data analysis and, in the present case, one can look at the residuals in the one-way classification, viz. e_ij = (α̂_ij − α̂_i·, β̂_ij − β̂_i·, γ̂_ij − γ̂_i·), where α̂_i· is the mean of the α̂_ij for the devices in the ith batch, and similarly β̂_i· and γ̂_i· are batch means. Considering the e_ij as 100 approximately single-degree-of-freedom vectors, one can employ the gamma probability plotting technique of this section to compare the relative magnitudes of the residual vectors via a measure of their size. An inordinately large residual would point to a possible outlier in the data. At any rate, using the inverse of a robust covariance matrix (S*) of the three-dimensional residuals as a compounding matrix, one can compute 100 values of d_ij = e'_ij (S*)^{-1} e_ij and make a gamma probability plot of these. The d_ij values here are treated exactly as the d_i were treated in the earlier discussion, and once again the shape parameter required for the gamma probability plot is estimated from the smallest order statistics. Exhibit 11b shows a gamma plot obtained in this way with points labelled by batch and device number. Clearly the residual for device 7 in batch 1 is inordinately large, and this analysis has identified an outlier. Exhibit 11c shows a replot of the 99 d_ij remaining after the largest one is omitted.

    Source                     DF     SS         MS        F-ratio
    Mean                        1     13.868     13.868
    Batches                     9     13.333      1.481    1.12;  P(F_{9,90} ≥ 1.12) > 0.3
    Devices within batches     90    118.615      1.318
    Total                     100    145.816

Exhibit 11a. One-way ANOVA of γ̂.
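The residual analysis just described might be sketched as follows (Python; the array shape and the use of the ordinary covariance matrix in place of the robust estimate S* of the text are simplifying assumptions made here):

    import numpy as np

    def residual_quadratic_forms(coef):
        """Quadratic forms of one-way MANOVA residuals.

        coef: array of shape (batches, devices, p), here p = 3 for the
        fitted coefficients (alpha, beta, gamma) of each device.
        """
        e = coef - coef.mean(axis=1, keepdims=True)  # deviations from batch means
        E = e.reshape(-1, coef.shape[-1])            # the 100 residual vectors
        S = np.cov(E, rowvar=False)                  # ordinary covariance; the
        Sinv = np.linalg.inv(S)                      # text uses a robust S*
        d = np.einsum("ij,jk,ik->i", E, Sinv, E)     # d_ij = e' S^{-1} e
        return np.sort(d)

The sorted d values can then be passed to a gamma Q-Q routine such as the gamma_qq sketch of Section 3.2.1, with the shape parameter again estimated from the smallest order statistics.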

[Exhibit 11b: gamma probability plot of the 100 ordered quadratic forms d_ij against gamma quantiles; the topmost point, corresponding to device 7 of batch 1, is far above the rest.]

Exhibit 11b. Gamma Q-Q plot of quadratic forms of MANOVA residuals (Gnanadesikan, 1977, p. 267).

[Exhibit 11c: gamma probability plot of the remaining 99 ordered quadratic forms; the labelled points, including (1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (1,8) and (1,10), i.e. the other residuals of batch 1, stand out above the rest.]

Exhibit 11c. Replot of Exhibit 11b (Gnanadesikan, 1977, p. 268).


There appear to be 9 other inordinately large values, which are labelled on this picture. A striking thing is that these are all the remaining residuals in batch 1! One possibility is that all of the devices in batch 1 are outliers. Another is that device 7 of this batch is such an extreme outlier that it distorts the batch mean vector, (α̂_1·, β̂_1·, γ̂_1·), so badly that every residual vector in the batch is also badly biased (although the residuals in the batch have to add to zero!). Indeed, it turned out that the second explanation was the pertinent one for the data, as discussed more extensively by Gnanadesikan (1977). For present purposes, suffice it to emphasize that the distorted residuals from batch 1 have clearly inflated the error dispersion so much that the formal tests of hypotheses all fail to detect batch differences even though they do exist!

3.2.2. Q-Q plots for Cell V of Exhibit 1

The multivariate analogue of the univariate situation considered in Section 3.1.2 is one in which there are k MANOVA sum-of-products matrices, S_1, S_2, ..., S_k, each being p × p and based on ν degrees of freedom, and the interest is in internal comparisons of "size" of the dispersions summarized by these matrices. An entirely equivalent problem is the one of internal comparisons of "size" amongst the mean sum-of-products matrices, S_i/ν, i = 1, ..., k. Matrices such as S_i arise in formal MANOVA hypothesis-testing problems that involve k mutually orthogonal sets of hypotheses, each of which has a matrix due to the hypothesis (see Chapter 21) of rank ν. For example, for testing hypotheses of no main effects in an m-level factorial experiment with p response variables, one would have a sum-of-products matrix associated with each main effect based on ν = (m − 1) degrees of freedom. Also, in analogy with the univariate example mentioned in Section 3.1.2, with (ν + 1) "within-cell" replications of a p-dimensional response, one may wish to assess the validity of assuming that the dispersions within cells are all the same, at least in their "sizes". The assessment involved here is a comparison of the "sizes" of the different within-cell covariance matrices, each of which would be based on ν degrees of freedom.

One needs ways of measuring the "size" of dispersion matrices as a first step and clearly, from a data analysis viewpoint, having alternate measures of size that lead to different insights into the dispersion structure is both desirable and realistic. Two functions of the eigenvalues of a dispersion matrix may be used as two different measures of its "size": the arithmetic and geometric means. (See Roy et al., 1969, Chapter II, 3.) Since in many MANOVA situations p may exceed ν, thus implying that (p − ν) of the eigenvalues will be zero, one can consider taking the arithmetic and geometric means of just the non-zero eigenvalues as the measures of size.

164

R. Gnaroadesikan

The arithmetic mean is sensitive both to large and small eigenvalues while the geometric m e a n is particularly sensitive to small eigenvalues which are of special interest as indicators of reduction of dimensionality. Because of their differing sensitivities, the two measures would tend to lead to diffeo rent insights. To arrive at the final forms of the two measures, however, another issue has to be taken into account. Since the p response variables m a y be measured on very different scales, one m a y want to weight the deviations from null conditions in each of the variables differently. To a c c o m m o d a t e this, one can scale the initial sum-of-products matrices by a p p positive semi-definite matrix A to obtain the collection, S i A ( i ~ 1 . . . . . k ) , and for measuring the "size" of SiA consider the two alternate measures: ai = arithmetic m e a n of the non-zero eigenvalues of SiA, i = 1 ..... k, gi = geometric m e a n of the non-zero eigenvalues of SiA, i = 1 ..... k. (Note: Some people prefer to use the sum and product of the eigenvalues instead of the arithmetic and geometric means. F o r computing the sum (or arithmetic mean) of the eigenvalues, of course, one need not carry out an eigenanalysis since it is the trace or the sum of the diagonal elements of the matrix in question.) The issues in choosing A here are exactly the same as the choice of the compounding matrix A discussed in Section 3.2.1 and several choices of A would be appropriate once again. Also, whether A is specified or estimated from the data, since it is c o m m o n to all k values of a; (or gi), it is considered a fixed quantity. The problem of internally comparing the relative magnitudes of the SiA is, at any rate, viewed as one of internal comparisons amongst the a i or the gi, and one needs an eval~mting distribution for each of these summaries Given such an evaluating distribution, say for the ai, one can use it as a null backdrop and plot the ordered values 0.Qa(1 ) ~< a(2 ) < - - . <.a(k ) against the corresponding quantiles of the evaluating distribution. A g a m m a distribution with unknown scale and shape parameters, which need to be estimated, turns out to be an adequate approximation for this purpose, and Gnanadesikan and Lee (1970) discuss a rationale for this approximate result. In the M A N O V A situation, to minimize the effects of possible real sources of variation, exactly as in Section 3.2.1 it would be sensible to base the estimation of the scale a n d shape parameters of the g a m m a distribution on an order statistics formulation.

(ii)

Graphical methods for internal comparisons in A N O VA and M A N O VA

165

Specifically, if a0)<a(2 ) < . . . < a ( ~ t ) < . . . <a(K ) < . . . ~<a(k ) are the ordered arithmetic means of eigenvalues, then considering the first M ( < K) of these as the M smallest order statistics in a random sample of size K (to be specified as a number < k) one can estimate the scale and shape parameters, ?ta and ~a, of the approximating gamma distribution for the ai's. The use of the maximum likelihood approach is again reasonable, and with the m/e ~ in hand one can make a gamma probability plot of the k points whose coordinates are {G-l(pi;l,~),a(o}, i = 1 . . . . . k, where G - l ( p i ; 1,~a) is the quantile corresponding to a cumulative probability Pi (e.g. ( i - )/k) for a standard gamma distribution with shape parameter ~ . For internal comparisons among the gi's, one starts with tlae ordered values g o ) < " " < g(k) and repeats the steps of the preceding paragraph with these in place of the a(0 to obtain role Xg and ~g for the parameters of the evaluating gamma distribution. The corresponding gamma probability plot has k points whose coordinates are ( G - l(pi; 1, lg),g(i)}, i = 1..... k. The Q-Q plotting techniques for the a~ and gi are illustrated in the context of a discriminant analysis problem which was concerned with identifying people on the basis of speech spectrograms of their utterances of certain words (see Bricker et al., 1971). The example was used by Gnanadesikan and Lee (1970) who proposed the methods described in this section and it was concerned with assessing the similarity of 6 6 covariance matrices of 10 people. In a standard discriminant analysis (which has close parallels with a one-way classification MANOVA) such matrices are pooled to provide the so-called within-groups dispersion matrix and it is legitimate to be concerned about the validity of such a pooling procedure. Disregarding orientational aspects, one can use the above Q-Q plotting techniques of the arithmetic and geometric means of the 10 covariance matrices to study at least the similarity of "size". Exhibit 12a (reproduced from Gnanadesikan, 1977) shows a gamma probability plot for tile arithmetic means while Exhibit 12b (also from Gnanadesikan, 1977) is the plot for the geometric means. (Note: In fact, Exhibit 12a uses the sums of the eigenvalues instead of the arithmetic means but the omitted multiplicative constant would only influence the slope of the configuration.) The points in each exhibit are labelled 1 through 10 to correspond with the 10 persons involved. The choices for A, K and M for both plots were, respectively, I, 10(= k) and 5, and the role of ~a and ~g were 4.236 and 4.245. As seen from the ordinate scales of the two plots, however, the scale parameter estimates X, and Xg would be extremely different. At any rate, the interpretation of both configurations is that at least in terms of the two size measures, the covariance matrices of the 10 people are not markedly different and one could feel comfortable about pooling them for purposes of the discriminant analysis (or one-way classification MANOVA). Also, the differences in the internal ordering of the 10 covariance matrices

i66 5.86

R. Gnanadesikan

09

3.21

o
x

2.57

06

_o
t.95 07 ~o 1
~J 0 0

1.29 o8 0.64 04

e3

e5

0 0

I t ,42

I 2.85

L_ 4.25

1 5.66

I 7.08

L49

GAMMA QUANTILES

E x h i b i t 12a.

G a m m a Q-Q plot of s u m s of e i g e n v a l u e s of k = 10 c o v a r i a n c e m a t r i c e s - - A =

I, M = 5, K / M = 2,~ = 4.236 ( G n a n a d e s i k a n , 1977, p. 256).

1.45 o9 1.19
o
X

~ 0.95

i-I1 cO

5 o3 =2 o4 0.48 o6.I

n,. 0.71

~8

o 0.24

o7

0 0

I 1.42

I__J_~_ 2.84 4.25

I 5.67

I 7.09

8.5t

GAMMA QUANTILES

E x h i b i t 12b. G a m m a Q-Q plot of g e o m e t r i c m e a n s of e i g e n v a l u e s of k = 1 0 m a t r i c e s - - A = I, M = 5, K / M = 2, ~ = 4.245 ( G n a n a d e s i k a n , 1977, p. 257).

covariance

Graphical methods for internal comparisona' in A N O VA and M A N O VA

167

in the two plots are interpretable in terms of the different sensitivities of the arithmetic and geometric means mentioned earlier. For example, the covariance matrix for the sixth person is second largest as judged by Exhibit 12a but second smallest as indicated by Exhibit 12b. This is explainable by the fact that this covariance matrix had a noticeably small eigenvalue which thus had a major influence on the geometric mean.
4. Summary and conclusions

Some Q-Q probability plotting methods for orthogonal ANOVA and MANOVA situations have been described in this chapter. The essential idea common to all of them is firstly to decide on an appropriate summary statistic for capturing the information that is of interest, and secondly to use an evaluating distribution to provide a null background by means of its quantiles against which the ordered observed values of the summary statistic are plotted. Two key differences between the procedures for the univariate and multivariate situations are: (1) the need in the latter case to conceptualize and specify measures of size which in turn raise the issue of appropriate scalings of the different variables; and (2) the need to estimate an appropriate evaluating distribution in the multivariate case which in turn involves flexibility and judgemental aspects such as those involved in the order statistics approach discussed in Section 3.2. The idea of estimating the evaluating distribution appears in practice to introduce a desirable robustness into the procedure in the sense of expanding the ability of the Q-Q plot to handle quite non-normal data. This suggests the possibility that even in the univariate situation instead of specifying the evaluating distribution, say as a X{1) in Section 3.1.1 or Xg~)in Section 3.1.2, one could try to estimate it from the data just as in the multivariate case. The main advantages of the Q-Q probability plotting methods are that they: (1) provide easily assimilable displays of comparable quantities which permit one to pinpoint places where the interesting structure is; (2) allow the data to influence facets of the analysis, such as the gleaning of the items that belong essentially to the error configuration, instead of having to prespecify these; and (3) utilize some model or null assumptions to generate the analysis but not requiring that these be bought unquestioningly, since the configuration on the plot often reveals violations of assumptions and unanticipated peculiarities in the data. Work on methods for constructing "confidence bands" for Q-Q plots is in its nascent stage and has some ways to go before resulting in sensitive and useful tools for routine data analysis. Using the standard deviations of order statistics, especially well-separated ones (e.g. extremes, quantiles and mechans), to obtain bands of uncertainty for Q-Q plots is a simple procedure for aiding judgments of straightness of Q-Q plots.

168

R. Gnanadesikan

Appendix I. Computations of quantiles


L1. The normal distribution

The value x that satisfies the equation,


(x)= 1

x -

2 e- t/2dt=p,

is the quantile of the standard normal distribution corresponding to the cumulative probability value p. An approximation 2 to x, due to Hastings (1955, p. 192), may be used and is repeated here. If 0 <p < ,

2= --rl+ {(ao+a,rl+a2rl2)/(1 + b,~/+ b2~ 2 + b3~/3)},


where ~ = ( - 2 1 n p ) 1/2, %=2.515517, a1=0.802853, a2=0.010328, b 1= 1.432788, b2=0.189269, b3=0.001308.

If i < p < 1, use (1 - p ) in place o f p in the definition of ~ and use - 2 as the approximation for the required x. The approximation is adequate except for extremely small or extremely large values of p. For greater accuracy the methods described by Cunningham (1969) or Milton and Hotchkiss (1969) may be used. For users who do not want to or cannot have access to a computer for calculating the quantiles, commercially available (from Keuffel and Esser Co., 40 E 43rd Street, New York, N.Y. 10017) normal probability paper can be used. In using this special purpose graph paper, the ordered observations would be plotted against the cumulative probability levels that are used to label one of the axes instead of the quantile values.

L2.

The half-normal distribution

A quantile of a half-normal distribution may be obtained by using the relationship between the cumulative distribution functions of the halfnormal and the (full) normal distribution and thus enabling the use of the method described above. Specifically, if x ( > 0) is a value that satisfies the equation 2 x H(x) = ( X / ~ ) f 0 e-t~/2dt=p, then x also satisfies the equation
(x) = 1 +p

Graphical methods for internal comparisons in A N O V A and MANOVA

169

Thus, using (i + p ) / 2 in place of p in the description of the method for computing the quantile of the normal distribution is all that needs to be done. For users who wish to use special probability paper instead of computing the quantiles, all that is needed to make half-normal probability paper is a relabelling of the cumulative probability axis on regular normal probability paper. Specifically, one would delete all values of p less than 50% on the normal probability paper and replace values of p greater than 50% by the values 100(2p- 1)% (see Daniel, 1959, 2).
L3. The chi-squared distribution

The quantile of a X~ distribution corresponding to a probability p is most simply obtained as the square of the quantile of the half-normal distribution corresponding to the same probability p. Hence, for this case, the problem reduces to the one in (2) above, which in turn relates to the method in (1) above. The quantile of a X~2)distribution corresponding to a probability p is just
- 2 In(1 - p ) .

For other values of the degrees of freedom, one can either use the relationship between the chi-squared distribution and the gamma distribution (for which methods of obtaining quartiles are described in (A.I.4) below) or use a scheme suggested by Goldstein (1973). If Zp denotes the quantile of a standard normal distribution corresponding to a cumulative probability p, then Goldstein (1973) proposes using an extension of the well-known Wilson-Hilferty asymptotic approximation of a X2 variate by a normal variate. Specifically, if x denotes the quantile of a X~) distribution corresponding to a cumulative probability q ( = l - p ) , and if 2 < 1 , < 2 + 4[Zpl, Goldstein (1973) suggests the polynomial approximation, x ~ p { (1.0000886 - 0.2237368/~ - 0.01513904/l, 2)
+ yp (0.4713941 + 0.02607083 / ~,- 0.008986007/pz)

+),2(0.0001348028 + 0.01128186/1, + 0.02277679/1,2) + 9 ( - 0.008553069 - 0.01153761/u - 0.01323293/1,2) +y4(0.00312558 + 0.005169654/~, - 0.006950356/t, 2) +yps( _ 0.0008426812 + 0.00253001/1, + 0.001060438/1,2) + y6(0.00009780499 - 0.001450117/p + 0.001565326/~,2) } 3, (A.I.I) where yp ffi ~,- l/:Zp.

170

R. Gnanadesikan

Users interested in special purpose probability papers should be aware that such are available for values of v = 1(1)10 from Technical and Engineering Aids for Management (104 Belrose Avenue, Lowell, Mass. 01852; paper numbers 611-1 through 611-10 cover respectively the values of v from 1 through 10).

L 4.

The gamma distribution

The quantile, x, of a standard gamma distribution corresponding to a cumulative probability p satisfies the equation 1 x G(x; 1,7) = F - ~ f o u n - ' e - U d u = p ' where 7 > 0 is the shape parameter. A numerical method for computing the required quantile would be to have a way (e.g., series expansion) of computing an approximation to G(x; 1,71) for given values of x and 7/ and then using an iterative approach to finding a value 2 of x such that G(2; 101) is sufficiently close t o p . The N e w t o n - R a p h s o n technique is one well-known iterative procedure for determining 2. Another method (see Wilk et al., 1962a, 3) is to find bounds x L and x u such that X L < X < X u and then obtain 2 by repeatedly "halving the interval" until G(2; 1,4) is sufficiently close to p. The steps involved in this second process are explicitly described below. A convergent series approximation is specified by

fo
where

Xu n l e - " d u ~ x n e - X / q ,

(A.I.2)

q--1

xj

q--1

rq(X,7)= E

j=o 7/(r/+ 1 ) - . . (T/+j)

= Z 5,
j=o

do= 117,
4 + , = xg/(TI + J + 1),
and q ( > x 7/) is chosen such that (A.I.3)

(~1+ q - x) dq--1 ~ l o - 7 r q ( x , 7 )

(A.I.4)

If the number of terms, q, in the approximating series is chosen so as to satisfy this constraint, the relative error in approximating the infinite series, T~(x,~l), by the finite series, Tq(x,~l), will be less than 10 -7.

Graphical methods for internal comparisons in A N O VA and M A N O VA

171

For making a gamma probability plot of n points one needs to obtain n quantiles, 21 < 2 2 < < 2., which satisfy the equations
2~" e - ; '

7](x,,r " O-p,r(~), -~

i=l,., ., n,

(A.I.5)

where Pi is a specified cumulative probability (e.g. (i-)/n) and the evaluation of the series Tq is described above. The gamma function in the right-hand side of Eq. (A.I.5) may be computed by using a rational function approximation given by Hastings (1955, p. 158): if 0 < ~ < 1 if 1 <'q < 2
= (rtf + l ) ( ~ f + 2 ) . . -

(A.I.6)

(rtf+ [~/]- 1)F(1 + fly),

if 77> 2

where [ ~/] = the integral part of ~/, ~f = fractional part of ~/, so that

and

r(1 +
where

1 + affly+ a2 l + " "

+ a8@,

(A.I.7)

a I = -0.577191652,

as=-0.756704078, a6=0.482199394, a7= -0.193527818, a8=0.035868343.

a2= 0.988205891,
a3= -0.897056937, a4=0.918206857,

Given a value of Pi and the computational formulae (A.I.6) and (A.I.7), the right-hand side of eq. (A.I.5) may be calculated. Also, a lower bound on the required quantile 2 i is known to be

2iL = [

172

R. Gnanadesikan

and this can be calculated using the same method. Next, noting that the left-hand side of eq. (A.I.5) is an increasing function of 2 i and assuming that 2;+ 1 has already been found, the left-hand side of eq. (A,I.5) can be evaluated mid-way between 2,L and 2i+ 1, i.e. at x m = (2iL + 2-+ 1)//2. If the evaluated quantity is smaller (larger) than the computed value of the right-hand side then the next trial point is the midpoint of the upper (lower) half-interval since x m is then smaller (larger) than 2,. This process of halving intervals can be continued until the required root, 2,, of eq. (A.I.5) is determined to the desired number of significant digits. Having obtained 2i, one can repeat the steps to find 2~_ 1 starting with the fact that x(/_ 1)L()~i--1 <3~i" This process will yield all of the quantiles once a method of obtaining the largest quantile, 2~, is specified. One way for determining 2n is to evaluate the left-hand side of eq. (A.I.5) for values of r2~L using successively r = 1,2, 3... until, say for r = r*, the left-hand side exceeds the right-hand side of eq. (A.I.5) as computed for i = n. Then the method o f halving intervals can be applied starting with the interval [(r*-1)2nL, r*2~L] to once again determine 2~ to the desired degree of accuracy. The procedure just described for computing the quantiles entails repeated numerical evaluation of [F(,/)G(x; 1,,/)], and the convergence of the series approximation involved is rapid for small values of x (relative to 7/) but slow when x is large (relative to ~/). For the latter situation, Jaeckel in unpublished work suggests approximating (F(,/)[ 1 - G(x; 1, 7/)]} by
k

x n - l e -x ~, f j + [(T/--1)--. ( T l - k - l ) x * l - k - l e - x / x - - ' O l - k d l - 2 ] ,
j=0

where fo = l, fj = (TI - j ) f j _ l/X, and k is chosen large enough. Given the quantiles, 21 < .. < 2 n, of a standard gamma distribution the quantiles, Yl ~ ' " "~fn, of a general gamma distribution with origin parameter a ( - oe < a < + ~ ) and scale parameter h ( > 0) can be obtained using the simple linear relationship, fii =(xi + a)/)t. As a special case, the quantiles of a chi-squared distribution with p degrees of freedom may be obtained from those a standard gamma distribution by utilizing the fact that a - - 0 , )~=7 1 and 7/ p / 2 for such a chi-squared distribution. This relationship between the quantiles of a gamma distribution and those of a chi-squared distribution raises the question of why not use (A.I.1) with 2~/ in place of i, to compute the appropriate quantile of a standard gamma distribution. Indeed such an approach may lead to an adequate result when ~7 is moderate or large, and even for all values of */ it may yield a good initial value of the quantile for use in the more elaborate iterafive computations (described above) for an improved determination of the quantile.
=

Graphical methodsfor internal comparisons in ANOVA and MANOVA

173

Appendix II. Computation of maximum likelihood estimates parameters of a gamma distribution

(mle) of

Given the M smallest order statistics, O < X ( I ) < X ( 2 ) < ' ' ' < X ( M ) , in a random sample of size K (known and > M) from a gamma distribution with density g(x;~k, Tt)= ~-~-)-xn--'e -x~, x>O;X>O,~>O,

the problem is to determine the mle of the scale and shape parameters M 1/M/X(m ) and S = and 7, respectively. If ~=~X(M ), P=(IIi=lx(g)) Y~.Ix(o/MX(M), the likelihood equations that need to be solved simultaneously for ~ (and thence ~) and ~ may be written as l n e = K F'(~) M r(n) ln~K_I
'

and (A.II.1)

_nl( K ) e-~ S - ~ - -~ --~ - 1 J 07, ~ ) '


where

j(n,g)=
and

J'(~,~)=

J(~/,~') =

un-'lnue-~Udu.

The left-hand sides of eq. (A.II.I) are functions of the observations alone while the right-hand sides involve fairly complicated functions of ~" and 71, and the required mle have to be computed by iterative techniques of solving (A.II.1). Wilk et al. (1962b) describe numerical procedures for this and what follows is a summary of their suggestions. If we denote the right-hand sides in (A.II.I) as P(B,~) and S01,f) respectively to emphasize the fact that they involve B and f, and if the corresponding left-hand sides in (A.II.1) are denoted P0 and S Oto emphasize that these are observed summaries calculated from the M smallest order statistics, we can rewrite (A.II.1) as

Po=POI,~),
The functions P ( ~ , f )

and and

So=S(~,~ ).

(A.II.2)

S(B,~) involve the di-gamma function,

174

R. Gnanadesikan

F'(n)/F(rt ), and the functions, d ( n , f ) and J ' ( n , f ) , defined following (A.II.1). Given numerical methods for computing F'(n)/F(~), J ( ~ , f ) and J ' ( n , f ) , for given values of 7) and ~, one can therefore compute the functions P(n,~) and S(n,~). Starting with trial values of 77 and ~, one can determine "corrections" to such values by "matching" the computed right-hand sides in (A.II.2) with the known left-hand sides and iterating this process until the corrections become negligible. Specifically, suppose To and ~0 are initial trial values and ~) and f are sufficiently close to % and go so that the following truncated Taylor series will be reasonable approximations: ap

P(n, ~) =-P(no,~o) + (7 - T o ) ~ + ( ~ - fo) 0 f '


+

0P

where the partial derivatives are evaluated at ~/= V)o,~ = ~o. Eq. (A.II.2) may be reexpressed approximately as

(,) -

To) ~

oe 0S

+ ( ~ - ~o) -o~

OP = P o - *'(To, ~o), OS (A.II.3)

(7 - To)a-::o, + (~ - ~o) ~

= S o - S(no, ~o).

The partial derivatives in the left-hand sides of (A.II.3) are constants since they are evaluated at (no, ~o) and the right-hand sides are constants too, so that the two equations in (A.II.3) can be solved simultaneously for the two quantities ~)-no ( = x ) and ~ - f o (=Y). The values of x and y thus determined may be used as "corrections" to no and fo yielding aq*= no+ x, ~* = ~o +Y. The entire process can then be repeated with ~)* and f* as the new trial values and iterating until values ~ and ~ are found such that t P 0 - P(~,~)[ and IS0- S(~,~)[ are adequately small. The partial derivatives in (A.II.3) can be derived explicitly by going back to the definitions of P ( n , f ) and S(n,~) and then developing numerical methods for computing functions such as the tri-gamma, 0 J'(,~,~') 0 J'Cn,~') On J(n,~) ' o~ J(,7,~) ' etc. An alternative approach is to estimate the required values of the partial derivatives by appropriate divided differences. Specifically if ns and

Graphical methods for internal comparisona" in A N O VA and MA N O VA

175

~1 are "close" to To and ~o respectively, the required approximations are

0e ~o,~o ~ b_ = e , o - eoo. 07 AT 7 1 - To ' OS AS S l o - Soo.

0e ~ Ae = Po,- t'oo 0~ l.o,~o A~ ~ - ~o ' as AS Sol - Soo.

.o.~o~- A-~ =

hi-no

'

~o.~o---- A~ =

~,-~o

'

where Poo = P(To, ~o), Plo = P(~h, ~'o), Pol = P(no, ~'l), Slo = S(v/1, ~'o), and Sol = S(no, 5 ) . If one uses divided differences to approximate the partial derivatives in (A.II.3) and the iterative scheme described above for determining successive corrections until ~ and ( are obtained, the only remaining requisite is a method for computing P(T,~') and S(n,~) for given values of n and ~. This in turn means that ways of computing values of F(T), F'(n)/F(n), J(n,~) and J'(n,~) are needed. A method due to Hastings (1955) has been described in Appendix I (see A.I.6 and A.I.7) for computing F(T). The di-gamma function may be computed from the approximation, r ' ( n ) __ r(T) }In[ (10+ nf)(11 +Ty)] +

6(10 + ny)(11 + Tf) -

1 1 -~ + T---+-i- +"'+

1 ~

] J

if 0 < T < l l , ln[(T(n -- 1))] + 6 n ( T - 1) ifn>~ll. (A.I1.4) The function J(T,~) can be written as J(n'~') = r(~)

{n foluo-le-~"du'

and a series approximation for the integral on the right-hand side was described in Appendix I (see A.I.2-4). Lastly, for evaluating J'(T,~) it is useful to note that
J'(~'~') =

~(_~)_ [ F'(n)_ln~]_folUn_,lnue_~Udu
r(~)
'

176

R. Gnanadesikan

a n d that the integral o n the r i g h t - h a n d side c a n b e a p p r o x i m a t e d b y a series as follows:

1
fo u n-1 l n u e - ~ U d u ~ - e q--1 E djej; j=0

-~ ~ j=o "q(~+l)'~'(~+J)

(1

1
"'" + -~

~+

=--e-~

(A.II.5)

w h e r e d o = 1/~/, dj+, = d j ~ / 0 1 + j + 1), e 0 = l / r / , e j + l = ~ + 1 / ( ~ / + j + 1), a n d the n u m b e r of terms q is c h o s e n large e n o u g h to ensure a n a d e q u a t e l y s m a l l relative e r r o r in a p p r o x i m a t i n g T'~(~,~I) b y Tq(~,~/)---xq-~dj~. F o r e x a m p l e , if it is d e s i r e d to e n s u r e t h a t the relative error d o e s n o t e x c e e d 10 - 7 t h e n q ( > ~ - ~ 7 ) s h o u l d b e large e n o u g h to satisfy the test t h a t

~{~l+q+qOJ+q-~))
r/(rt + q - ~.)2

< 10_7Tq(.~,.r/).

References
Box, G. E. P. (1954). Some theorems on quadratic forms applied in the study of analysis of variance-I. Ann. Math. Statist. 25, 290-302. Bricker, P. D., Gnanadesikan, R., Mathews, M. V., Pruzansky, S., Tukey, P. A., Wacliter, K. W., and Warner, J. L. (1971). Statistical techniques for talker identification. Bell Syst. Tech. J..50, 1427-1454. Cunningham, S. W. (1969). From normal integral to deviate. Algorithm AS24, Appl. Statist. 18, 290-293. Daniel, C. (1959). Use of half-normal plots in interpreting factorial two-level experiments. Technometrics 1, 311-341. Davies, O. L., Editor (1956). Design and Analysis of Industrial Experiments. Second edition, Hafner, New York. Cmanadesikan, R. (1977). Methods for Statistical Data Analysis of Multivariate Observations. Wiley, New York. Gnanadesikan, R. and Lee, E. T (1970). Graphical techniques for internal comparisons amongst equal degree of freedom groupings in multiresponse experiments. Biometrika 57, 229-237. Gnanadesikan, R. and Wilk, M. B. (1970). A probability plotting procedure for general analysis of variance. J. Roy. Statist. Soc. B 32, 88-101. Goldstein, R. B. (1973). Chi-square quantiles. Algorithm 451, ComrrL ACM 16, 483-485. Hald, A. (1952). Statistical Theory with Engineering Applications. Wiley, New York. Hastings, C. (1955). Approximations for Digital Computers. Princeton Univ. Press. Milton, R. C. and Hotchkiss, R. (1969). Computer evaluation of the normal and inverse normal distribution functions. Technometrics 11, 817-822. Patnaik, P. B. (1949). The non-central X2 and F-distributions and their approximations. Biometrika 36, 202-232.

Graphical methods'for internal comparisons in A N O V A and MANOVA

177

Roy, S. N., Gnanadesikan, R. and Srivastava, J. N, (1971). Analysis and Design of Certain Quantitative Multiresponse Experiments. Pergamon Press, Oxford, New York. Satterthwaite, F. E. (1941). Synthesis of variance. Psychometrika 6, 309-316. Wilk, M. B. and Gnanadesikan, R. (1961). Graphical analysis of multiresponse experimental data using ordered distances. Proc. Nat. Acad. Sci. U.S.A. 47, 1209-1212. Wilk, M. B. and Gnanadesikan, R. (1964). Graphical methods for internal comparisons in multirespouse experiments. Ann. Math. Statist. 35, 613-631. Wilk, M. B. and Gnanadesikan, R. (1968). Probability plotting methods for the analysis of data. Biotruetrika 55, 1-17. Wilk, M. B., Gnanadesikan, R. and Huyett, M. J. (1962a). Probability plots for the gamma distribution. Technometrics 4, 1-20. Wilk, M. B., Gnanadesikan, R. and Huyett, M. J. (1962b). Estimation of parameters of the gamma distribution using order statistics. Biometrika 49, 525-545.

P. R. Krishnaiah, ed., Handbook of Statistics, Vol. 1 North-Holland Publishing Company (1980) 179- 197

Id

Monotonicity and Unbiasedness Properties of ANOVA and MANOVA Tests


Somesh Das Gupta*

1.

Introduction

The multivariate analysis of variance problem for the normal case m a y be posed as follows: Let X :p x n be a r a n d o m matrix such that its column vectors are independently distributed as %p (., E), where y is an unknown positive-definite matrix; moreover, (X')=AO, (1.1)

where A :n x m is a known matrix of rank r and O : m x p is a matrix of unknown parameters. The problem is to test H 0 : G q 9 - - 0 against Hi: G'(gv~0, where G ' is a known s x rn matrix of r a n k s such that G = A'B for some B :nXs. This problem can easily be reduced to the following canonical form: Let Y1. . . . . Y, be n independently distributedp x 1 r a n d o m vectors such that Y ~)Lp(/Z~,Z), where /#+l . . . . . f t , = 0 , and E along with /q ..... /~r are unknown, E being positive-definite. The problem is to test H0: if, . . . . . m--0 (1.2)

against Hi: "not H0", where s<r. In this set-up ~ 0 _~-s = , ~ = ~ ~v y,~ and S e = Y~]=r+l Y~ Y" are called the sums of products (s.p.) matrices due to the hypothesis H 0 and error, respectively; the corresponding degrees of freed o m are s and ne-~n - r . The following tests (represented by their acceptance regions) are mostoften considered in the literature: (a) Likelihood-ratio test:

det(Se)/det(S~+ So) >~kl, ( 0 < k I < 1).

(1.3)

*Partially supported by a grant from the Mathematics Division, U.S. Army Research Office, Durham, N.C. Grant no. DAAG-29-76-G-0038. 179

180

Somesh Das Gupta

(b) R o y ' s m a x i m u m - r o o t test: m a x [ c h a r a c t e r i s t i c root of SoSe -l] < k 2, (e) L a w l e y - H o t e l l i n g ' s trace test: tr(SoS~-' ) ~<k3, (0<k3). (1.5) (0<k2). (1.4)

(d) B a r t l e t t - N a n d a - P i t l a i trace test: i f [ S o ( S o + Se) -1] <.k 4,

(O<k4<min(p,s)).

(1.6)

N o t e that the first three tests are defined only when ne >~p in which case S e is non-singular with probability 1. T h e last test is defined when ne + s >p. All these four tests are m e m b e r s of a class of invariant tests which is defined as follows. Let Y(1) = ( Y, .... , Y~), Y(2)= ( Y, +1 .... , Y~),

r(3) = ( Yr+l . . . . . ]1,).


A set of sufficient statistics is given by

(1.7)

(Y(1), Y(2), St ~- r(,) Y(1) -i- Y(3)Y(3) ).

(1.8)

S t is positive-definite when n e + s ~>p. Consider the following t r a n s f o r m a tion:

(L,A,B)(Yo),Y(:),S,)=(AYo)L,AY(2)+B,

AStA' ),

(1.9)

where L E 0 s = the class of all s s o r t h o g o n a l matrices, A ~ Ep = the class of all p p nonsingular matrices, B E 6"~p,r_ s = the class of all p (r - s) matrices. This t r a n s f o r m a t i o n keeps the a b o v e m o d e l a n d the testing p r o b l e m invariant. The composition of two such transformations is given b y

( LI,A 1,B1)( L2,A2, B2) = ( L2L1,A,A2,A ,92 + B1).

(1.10)

T h e n the collection G of ( L , A , B ) with the a b o v e binary o p e r a t i o n is a group of transformations acting on ~Lp,s X ~qLp,r_~ X Sp+, where S + -- the collection of all p p positive-definite matrices. Let (I)~ be the class of all n o n - r a n d o m i z e d tests invariant under G.

Monotonicity and unbiasedness properties of A N O VA and MA N O VA tests'

18 l

LEMMA. When ne + s > p , a set o f m a x i m a l invariants under G in the space o f sufficient statistics (Yo), Y(z),St ) is given by the ordered non-zero characteristic roots o f SoSt -1, denoted by d I < . . . < d t where l = m i n ( s , p ) . When n~ + s < p there is no non-trivial invariant test. Suppose ne ~>p and let c 1 >1 . . . >1ct be the ordered non-zero characteristic roots of SoSe-" 1. Then di= ci/(1 + c ) . Next we shall consider two important special cases. (1) s = 1, ne >~p. The acceptance regions of all the above four tests reduce to (1.11)

This is the U M P invariant test for its size. (II) p---1, ne I> 1. The acceptance regions of all the above four tests reduce to
n

y2/
a=l

E
a=r+l

Y~<k.

(1.12)

This is also the U M P invariant test for its size. Except for these two special cases U M P invariant test does not exist. All the above four tests are known to be admissible. Instead of comparing the power functions of different tests we shall be concerned in this paper with the behavior of the power function of a given test with respect to the non-centrality parameters involved; in particular we shall study whether the unbiasedness property is satisfied by a given test. Let r~ ..... "r 2 be the possible non-zero characteristic roots of E - I M M ', where M = (/h ..... /~). Then the power function of any test in q~a involves M and Y, only through ~'1 2..... ~2 . We shall study conditions under which the power function of an invariant test increases monotonically in each ~-~. Under some additional conditions we shall get a more refined property of this monotonicity.

2. Monotonicity of the power functions of the U M P invariant tests in the two special cases

The monotonicity property in the above two special cases can be easily proved using the following elementary result.

182

Somesh Das Gupta

THEOREM 2.1.

Let Z be a random variable distributed as N(O, 1). Then (2.1)

~r(z) ~ P {IZ + r I < k }

for k > 0 is a symmetric function of rr and decreases monotonically as ~2 increases. T h e theorem is p r o v e d easily by respect to "r. Later we shall show density of Z is s y m m e t r i c a b o u t the the origin). It will also be extended studying the first derivative of rr with that this result also holds w h e n the origin a n d u n i m o d a l (with the m o d e at to the multivariate case.

COROLLARY 2.1. Let Z 1 and Z 2 be independently distributed according to 2 distributions, the non-central chi-square X~,(r 2) and the central chi-square )G2 respectively; n I and n 2 are positive integers and "r2 is the non-eentrali(y parameter of Z 1. Then Pr[Z,/Z2<k ] (0<k) (2.2)

is a monotonically decreasing function of r 2. PROOF. Write Z I = Z121+ + Z 2 where the Zli are independently distributed as N ( - , 1) with Z n --'r a n d E Z I , - - 0 for a > 1; m o r e o v e r the Zig are distributed independently of Z 2. Such a decomposition of Z~ is clearly possible. N o w apply T h e o r e m 2.1 for Z u holding Z 2 a n d Zl~'s for a > 1 fixed. T h e a b o v e corollary is true also for non-integral positive n I and n 2. One m a y use the m o n o t o n e likelihood-ratio p r o p e r t y of the non-central F-distribution. Let us n o w consider the two special cases given by s = 1 a n d p = 1. CASE 1. S = 1, n e >>-p. The critical region of the Hotelling's T2-test can be expressed as Y;( Y(3) Y ' O ) ) - ' Yo) > ( p / ( n ~ - p + 1)} F~,,o_,, + l, (2.3)

where F~b is the u p p e r a-fractile of the F-distribution with a and b degrees

Monotonieity and unbiasednessproperties of ANO VA and MANO IrA tests

183

of freedom. The power of this test is


, e - - p+ l(z 2) > F t Pr[ Fp, p,,o-p+l ],

(2.4)

where ~.2=/~,lE- 1/~1.It follows from Corollary 2.1 that the power of this test increases monotonically with ~.2 CASE 2. p = 1, n e/> 1. The critical region of the ANOVA F-test can be expressed as
y2/
a=l

~
a=r+l

y 2 > ( S / n e } F s a , n.

(2.5)

The power of this test is Pr[ F,,,, (~2) > Fs~ne

],

(2.6)

where ~-2=5'.~=l/tz/z. Again, Corollary 2.1 shows that the power of this test increases monotonically with ~.2.

3.

Mathematical preliminaries

The key to all the results in this paper is the following well-known inequality due to Brunn-Minkowski. THEOREM 3.1.
Let A 1 and A z be two non-empty convex sets in R ~. Then

V1/n(A1 + A2) ~>V~/"( A l) + v2/n(A2), where V n stands for the n-dimensional volume, and A 1 + A 2= ( x l + x 2 : x l ~ A 1 , x 2 E A 2 ) .

(3.1)

This inequality was first proved by Brunn [5] in 1887 and the conditions for equality to hold were derived by Minkowski [26] in 1910. Later in 1935 Lusternik [25] generalized this result for non-empty arbitrary measurable sets A 1 and A z and derived conditions for equality to hold. This inequality led Anderson [1] to generalize Theorem 2.1 to the multivariate case. We shall present here a minor extension of Anderson's result. Following Anderson we shall call a non-negative function f on R n

184

S o m e s h D a s Gupta

unimodal, if KT,~=-- { x ~ R " : f (x) >:u)


(3.2)

is convex for all u, 0 < u < ~ . W e shall call a (real-valued) function f on R" centrally symmetric i f f ( x ) = f ( - x ) for all x E R n. THEOREM 3.2. Let G be a group of linear Lebesgue measure preserving transformations of R n onto R n. Let f be a non-negative (Borel-measurable) function on R ~ such that f is unimodal, integrable with respect to the Lebesgue measure t~ on R ~, and f ( x ) = f ( gx ) for all g ~ G, x E R n. Let E be a convex set in R ~ such that E = g E for all g in G. Then for any fixed "cERn and any ,r* in the convex-hull of the G-orbit of "r defined by G(,c)-~ { g~: g ~ G }

f
E+~'*

f(x)dx>~

f
E+'r

f(x)dx.

(3.3)

PROOF.

First note that

f
E +'r

f(x)dx=

~ll,n[Kf, uCl(E-t-'t')ldu,
0

(3.4)

where Ky,u is defined in (3.2). T h e n for g E G, Ky,u=gKy, u, and

~o[ K:,un (e +~)]= ~o[ gK:,ung(E+ ~)] = ~.[/<:,.n (e+g~)].

(3.5)

N o t e that K:,. N ( E + ~) and K:, u A ( E + g z ) are both either empty or non~__ m m ~. empty. Let gl ..... gm be in G a n d z - Y, i= iXigiT, where 0 < ~'i < 1, Y. i= l~ki 1. T h e n

K:,.n(E+:)D E x,[K:,.n(s+g~)],
i~l

(3.6)

whenever Ky,u f ) ( E + ~-) is non-empty. T h e o r e m 3.1 now yields

~o[/<:,.n(e+:)]> E x:'./(~:,un(E+g~)}
i=l

= btn[ Kf,. N ( E + ~') ]. Integrating with respect to u yields the theorem.

(3.7)

Monotonieity and unbiasedness proper:ies of A N O V A and M A N O VA tests

185

We shall improve Theorem 3.2 by using a condition on f which is stronger than unimodality. Following Das Gupta [11] we shall call a non-negative function f on R n 0-unimodal (or, strongly unimodal) if for any x 0, x~ in R n and any 0 < 0 < 1 f [ (1 -- O)xo + Ox, ] ) fl-O(xo) fO( Xl)" (3.8)

THEOREM 3.3. Let f be a non-negative O-unimodal (Borel-measurable) function on R n such that f is integrable with respect to tL,,. Then for any two Borel-measurable non-empty sets E o and E 1

f
(i -

f(x)dx>~

O)Eo + OE l

ff(x)dx
Eo

]0[

ff(x)dx.
EI

(3.9)

PROOF. For u ~ R ~ define C = ((x,u) E R ~ RI: f ( x ) >1e x p ( - u)}. Let Cu be the u-section of C. Then for any measurable set E c R n (3.10)

ff(x)dx=

_ ~ Fn[ 6 ~ N E ] e x p ( - u ) d u .

(3.11)

We assume that the integrals in the left-hand side of (3.9) are positive (excluding the trivial cases). Define

ho(u ) =/z,[ C, N ((1 -- O)E o + OE, )].

(3.12)

Let Si be the support of hi(i=O , 1). Then for UoE So, u I ~ Sl,u = ( 1 - O)uo+

OuI ho(u ) >1[ ho(uo) ]l-[ h,(uO ] .


To see this, note that (3.13)

CuN((1-O)Eo+OE1)D(l-O)(6uoNEo)+O(Cu, NE1).
From Brunn-Minkowski-Lusternik inequality we get
~1/n[ Cu (--] ( ( 1 - -

(3.14)

O)Eo+ OEl) ] >/ ~> (1 - o) ~n ~/ (Cuo n e0) +

O~n/"(C.,n eO.
(3.15)

186

Somesh Das Gupta

Applying the arithmetic-mean geometric-mean inequality we finally get

Iz,[ C, N { ( 1 - O)Eo + OE, } ]~>[t~,( C, oN Eo) ] '-[ t~( C, N E1) ] .


(3.16) Multiplying both the sides by

exp(-u)=exp[(l--0)u0] exp[Ou,]
we get (3.13). The following lemma will now yield the theorem.

(3.iv)

LEMMA 3.3.1. Let go and gl be non-negative (Borel-measurable) integrable functions on R 1 with non-empty supports given by S o and S1, respectively. Let g be a non-negative Borel-measurable integrablefunetion on R 1 such that for 0 < 0 < 1, x =(1 -O)xo + Ox~,x~~_S~

g(x) >~g~-(xo)g(x,). Then

(3.18)

f g(x)dx> So fgo(x)dx]'fg,(x)dx. (1-O)So+OS1 J S~

o[

(3.19)

PROOF. First we shall assume that gi's are bounded. Let ei be the supremum of g;. c;'s are assumed to be positive (excluding the trivial case). Define
A i= {x* = ( x , z ) ~ R 2: gi(x) >ciz,z > O , x ~ Si} ,

(3.20)

i = 0, 1, and A = {x* = (x,z) ~R2: g(x) >ze~-c,z > 0 , x E(1 - O)So+ 0S1}. (3.21) Let Ai(z ) and A(z) be the z-sections of Ai and A, respectively. For 0 < z < 1 both Ao(z ) and A l(z) are non-empty, and

A(z) D (1 -- O)Ao(z ) + OAl(Z).

(3.22)

Monotonieity and unbiasedness properties of A N O V A and M A N O V A tests

187

Moreover,

gi(x) dx = c, f/~,(Ai(z)) dz.


--oo 0

(3.23)

We may assume that tile integrals in the left-hand side of (3.19) are positive, the result is trivial otherwise.

f g(x)dx)c~-clfo'Pq(A(z))dz. O - O)So+ OS~


By the one-dimensional Brunn-Minkowski-Lusternik inequality ~I(A (z)) i> (1 -- O)~,(Ao(z)) + Ot~,(Al(Z)), for 0 < z < 1. Now it follows that

(3.24)

(3.25)

f
(1--O)So+OSI

g(x)dx>~c~-c ( 1 - 0 ) c o I f

go(x)dx+OCl I
--~

gl(x)dx

>1 _ ~ go(x) dx
In the general case, define

_ ~ g,(x) dx

(3.26)

gik(X)

= I gi(x) Ik

if if

gi(x) <k, g~(x) >k.

(3.27)

Then gik(x)~gi(x) as k---~oc. Now apply the above result to gik's and appeal to the monotone convergence theorem. THEOREM 3.4. Let f be a function on R n satisfying the conditions in Theorem 3.3. Let E be a convex set in R n, and for I"ER ~ define

h(,)= f f(x)dx
E+~-

(3.28)

Then h is a O-unimodal funetion on R", i.e.


h[ (1 - 0)% + OT, ] >1h 1--(%)h(~c O (3.29)

for 0 < 0 < 1, t i E R n.

188

SomeshDosGupta
%, E I = E + ~ l, and note that

PROOF. Apply Theorem 3.3 with E o = E + (1 - O ) E o + OE l = E + [(1 - 0 ) % + 0q'l].

COROLLARY 3.4.1.

Define h as in Theorem 3.4. Suppose h(~'m) = h('r)

h(,rl) . . . . .

(3.30)

for ~i' s and in R". Then

h( i~_l Xi'ri) >h('r)


for 0 < ~ i < 1,
' ~ ,m i=1~/=

(3.31)

1.

4,

S t u d y on m o n o t o n i c i t y in t h e general case

For studying tests in D o we shall reduce the problem further. Recall that T~..... z~ are the l largest characteristic roots of Z - ~ M M ' . It is possible to write E - M = QA(~')L', where Q :p x p and L : s s are orthogonal matrices, and A*O") = diagOh ..... Tl), (4.1)

~ ' = ( ' I ..... ~9'.


Define
A=Q'E-, U=AY(oL, V=AY(3).

(4.2)

(4.3)

Then the columns of U and V are independently distributed as @Lp(.,Ip), and E U = A 0 - ) , E V = 0 . Note that the nonzero characteristic roots of ( U U ' ) ( U U ' + V V ' ) - 1 are the same as those of SoS t- 1. This shows that the power function of any test in D o depends on E , M only through ~-. We shall now write S o = UU', S e = V V ' , S t = S o + S e. F o r a non-randomized test % let A~o be its acceptance region. We shall first consider acceptance regions in the space of U and V. The power function of a test p is

EM,xvP(U, V) = PM,x[ ( U, V) ~A~].

(4.4)

Monotonicity and unbiasednessproperties of ANOVA and MANOVA tests

189

For q~@q~G the power function of q9 will be denoted by ~r0-; q0). Given -r/2's and the structure of h in (4.2) the diagonal elements of A in (4.2) are not uniquely defined. In particular, by choosing Q and L appropriately it is possible to write in (4.2) A = A(De'r ), as well as, A = A(F~-), where D e is an l 1 diagonal matrix w i t h diagonal elements as 1, and F is an l x l orthogonal permutation matrix, i.e. F~-=(~i,, .... ~,)' for some permutation (i 1..... it) of (1 ..... l). Hence for ep~q) c ~r(~,; ~) = 7 r ( D j ; cp) = ~r(F~-; q~) (4.5)

for any such matrices D e and r and for all ~ E Rt. Let U,. be the i th column vector of U and ~.~(i) be the matrix U with U~ deleted. For a region A in (U, V) space, let A(~J (i), V) be the section of A in the U~-space, i.e. A(~t(i),v) = { u i E R P : ( u , v ) C A }. For any test cpE qb~ and all ti(0 and v (4.6)

A~( a (i), v) = - A~( a u), v),


and for all v A~(v)= - A ~ ( v ) ,

(4.7)

(4.8)

where A r ( v ) is the section of Ar in the u-space. Later we shall require A~0 to be a region in the space of (U, VV'), or in the space of (U, UU' + VV'). For that purpose we denote the acceptance region of ~ as A~ to mean that it is a region in ~p,~ X S7 . Next we shall introduce four subclasses of c as follows: (1) O~ ) is the set of all ~ E O G such that the acceptance region A~o (in the space of U and V) is convex in the space of each column vector of U for each set of fixed values of V and of the other column vectors of U, i.e. for every i and all fi(0 and v the set Ar(~(i),v) is convex. (2) ~ ) is the set of all ~ ~ c such that the acceptance region A~ is convex in the space of U for each set of fixed value of V. (3) ~ ) is the set of all ~ E c such that the acceptance region A~ (in the space of (U, VV')) is convex in U and VV'. (4) ~ ) is the set of all ~ ~ ~ a such that the acceptance region A cp (in the space of ( U, St = UU' + VV')) is convex in U and S t. Note that ~b~)D 3) D O~ ). THEOREM 4.1. For ep E 0(~ ) the power function of ep given by 7r0",cp) is a symmetric function in each "r i and monotonically increases as each I il increases separately.

190

Somesh Das Gupta

PROOF.

T h e first part of the t h e o r e m follows f r o m (4.5). F o r i = 1. . . . . l

f(ui)du,,

(4.9)

where f is the p.d.f, corresponding the %p(0,Ip) a n d ei is the vector in R p with 1 at the i th position and the other c o m p o n e n t s being 0. N o w we shall use T h e o r e m 3.2. N o t e that the density function f is u n i m o d a l and centrally symmetric. A~(u(O,v) is convex and centrally symmetric. Specialize G in T h e o r e m 3.2 to be the group of sign transformations on R p. N o t e that the distribution of U ( and V is free f r o m 5. Hence

p[ Ui C A~( 5(i), v) + ~,riei I ~(i)= 5(0, V= 5] = = P[ U,. EAr(5(i),v)+ (1 +Xi)'riei/2-(1 -)ti)q'iei/21Ui = l~li, V "-~-"l)] >~P[ Ui~A~(f(O,v)+'se, ll~,.=fi, V = v ],
(4.10) where - 1 ~)~ ~ 1 and the conditional p.d.f, of U i is taken as jr. T a k i n g expectation with respect to U/ a n d V we find that 7r(T; 99) increases if ~-~ is replaced by ~ ' i , where - 1 ~<)ti ~< 1, holding the other c o m p o n e n t s of ~fixed. Since f is also 0-unimodal the result would also follow f r o m Corollary 3.4.1. In the above t h e o r e m we need only n e + s > p . COROLLARY 4.1.1. If 99E0(~ ) the power function of 99 is a symmetric function in each ~'i and increases monotonically in each ],ri]. PROOF. Simply note that

O~)CO~).

Let H be the group of t r a n s f o r m a t i o n s acting on R ~ defined as follows, F o r ~'ERt, h ~ H


hq- = ( e l - f / l , . . . .

el'fit),

(4.11)

where ei= _ 1 and (i 1. . . . . iz) is a p e r m u t a t i o n of (1 ..... l). THEOREM 4.2.

If 99E O~ ), and "rE R z

99)

99),

(4.12)

Monotonicity and unbiasednessproperties of ANO VA and MAN 0 VA tests

191

where .c* is any point in the convex-hull of the H-orbit of % provided ne>~p+ 1.
PRoov. T h e joint density Po of U and Se = VV' under H o is O-unimodal when ne>>-p+l. F o r h E H , ' r ~ R t ~r(hz; qo) = ~r(,r; qo). F o r hi E H and 0 < h i < 1, Y'~Xi = 1 (4.13)

~kiA(hir)=A ( ~ Xihir ).
i=l i=l

(4.14)

Moreover

P~[(U, Se)~A~IHI] =P[(U+A('r),Se)~A~IHo].


T h e t h e o r e m now follows f r o m Corollary 3.4.1. T h e o r e m 3.4 also yields the following.

(4.15)

If q) ~ ( ~ ) tke power function of p given by ~r(l-; p) is a O-unimodal.function of ,r, provided ne >~p+ 1.


COROLLARY 4.2.1. The joint density Po of U a n d S e under H o is given by

pO(U,Se) = C e x p ( - tr(s e + uu'))[det(se)](n,--p

1)/2,

Se ~ S ;

The following facts show that Po is a 0 - u n i m o d a l function when n e ~>p + 1 (i) If A o and A 1 are p p positive-definite matrices det((1 - 0)A o + OA 1) ) (detAo) ~- ( detA 1) o, (4.17)

for 0 < 0 < 1. (ii) Let U (), U ( be elements in 9gp, s and U=(1-O)U()+OU ) for 0 < 0 < 1. Then (1
- O) U () U ()' + O U (1) U ) ' = = UU'+

(1 - 0 ) 0 ( U () -- U(1))( U (0) - v(l)) '. (4.18)

192

Somesh Das Gupta

(iii) If A o and A 1 are non-negative definite p p matrices det(A o + A l) > det(Ao) + det(A l). (4.19)

Next we shall study the four standard invariant tests given in Section 1o
THEOREM 4.3.

The likelihood-ratio test is in flP(~). Roy's maximum root test is in ~P(3a). Lawley-Hotelling's trace test is in apt). Bartlett-Nanda-Pillai's trace test is in dp~). (e) Bartlett-Nanda-Pillai's trace test is in dp~) if and only if the cut-off point k 4 ~ max(1,p - G).
(a) (b) (c) (d) PROOF. (a) Let W~=(/.~(0, V) then the acceptance region of the likelihood-ratio test can easily be expressed as 1 + U/(W/W/')-' U/-<<(det VV')/kdet(WiWi' ), which is clearly convex in Ui for fixed W,. (b) Note that (4.20)

maxch[(UU')Se -1] < k 2 = ["] [(U, Se): a'UU'a<<.kza'Sea ].


a ER p

(4.21) It follows from (4.18) that the region a' UU'a <kza'Sea is convex in (U, S). (c) For a matrix B E ~?,,.

tr(SjB)'( S : ~U) < [ tr(B'SeB)tr( V'Se-lU) ] < ()tr(B'SeB+ U'Se-'U ).


Hence (4.22)

tr(B'U)-tr(B'SeB)<~t1 rU

S e--1 U,

(4.23)

the equality is attained when B = S e 1 g . Hence the region in (U, Se) given by tr(UU')Se -I < k 3 is the intersection of the regions tr(B' U ) - tr (B'S~B) < k3 for B E 91L?. s. However, each such region (4.24) is convex in (U, Se). (d) The proof is the same as in (c). (4.24)

Monotonicity and unbiasedness properties of ANOVA and MANOVA tests

193

(e) The proof of this result is rather involved and we refer to [29]. Note however that tables for k 4 are partially available and even then they were obtained when n~ ~>p. Examples of other tests in qs(~) ( i = 1,2,3,4) are given in [6,27, 17,36, !5]. A step-down test of H 0 vs. H 1 is given in [32]; however, this test is not in ~ a . This test can easily be shown to be unbiased since it is given in terms of F tests. Only partial results are known for the monotonicity property of this test; see [7] and [10]. For the case p = 1 the power function of the F-test increases monotonically in ne and decreases in s when the other parameters are held fixed. For s = 1, the power of the Hotelling's TZ-test increases if ne increases, or if p decreases when the other parameters are held fixed. The proofs of these two results are given in [10]. Similar results for the general case are only known in very special situations; see [9, 10].

5.

General MANOVA models

The general M A N O V A model introduced by Potthoff and Roy [30] m a y be described as follows: Let X : p n be a r a n d o m matrix such that its column vectors are independently distributed as Np(., E) with an unknown positive-definite matrix Y,; moreover EX'=A~A2, where A l : n r n is a known matrix of rank r, A 2 : q p is a known matrix of rank q, and O : m q is a matrix of unknown parameters. The problem is to test H0: A3OA4=0 against H i : A3OA4=/=0, where A 3 ~ ) A 4 is bilinearly estimable, and A 3 : s m and A 4 : q v are known matrices of ranks s and v, respectively. This problem can be reduced to the following canonical form: Let

Yll
Y= Y21

Yl2
Y22

YI3 ] q - v
Y23 J v (5.1)

Y31 Y32 Y33 P - q s r-s n-r


be a r a n d o m matrix such that its column vectors are independently distributed as Np(., Z), and

E Y=

M21
0

M22
0

(5.2)

194

SomeshDasGupta

The problem is to test H0: Mal = 0 against M21 =/=0. Let us partition Z as in the above. Nil ~21 Z31 12 Z22 ~32 El3 Z23 Z33

Z=

(5.3)

A class of tests invariant under a certain group of transformations which keeps the problem invariant is obtained by Gleser and Olkin [18]. However, this problem is generally viewed in the conditional set-up described below. The column vectors of Yll 1121 YI2 Y22 YI3 ] Y23

Y=

(5.4)

are conditionally independently distributed as Nq(-,52), given Y31, Y32 and Y33; 52 is the covariance matrix of the first q components given the last p - q components derived from >I,. The conditional expectation of I7 is

M21

M22

0 l +Z[

r;2 Y3 I,

(5.5)

where fl is the matrix of regression coefficients. In this conditional set-up the s.p. matrices due to error and the hypothesis H 0 are respectively defined by (assuming n - r >p - q)

Se =

Y23 Y23 - Y23 Y33( Y33 Y33) - 1 Y33 Y23 ^, Y3,) --1 M21,

(5.6) (5.7)

So= )l)121(Is + Y;,(Y33Y33) , , -1


where

A~21 = Y21 -- )I23 Y33(' Y33 Y33)' - 1 Y31.

(5.8)

In the conditional situations Se and S O are independently distributed as the Wishart distributions qffv(n - r - p + q, 51.22.3) and 6~v(s, Z22.3; 7~), respectively, where 51.22. 3 is the covariance matrix of the second set (of v) components given the third set of ( p - q) components, and
A=M21(I~+

Y31(Y33Y33) ' '

Iy31) -1 M z' v

(5.9)

Monotonicity and unbiasedness properties of A N O VA and M A N O V A tests

195

As in the M A N O V A one might consider those tests which depend only on the characteristic roots of SoS e- l. In particular, the acceptance region of the likelihood-ratio test is given by ISel/ISo + Sel >Ik. The column vectors of (Y31Y33) are independently distributed as Np_q(O, N33). It is clear that the distribution of Y;l(Y33 Y;3)-1y31 does not depend on Z33 and we shall assume it to be lp_ q. Also for considering the distribution of the roots of SoS e- l we might take N22.3 = I v and replace 3421 1 ! by Y~22.~M21.AS in the M A N O V A case, we can replace Y~2273M21 by a matrix A : v x s such that A = [ diag ('rl 0..... TI)

0 ]

(5.10)
of

where l = m i n ( v , s ) and ~'~(~-i> 0 ) are the characteristic roots M '21"~"22.3 ~ ' - l M 21" This discussion leads us to take/~ as

Y ,(Y33 r;3)-' y3,)-'A'.

(5.11)

Arguing as in Anderson and Das Gupta [3] we see that the characteristic roots of 7~ increase if any ~'i is increased. Thus Theorem 4.1 in the M A N O V A case can be applied now. 6. Bibliographical notes

On Section 1. For a general discussion of M A N O V A see Anderson [2], Roy [33], and Lehmann [23]. On Section 2. See Roy [33]. On Section 3. A proof of Theorem 3.1 is given in Bonneson and Fenchel [4]. For Lusternik's generalization of Theorem 3.1 see Hadwiger and Ohman [19] or Henstock and Macbeath [20]. Theorem 3.2 was proved by Anderson [1] when G is the group of sign transformations. Essentially the same proof also holds for any G defined in Theorem 3.2; the general statement is due to Mudholkar [28]. For further generalizations of this theorem see Das Gupta [11]. Theorem 3.3 was proved by Prekopa [31] and Leindler [24] (for n = 1); however, their proofs are quite obscure and somewhat incomplete. The present proof uses essentially the ideas given by Henstock and Macbeath [20]; see Das Gupta [13] for more general results. Theorem 3.4 was proved by Ibragimov [21] and Schoenberg [35] when n = 1; the general case was proved by Davidovic, Korenbljum and Hacet [14]. For a discussion of these results see Das Gupta [13].

196

Somesh Das Gupta

On Section 4. Theorem 4.1 is due to Das Gupta, Anderson and Mudholkar [6] where the monotonicity property of the power functions of tests (a), (b), and (c) are established. Roy and Mikhail [34] also proved the monotonicity property of the maximum root test. Srivastava [37] derived the result for tests (a)-(c) although his proofs are incomplete. The present proof of Theorem 4.2 is due to Das Gupta [12]; an alternative proof using Theorem 3.2 is given by Eaton and Perlman [15]. On Seetion 5. See Fujikoshi [16] and Khatri [22].

7.

S o m e new results

Using a Theorem of Holly-Preston-Kemperman (see Kemperman, J. H. B. (1977). On the FKG-inequality for measures on a partially ordered space. Indag. Math. 39, 313-331), the following result was proved by Olkin and Perlman (Tech. Report 70, Dept. of Statistics, University of Chicago): For the MANOVA problem any test with the acceptance region of the form g(d~ ..... dr) < c is strictly unbiased if g is nondecreasing in each argument. A similar result also holds for the general MANOVA problems.

References
[1] Anderson, T. W. (1955). The integral of a symmetric unimodal function over a symmetric convex set and some probability inequalities. Proc. Amer. Math. Soc. 6, 170-176. [2] Anderson, T. W. (1958). An Introduction to Multivariate Statistical Analysis. Wiley, New York. [3] Anderson, T. W. and Das Gupta, S. (1964). Monotonicity of the power functions of some tests of independence between two sets of variates. Ann. Math. Statist. 35, 206-208. [4] Bonneson, T. and .F.enchel, W. (1948). Konvexe K~rper. Chelsea, New York. [5] Brunn, H. (1887). Uber Ovale und Eiflachen. Inaugural dissertation, Mfinchen. [6] Das Gupta, S., Anderson, T. W., and Mudholkar, G. S. (1964). Monotonicity of the power functions of some tests of the multivariate linear hypothesis. Ann. Math. Statist. 35, 200-205. [7] Das Gupta, S. (1970). Step-down multiple-decision rules. Essays in Probability and Statistics., Univ. of North Carolina Press, Chapel Hill. [8] Das Gupta, S. (1972). Noncentral matrix-variate beta distribution and Wilks' U-distri.. bution. Sankhy~ Ser. A, 34, 357-362. [9] Das Gupta, S. and Perlman, M. D. 0973). On the power of Wilks' U-test for MANOVA. J. Multivar. Anal. 3, 220-225. [10] Das Gupta, S. and Perlrnan, M. D. (1974). Power of the noncentral F-test: Effect of additional variates on Hotelling's T2-test. J. Amer. Statist. Assoc. 69, 174-180. Ill] Das Gupta, S. (1976). A generalization of Anderson's theorem on unimodal functions. Proc. Amer. Math. Soc. 60, 85-91.

Monotonicity and unbiasedness properties of A N O V A and M A N O VA tests

197

[12] Das Gupta, S. (1977). s-unimodal functions: related inequalities and statistical apphcations. Sankhy~, Ser. B. [13] Das Gupta, S. (1978). Brunn-Minkowski inequality and its aftermath. Tech. Report 310, School of Statistics, University of Minnesota. [14] Davidovic, Ju. S., Korenbljum, B. I. and Hacet, B. I. (1962). A property of logarithrnically concave functions. Soviet. Math. Dokl., 10 (2) 477-480. [15] Eaton, M. and Perlman, M. D. (1974). A monotonicity property of the power functions of some invariant tests. Ann. Statist. 2, 1022-1028. [16] Fujikoshi, Y. (1973). Monotonicity of the power functions of some tests in general MANOVA models. Ann. Statist. 1, 388-391. [17] Ghosh, M. N. (1964). On the admissibility of some tests of MANOVA. Ann. Math. Statist. 35, 789-794. [18] Gleser, L. and Olkin, I. (1970). Linear model in multivariate analysis. Essays in Probability and Statistics. Univ. of North Carolina Press, Chapel Hill. [19] Hadwiger, H. and Ohman, D. (1956). Brunn-Minkowskischer Satz und Isoperimetrie. Math. Zeit. 66, 1-8. [20] Henstock, R. and Macbeath, A. M. (1953). On the measure of sum sets I: The theorem of Brunn, Minkowski and Lusternik. Proc. Lond. Math. Soc. 3, 182-194. [21] Ibragimov, I. A. (1956). On the composition of unimodal distributions. Theor. Prob. Appl. (Translation) 1,255-266. [22] Khatri, C. G. (1966). A note on MANOVA model applied to problems in growth curves. Ann. Inst. Math. Statist. 18, 75-86. [23] Lehmann, E. L. (1959). Testing Statistical ttypotheses. Wiley, New York. [24] Leindler, L. (1972). On a certain converse of Holder's inequality II. Acta Scient. Mat. 33, 217-223. [25] Lusternik, L. (1925). Die Brunn-Minkowskische Ungleischung fur Beliebge Nessabare Mengen. Comptes Rendus ( Doklady) de l'Academie des Sciences de l" U R S S , 3 (8) 55-58. [26] Minkowski, H. (1910). Geometrie der Zahlen. Keipzig and Berlin. [27] Mudholkar, G. S. (1965). A class of tests with monotone power functions for two problems in multivariate statistical analysis Ann. Math. Statist. 36, 1794-1801. [28] Mudholkar, G. S. (1966). The integral of an invariant unimodal function over an invariant convex s e t - - a n inequality and applications. Proc. Amer. Math. Soc. 17, 1327-1333. [29] Perlman, M. D. (1974). Monotonicity of the power function of Pillai's trace test. J. Multivar. Anal. 4, 22-30. [30] Potthoff, R. F. and Roy, S. N. (1964). A generalized multivariate analysis of variance model useful especially for growth curve problems. Biometrika 51, 313-326. [31] Prekopa, A. (1973). On logarithmic concave measures and functions. Acta Scient. Mat. 34, 335-343. [32] Roy, J. (1958). Step-down procedure in multivariate analysis. Ann. Math. Statist. 29, 1177-1187. [33] Roy, S. N. (1957). Some Aspects of Multivariate Analysis. Wiley, New York. [34] Roy, S. N. and Mikhail, W. F. (1961). On the monotonic character of the power functions of two multivariate tests. Ann. Math. Statist. 32, 1145-1151. [35] Sehoenberg, I. J. (1951). On Polya frequency functions I: the totally positive functions and their Laplace transforms. [36] Schwartz, R. E. (1967). Admissible tests in multivariate analysis of variance. Ann. Math. Statist. 38, 698-710. [37] Srivastava, J. N. (1964). On the monotonicity property of the three multivariate tests for Multivariate Analysis of Variance. J. Roy. Statist. Ser. B. 26, 77-81.

P. R. Krishnaiah, ed., Handbook of Statistics, Vol. 1 North-Holland Publishing Company (1980) 199-236

Robustness of A N O V A and M A N O V A Test Procedures


P. K. I t o

1.

Introduction

A statistical hypothesis in the univariate analysis of variance (ANOVA) and the multivariate analysis of variance ( M A N O V A ) is usually tested on the assumption that the observations are (i) independently and (ii) normally distributed (iii) with a c o m m o n variance or variance-covariance (var-covar) matrix. A desirable characteristic of a test is that while it is powerful, i.e., sensitive to changes in the specified factors under test, it is robust, i.e.. insensitive to changes in extraneous factors not under test. Specifically, a test is called robust when its significance level (Type-! error probability) and power (one minus Type-II error probability) are insensitive to departures from the assumptions on which it is derived. As noted by Scheff6 [38], a study of the robustness properties of a test cannot be exhaustive, for one reason, because the assumptions as stated above can be violated in m a n y more ways than they can be satisfied. In what follows, we shall examine the magnitudes of the effects on A N O V A and M A N O V A test procedures when (i) the observations, while remaining independent of one another, are not normally distributed and (ii) the observations, while remaining independent of one another, are normally but heteroscedastically distributed, i.e., with different variances or varcovar matrices. In most of cases we shall treat violations of the assumptions of normality and of homoscedasticity one at a time. There have been some studies of the robustness of test procedures for multi-way classification, fixed-effects or random-effects A N O V A models, but we shall restrict ourselves in what follows to the one-way classification, fixed-effects A N O V A and M A N O V A models, partly because we shall not be able to treat all the basic designs under all the models within a given space limit and partly because studies of the robustness properties of M A N O V A tests have so far been confined to the one-way classification, fixed-effects model. As for the robustness properties of test procedures for 199

200

P. K. lto

multi-way, fixed-effects or random-effects models, the reader is referred to Box [5], Scheff6 [38], Kendall a n d Stuart [23], etc. The review of works in the study of the robustness properties of A N O V A and M A N O V A test procedures which is going to be given in the following sections is not intended to be exhaustive, and the choice of works is rather subjective on the part of the present author. It is also remarked as done by Scheff6 that standards of rigor possible in deducing a mathematical theory from certain assumptions generally cannot be maintained in the study of robustness properties of test procedures and that some of the conclusions are inductions from rather small numerical tables. Thus, the works which are going to be reviewed in the following sections to explain the present state of affairs of this important field of the theory of statistics are accompanied in most of cases with numerical tables, from which certain conclusions are derived.

2.

One-way classification, fixed-effects ANOVA and MANOVA models

Suppose that we have N= Y~= 1Nt observations classified into k groups. The a-th observation in the t-th group is a p 1 column vector x,~ which is assumed to be expressed as follows: xt~ = p + % + e,~, (2.1)

where t = 1,2 ..... k; a = 1,2 ..... N , ; / , and a, a r e p 1 constant vectors such that Y.~=xNtat = 0, and the p 1 r a n d o m error vector: e't~ = (el,e,

e2t...... em,~)

is assumed to be distributed according to a p-variate distribution which m a y be characterized by as m a n y moments as desired. Specifically, the first four moments of eta are given b y E(et~) = 0,
E(e,,;e't,~)= X t,

(2.2)

(2.3)
(2.4) (2.5)

E(ei,,:ej,,,'e,,~)

= ~,~'?,
If,(ijt.l)m -~ 0 ~ ) ' 0 } 2 "t- O}/)'Uj(.2"t- 0}2" Oj(-/t),

e( eit,.ejt~.e,t~.em,~) =

where the prime denotes the transpose of a vector (or a matrix); ]2t = (a~:)) is a p p positive definite, symmetric matrix called the var-covar matrix of et~ whose elements are assumed to be finite; ~(~2and ~(0:),-are the third and

Robustness of A N O VA and M A N O VA test procedures

201

fourth cumulants of every kind of e,~, all assumed to be finite for i,j,l, m = 1,2 ..... p, which express the skewness and kurtosis of eta, respectively (see [22], p. 319). We shall also assume (i) that any two observations belonging to two different groups are always independent a n d (ii) that for a fixed t, the vectors xt,, a = 1,2 ..... N t, are independently and identically distributed. If the error term et, follows a p-variate normal distribution, then not only all x~) and x ~ but also all cumulants of higher order vanish. In M A N O V A we are interested in testing the null hypothesis: Ho(P, k): a~ = a 2 . . . . . ak = O (2.6)

against the whole class of alternatives: H ~ ( p , k ) : violation of at least one of the equalities in (2.6). (2.7) In the usual situation test criteria are derived on the additional assumptions of normality and homoscedasticity of the error terms, i.e., the et, are assumed to be distributed according to N(0,Z), where Z = S z 1. . . . . Zk" Let s = m i n ( p , k - 1). Then it has been found that if s = 1 there exists the uniformly most powerful invariant test, while there exists no such test if s>l. 2.1. The case when s = 1 ( A N O V A and a special case of M A N O V A )

When p = 1 and k/> 2, we have a general A N O V A case, and to test the null hypothesis Ho(l,k): a l = a : . . . . . a k = 0 , the variance-ratio F-test was proposed, where F is defined by F= with QB/(kQw/(U-k)
k

1) '
k

(2.8) 2,
Q w = E lit $2, t~l k

Q.= E

t=l Nt

E
k

E r,Y,,
t~l

rt=N,/N,

N - - E Nt,
t=l

N, s2t = Z ( x t . - ~ ) 2 / n , ,
a=l

nt=Nt-1.

QB and Q w are called the "between-groups" and "within-groups" sums of squares, respectively. Under the assumptions of normality and homoscedasticity, F is known to be distributed according to the F-distribution

202

P. K. Ito

with ( k - l ) and ( N - k ) degrees of freedom when H0(1,k ) is true and according to the noncentral F-distribution with (k - 1) and ( N - k) degrees of freedom and the noncentrality parameter 82 when Ho(1,k ) is not true, where 82 = Zt= k ~Nta,/o 2 2. It has been found that this test is uniformly most powerful which is invariant with respect to linear transformations (see, e.g., [38] pp. 46-51). When k = 2 , we have a special case of A N O V A with two groups, and the F statistic of (2.8) is reduced to

t 2=

N'N2
N 1+ N 2

(x'-22)2
S2 '

(2.9)

where s2=(njs2+ n2sZ)/(nl + n2), and t is known to be distributed according to the Student t-distribution with (n I + n2) degrees of freedom when //0(1,2) is true and according to the noncentral t-distribution with (nt + hE) degrees of freedom and the noncentrality parameter 62 when Ho(1,2) is not true, where 8 2 = ( N l N z / ( N l + N2))(O~l--Ot2)2/o 2. T h i s t-test also has the feature of uniformly greatest power. When p > 1 and k = 2, we have the case of a special M A N O V A with two groups, and to test the null hypothesis Ho(P,2): a l - - a 2 = 0 , Hotelling's generalized Student T2-test was shown to be uniformly most powerful which is invariant with respect to affine transformations (see, e.g. [1] pp. 115-118). The T 2 statistic is defined by

T2 = N'N~2 (Y., - x2)'S-1(~, _ x2),


N 1+ N 2 where

(2.10)

Nt xt = E x . J N , ,
o~=l

Nt
S,= E
a=l

(x,,~-x,)(x,,~-xt)'/n,,

S = ( n l S 1 + n 2 S 2 ) / ( H 1 -I- H2).

It is clear that T 2 is a p-variate generalization of Student t defined in (2.9). The exact forms of the central and noncentral distributions of T 2 are known so that the exact significance level and power of the T2-test may be evaluated. The noncentrality parameter in this case is 82= {N1N2/(N1 + N2) } (al - ~3~2)'~]- 1(131 52). Thus, in the case of s = 1, we shall not be able to do better than use the F-test in A N O V A and the T2-test in M A N O V A with two groups if the assumptions of normality a n d homoscedasticity are satisfied. W h e n the assumptions are violated, one is naturally interested in the magnitudes of the effects of violations on the significance levels and powers of the F-test
-

Robustness o f A N O V A and M A N O V A

test procedures

203

and the T2-test, and if the effects are found to be rather serious, then one will try to modify the tests to make them robust in the sense that their actual levels of significance are at least as close to the specified level as possible.

2.2.

The case when s > l (MANOVA)

When p > 1 and k > 2 , we have a general M A N O V A case where no invariant test has the feature of uniformly greatest power to recommend it. Let Qn and Qw be the p p sums of squares and cross-products matrices for "between-groups" and "within-groups", respectively, where
k

Qs = E ~(xt-i)(~,-K)',
t=l

N,

Qw = ~ ntS,,
t=l

Y,t = ~ x , , / N , ,
a=l

~= ~, rtg~,,
t=l

Nt

St=
a=l

(xt,~-Y~t)(xt~-Y~t)'/n t,

nt=Nt-1,

t = l , 2 ..... k.

For the sake of simplicity, it is assumed that p < k - 1 and N - k , unless otherwise stated. It has been found that all invariant test criteria for testing Ho(p,k ) are functions of the characteristic roots of QAQw 1, denoted collectively by c(QBQwl), or individually by 0 < c 1< c 2 < - - . <cp, but the choice of a particular function is a matter of debate. So far there have been proposed, among others, the following six test criteria of the largest root, trace and determinantal types: Roy's largest root [36]:

R = Cp/(1 + cp),
Lawley-Hotelling test statistic [18, 28]:
P

(2.11)

T= T~/(N-K)=tr(QsQw~)
Wilks' likelihood ratio [44]:
P

= ~, ci,
i=l

(2.12)

W = I Q w I / I Q ~ Q w l = 1~ l / ( l + c , ) .
i=1

(2.13)

204

P. K. Bo

Multivariate beta trace [3, 30, 33]:


P

v=trQB(Qs+Qw)
P

-1= ~
i=l

Ci/(l+ci),

(2.14) (2.15) (2.16)

U= ]-[ ci/(l+ci),
i~l P

(e.g.,see [37])

S = I-[ c,
i=l

(e.g., see [31])

It is easily seen that when p = 1, all these criteria are reduced to the A N O V A F criterion. In addition, there is a test criterion T~ proposed by James [21], which is a multivariate extension of Welch's v criterion and is defined by

E
t=l

(2.17)
-1, x=W k 1 E (Wt'K,), t=l k W~-" E Wtt=l

with
Wt=(St/Nt)

Some other test statistics considered in the literature are cp/c I and max(c/+l/c/) by Krishnaiah a n d Waikar [26, 27]. Also, J. Roy [35] proposed Step-down Procedure whereas Krishnaiah [25] proposed Finite Intersection Tests for the M A N O V A problem; these procedures are based upon conditional distributions. However, we do not discuss these procedures in this paper. There have been numerous works to obtain exact or asymptotic forms of the central and noncentral distributions of these test statistics so that exact or approximate significance levels and powers of the tests for Ho( p, k) by means of these statistics m a y be obtained. While no M A N O V A test is uniformly most powerful, comparisons of the powers of these tests to detect the differences a m o n g group mean vectors can be of use in selecting a test for a particular situation. The general conclusion to be drawn from comparative power studies of M A N O V A tests (see, e.g., [31]), is that the R-test has greater power than the others when the differences a m o n g group mean vectors are concentrated in one canonical dimension (i.e., when all but one of the c(A) are zero, where A=~tffi|NtoztOlt k , ~ - 1, while the other tests have greater power than the R-test when the noncentrality is not so heavily concentrated in a single root (the overall noncentrality in the case of M A N O V A is expressed by the sum of c(A), i.e., 62=trA). This result is not surprising, since R is the only criterion which makes use of only one sample root. The T-, W- and V-tests are asymptotically equivalent for very

Robustness of A N O VA and M A N 0 VA test procedures

205

large samples, but some results are available to distinguish among the criteria for small to moderately large samples. These results indicate that the tests are ordered R >/T/> W/> V/> U in terms of their powers when there is just one nonzero e(Z~), but that the ordering is reversed (U/> V >/W T ~ R) in the nonlinear case, with the exception that the V-test tends to have higher power than the U-test when the number of degrees of freedom of Qw, N - k, is greater than 10. Studies of comparative powers of the tests thus give some indication of the sorts of alternatives to which each test is most sensitive when Ho(p, k) is tested under the assumptions of normality and homoscedasticity, but such considerations provide no simple basis for the choice of a "best" test when we do not know whether the assumptions of normality and homoscedasticity are satisfied or not. Consequently, one is led to consider whether the robustness properties of the tests when the assumptions are violated might provide further basis for choosing among the tests. In what follows, we shall denote the usual test criteria such as F, T 2, R, V, etc. by F*, T 2., R*, V*, etc. when they are used under violations of the assumptions of normality a n d / o r homoscedasticity.

3. Effects of nonnormality and/or heteroscedasticity on the ANOVA F-test


Scheff6 [38], Plackett [34] and Kendall and Stuart [23] provided thorough introductions of some of the important investigations of the effects of nonnormality a n d / o r heteroscedasticity on the A N O V A F-test which had been carried out before the publication of their books. In this section we shall review some of the more recent theoretical works and Monte Carlo studies in this field which serve to substantiate the previously known conclusion that the F-test is rather robust against heterogeneity of variance and especially against nonnormality. The F-test is found to be remarkably insensitive to general nonnormality. In the commonly occurring case where the group sample sizes are equal, it is not very sensitive to heterogeneity of variance from group to group. Hence, the ANOVA F-test for equal group size can probably be used with confidence in most practical situations. However, with unequal groups much greater effects can occur on the F-test when variances are different from group to group.

3.1. Theoretical investigations of the effect of nonnormality


Many investigations have been made to study the effect of nonnormality on the level of significance of the F-test employed in ANOVA. The problem was first studied by E. S. Pearson [32] by way of sampling

206

P. K. Ito

experiments. Theoretical investigations of the central distribution of the F* statistic under violation of the assumption of normality were carried out by Bartlett [2], Geary [13, 14], G a y e n [11, 121, Tiku [40] and others, while Box, Andersen and Watson [6, 7] assessed the effect of nonnormality by means of an approximation in the form of a correction to the numbers of degrees of freedom of F when the permutation test based on F was worked out. Let H0(1, k) be tested by means of the normal-theory F statistic defined in (2.8) at the nominal significance level a as if the assumption of normality were not violated. Then the actual level of significance of the F*-test is given by a * = P(F* >F,(pl, p2)ln0(1, k)), (3.1)

where F~(ul,u2) is the upper 100a% point of the F-distribution with p] and g2 degrees of freedom, with Pl = k - 1 and p2= N - k . The effect of moderate nonnormality on the level of significance is known to be not very serious, i.e., the difference between a and a* is not large as far as the practical application of the F-test is concerned. This result was first established by Pearson, mainly for groups of equal size, i.e., N I = N 2 ..... N k, by taking experimental samples from six nonnormal populations. Mathematical confirmation was provided later by theoretical investigations. Here we shall explain some of the results on the magnitude of a* due to Gayen [12] and Tiku [40]. Suppose that under H0(l,k ) the standardized variates y,,~ = (x,. - / ~ ) / o , = % / o r (3.2)

follow a distribution with the density function expressed as an Edgeworth series: f ( y ) = q S ( y ) - ~-.iq5( ) ( y ) +

~'3 3

~ }k4 b(4)(y )

~2 + ~22q)(6)(y)

(3.3)

for all t and a, where q~(y) is the density function of the standardized normal distribution N(0, 1), ~(r)(y) its rth derivative, and

)k3= glll/O 3,
=

)k4= gllll/(I 4

(3.4)

with at = o,~t~l h ~ l l l~.(t) , , . l l l_ l - ~1111 for all t. X3 and )t4 are the measures of skewness and kurtosis of the parent population, respectively. By means of the characteristic function, G a y e n derived the density function of F* in the form: P( f * l n o ) =Po( f * l H o ) - X4"a( f * ) + X~.b(f*), (3.5)

Robustness of ANOVA and MANOVA test procedures

207

where po(F*lHo) is the density function of the central F-distribution with u I and v2 degrees of freedom when H0(1,k ) is true, and a(F*) and b(F*) are corrective factors due to nonzero 2`4 and 2`2, respectively. He remarked that (3.5) gives the density function of F* for samples of any size drawn from the Edgeworth series (3.3) when the terms in cumulants other than 2,3, 2`4 and 2`2 are negligible and also that it approximates fairly closely to the actual distribution of F* for samples drawn f r o m any population with finite cumulants provided the samples are so large that terms in N - 3 can be neglected. Using this result in (3.1) he obtained c~* = c~- 2`4.A +2`3Z-B, (3.6)

where A and B are certain functions of Fa(Pl, P2),Pt,u2, N 1. . . . . Nk, N and k involving incomplete beta function ratios. For several combinations of values of 2`4 and 2`2 he evaluated the actual levels of significance a* of the F*-test at the nominal level c~= 0.05. Tiku considered the case where the distributions of Yr, of (3.2) are not necessarily identical from group to group, and assumed that the error terms % have standard cumulants:

2`(')=~}I)...~1o r, r = 3 , 4 .....

(3.7)

while they are assumed to have the same variance o 2. Using Laguerre polynomials he derived the distribution of F* when H0(1,k ) is true, and evaluated a* of (3.1) as follows:

o~*= o~- Xa'A + X2"B + A2"C + X6"D- X~'E- AZ4"H,


where
k ~= t:lE 2`(rt)lk'

(3.8)
,

arA,=

I k

1 Z 7h(t))~(t)--)~ "~ --k l : l " r "'s "'r''s

and A, B, etc. are certain functions of F~(el, v2), lq, v2, N l . . . . . N k, N and k involving incomplete beta function ratios. If the cumulants vary from group to group, ~ and ArA ~ represent their first and second moments. A and B are the same as the corresponding factors in (3.6). G a y e n restricted the population density function to the first four terms of an Edgeworth series, while Tiku did not restrict the population to any special form, and expanded the density functions of "between-groups" and "within-groups" sums of squares in series of Laguerre polynomials. As a result the terms in the sixth and the square of the fourth cumulant are now included in addition to the terms in the fourth and the square of the third cumulant in Gayen's formula (3.6). To compare his results with Gayen's, Tiku provided

208

P. K. Ito

the c o r r e c t i v e t e r m s o t h e r t h a n A a n d B in (3.8) f o r t h e s a m e r e p r e s e n t a t i v e v a l u e s of d e g r e e s of f r e e d o m as G a y e n ' s , a n d r e m a r k e d t h a t l a r g e r e f f e c t s d u e to n o n n o r m a l i t y a p p e a r if t h e s k e w n e s s is in d i f f e r e n t d i r e c t i o n s in d i f f e r e n t g r o u p s a n d t h a t t h e y a r e a p p r e c i a b l e u n l e s s t h e d e g r e e s of f r e e d o m f o r error, v2, a r e f a i r l y large. T a b l e 1 s h o w s s o m e o f his r e s u l t s f o r a * w h e n a = 0 . 0 5 in (i) cases w h e r e X3 (t) a n d ~(4t) d i f f e r f r o m g r o u p to g r o u p a n d (ii) c a s e s w h e r e ~3 (t) a n d ~4 (0 a r e i d e n t i c a l f o r all g r o u p s (in all t h e s e n u m e r i c a l e v a l u a t i o n s t h e X6 t e r m in (3.8) h a s b e e n o m i t t e d ) .

Table 1 Actual levels of significance a* of the F*-test under violation of the normality assumption when the nominal level a = 0.05 (i) Cases where X~ 0, ~0 differ with the group, t k=2 (1)
1 1.5 1.0 1.5

k=4 (2)
3.0 1 -1.5

(3)
1.0 -1.5

(4)
3.0

-1.5

1.0

-1.5

3.0

2 3 4

1.5 0.5 -0.5 v2 6 8 12 24 40

1.0 1.0 1.0

1.5 0.5 -0.5

3.0 3.0 3.0

v2 6 8 12 24 40

Values of a* 0.0910 0.0867 0.0797 0.0681 0.0619 0.0730 0.0743 0.0733 0.0655 0.0609

Values of a* 0.0612 0.0600 0.0583 0.0557 0.0540 0.0398 0.0412 0.0451 0.0495 0.0508

Oi) Cases where A~),~t)areidenticalforallt (5) h 3 = 1.5,~4= 1.0 6 8 12 24 40 0.0512 (0.0525) 0.0496 (0.0504) 0.0488 (0.0491) 0.0490 (0.0491) 0.0495 (0.0495) (6) ~3 = 1.5,~4=3.0 0.0332 (0.0449) 0.0372 (0.0444) 0.0424 (0.0451 ) 0.0464 (0.0473) 0.0485 (0.0488) (7)
~3=~4 =

1.0

(8) ~3=1.0,~4=3.0 0.0293 (0.0428) 0.0304 (0.0421) 0.0348 (0.0420) 0.04 18 (0.0445) 0.0453 (0.0462)

0.0507 (0.0522) 0.0492 (0.0505) 0.0480 (0.0488) 0.0480 (0.0483) 0.0485 (0.0486)

From p. 89 of "Approximating the general non-normal variance-ratio sampling distributions," by M. L. Tiku, Biometrika, 51(1964). Reproduced with the kind permission of the author and the editor. The figures in parenthesis are from Gayen's formula.

Robustness of ANOVA and MANOVA test procedures

209

There have been less attempts to investigate the effect of nonnormality on the power of the F-test. David and Johnson [9] calculated the product moments of Q~ and Q~v under the general assumption of nonnormality. Srivastava [39] generalized Gayen's results to obtain the actual power of the F*-test when H o ( l , k ) is not true, based on the first four terms of the Edgeworth series, while Tiku [41] generalized his earlier results [40] to obtain the power function of the F*-test under the general nonnormal situation. The actual power of the F*-test is given by

1 -/3* = P(F* >F,~(v,, v2)lH~(1,k)),

(3.9)

where/3* is the actual Type-II error probability. For the sake of simplicity Srivastava considered the case of k groups of n observations each, and derived an expression for/3* in terms of confluent hypergeometric functions as follows:
/3*= /3 + )k3"P + ~k4"Q + ~.~'R,

(3.10)

where/3 is the normal-theory Type-II error probability of the F-test, and P, Q and R are certain functions of F~(gl, g2),vl,v2,k,n,N and three noncentrality parameters 82, 8 3 and 8 4, where k 82---=-El E 0/2/.2, t=l k 83 ~- n E 0/310"3, t=l k 84~---'-" n E /4/4. t=l (3.11) In evaluating the actual power of the F*-test for a given value of noncentrality parameter 82, Srivastava considered two extreme cases where (a) 83-- 0 and 84 takes on a minimum and (b) 83 has an extreme positive or negative value and 8 4 takes on a maximum. In practice the situation may be somewhere between these two cases. Table 2 shows comparison of the powers of the F-test in the normal case and of the F*-test in nonnormal cases when the nominal level a--0.05 and v I =4,v 2 =20 for certain values of X3, X4 and ~ = \ / ( 8 2 / k ). Here two entries before any value of q in a column with X3 = 0 correspond to the cases (a) and (b), respectively, and the three entries for X3ea0 correspond to the cases (a), CoO and (b2), where 83 takes on an extreme positive value in (bl) and an extreme negative value in (b2). It is to be noted that the values for the cases (bl) and (b2) will interchange as X3 changes sign. F r o m the table it is seen that when X3=0 , there is practically no difference between the values of power in (a) and (b). When X3 ~a 0, in (a), (bl) and (b2) that arise, the powers differ to some extent though not vary

210

P. K. Ito

Table 2 Comparison of the powers of the F-test in the normal case and the F*-test in nonnolTnal cases when the nominal level a =0.05, ul =4 and p2=20 X3 0.0 0.0 0.5 l.O 2.0 2.4 --1.0 0.0 0.5 0.5 2.0 0.7 2.4

~4 --1.0
0.0 1.0

0.052 0.050 0.049 0.048 0.045 0.044 0.308 0.319 0.325 0,330 0.342 0.346 0.307 - 0.325 0.331 0.343 0,347

0.053 0.050 0.049 0.045 0.045 0.307 0.318 0.324 0.341 0.345 0.303 0.315 0.321 0.339 0.342 0.310 0.321 0.327 0.345 0.350 0,649 0.659 0.664 0,679 0.683 0.648 0.658 0.664 0,680 0.684 0.649 0.659 0.665 0,681 0.685 0.908 0.907 0.907 0.906 0.906 0.913 0.912 0.912 0.910 0.912 0.904 0.903 0.902 0.901 0.899 0.993 0.988 0.985 0.978 0.976 0.996 0.990 0.988 0.979 0.978 0.991 0.985 0.983 0.974 0.971

1.5

0.649 0.659 0.664 0,669 0.679 0.683 0.648 - 0.664 0,670 0.680 0.685

2.0

0.908 0.907 0.907 0.906 0.906 0.905 0.908 - 0.907 0.906 0.905 0.905

2.5

0.993 0.988 0.986 0.983 0.978 0.976 0.994 --- 0.985 0.982 0.977 0.975

From p. 121 of "Effect of non-normality on the power of the analysis of variance test" by A, B. L. Srivastava, Biometrika, 46(1959). Reproduced with the kind permission of the editor.

significantly from the practical p o i n t of view. T h e situation (a), in this case, gives rise to values which are generally i n t e r m e d i a t e b e t w e e n those of (bl) a n d (bE). It is also n o t e d that w h e n q~=0, i.e., w h e n H0(1,k ) is true, the power of the F*-test is r e d u c e d to the level of significance, a n d the values for this are in c o n f o r m i t y with T i k u ' s results in T a b l e 1. It is also f o u n d in T a b l e 2 that the effect of skewness is n o t m u c h o n the power of the F*-test, at least w h e n the values of ~3 are c o n f i n e d to those specified in the table. I n practice, however, higher values of ~3 m a y occur a n d t h e n the effect of skewness m a y he s o m e w h a t larger a n d c o m p a r a b l e to t h a t of kurtosis which is, i n general, high. T h e presence of a fair degree of kurtosis, as is n o t u n c o m m o n in practice, leads to a n o t i c e a b l e change i n the power, particularly in the case of small samples. But a small d e p a r t u r e from n o r m a l i t y i n respect of kurtosis (say, of the order ?~4= +- 0.5) a g a i n does n o t cause a n y significant d e v i a t i o n i n the power. W h e n the p o p u l a t i o n is leptokurtic (?~4>0), the power increases in the b e g i n n i n g (e.g., u p to the p o i n t for which power is a p p r o x i m a t e l y 0.8), b u t in the region of very high power it s u b s e q u e n t l y decreases i n c o m p a r i s o n with the n o r m a l - t h e o r y power. T h e reverse h a p p e n s w h e n the p o p u l a t i o n is platykurtic (~4 < 0).

Robustness of A N O VA and M A N O VA test procedures

211

These conclusions drawn from numerical results shown in Table 2 are expected to be valid in general. Other numerical results not presented in the table indicate that the effect of nonnormality on the power diminishes with increasing sample sizes as expected. On the whole it m a y be said that from the practical point of view, the effect of nonnormality on the power will not be of much consequence in the case of near-normal populations. Different from Srivastava who obtained the power function of the F*-test basing his derivation on the first four terms of the Edgeworth series, Tiku obtained an expression for the power function from Laguerre series expansions of Q~ and Q~v- His result is as follows;

1--/3* = (1 - / 3 ) -~'3" 83"A1-k-~k4(B 1 "4- 0 2"84)


- ~ 2 - C 1 +)k5"83-D 1 -~k6"E 1+ ~2.HI,

(3.12)

where ( 1 - / 3 ) is the power of the normal-theory F-test, A l, B1, etc. are corrective functions of F~(v l, v2), v 1, v2, k, n, N, 82, 83 and 84 due to nonnormality. In giving numerical examples Tiku used combinations of X3 and ~k4 different from those of Srivastava, but it is noted that both of the results are fairly in good agreement.

3.2. Theoretical investigations of the effect of heteroscedasticity


We shall now investigate the effect of heterogeneity of variance on the ANOVA F-test when the observations are assumed to be normally distributed with variances different from group to group. It is well known (see, e.g., [38]) that in the case of two equal groups the two-tailed t*-test is exceedingly well behaved with respect to violation of the equality-of-variance assumption, since it shows no effect on the level of significance for large values of N. However, when we consider the case of k groups we shall find that violating the equality-of-variance assumption has some effect even when group sample sizes are equal. Horsnell [17] evaluated approximate values of the actual power ( 1 - fl*) of the F*-test when it is used to test H0(1,k ) under violation of the assumption of homoscedasticity, where ( 1 - / 3 * ) is formally the same as that of (3.9), but the distribution of F* under violation of the assumption of equality of variance is different from that under violation of the normality assumption. Inequality in the parentheses on the right hand of (3.9): F* >F~(v 1, v2) is equivalent to
Pl

In deriving approximations to the distribution of x Horsnell used the

212

P. K. Ito

E d g e w o r t h series c o n s i s t i n g o f f o u r t e r m s a n d t w o o t h e r t y p e s o f c u r v e . T h e c u r v e s w e r e f i t t e d u s i n g t h e first f o u r m o m e n t s of x w h i c h w e r e o b t a i n e d f r o m t h e m o m e n t s o f Q~ a n d Q~v g i v e n b y D a v i d a n d J o h n s o n [9]. I n T a b l e 3 a r e s h o w n s o m e o f his results o n a p p r o x i m a t e a c t u a l p o w e r s b a s e d o n the E d g e w o r t h series w h e r e k = 4 , t h e set of v a r i a n c e s ot2, ( 1 , 1 , 1 , 3 ) , a n d f o u r c o m b i n a t i o n s of t h e N t, ( 7 , 7 , 7 , 1 9 } , { 9 , 9 , 1 0 , 1 2 ) , (10, 10, 10, 10) a n d (12, 12, 1 2 , 4 ) , w h e r e N = 4 0 in all cases, a r e c o n s i d e r e d f o r t w o d i f f e r e n t cases of d i v e r g e n t m e a n s (i) 0~1=~=0/2=0/3=0~4 a n d (ii) a 1 = a 2 = a 3 ~ a 4. T o i l l u s t r a t e t h e e f f e c t of u n e q u a l v a r i a n c e s , t h e p o w e r s c o m p u t e d f o r c a s e s (i) a n d (ii) are c o m p a r e d w i t h the p o w e r of t h e F - t e s t with noncentrality parameter:
1

(3.14)
d?=~l = t=l

o~'k

'

w h e r e o~ is the c o m m o n

l o w v a r i a n c e of t h e first t h r e e g r o u p s ( t a k e n as

Table 3 Approximate actual levels of significance and powers of the F*-test for k = 4, 02: o2: 02:o42= 1: 1: l: 3 and N = 40 when the nominal level a = 0.05
t Nt

Actual level of significance a* 0.021 (1 - 13") (i) (ii) (1 -/3)1 (1 -/3)2 (1 - f l * ) (i) (ii) (1 -/3)1 (l-/3)2 (1 -/3*) (i) (ii) (1 -/3)1 (1 -/3)2 (1 - 13") (i) (ii) (1 -/3)1 (1 - t9)2

Actual power ~=q~l= 1 ~ = q ' l = 2 ~=q~1=2.5 0.128 0.115 0.325 0.18 0.211 0.201 0.325 0.21 0.216 0.233 0.325 0.23 0.365 0.339 0.325 0.28 0.576 0.540 0.904 0.61 0.724 0.644 0.904 0.71 0.764 0.672 0.904 0.74 0.880 0.759 0.904 0.84 0.835 0.768 0.987 0.82 0.918 0.830 0.987 0.89 0.936 0.846 0.987 0.91 0.979 0.892 0.987 0.96

1 (1) 2 3 4 1 2 3 4

7 7 7 19 9 9 10 12 10 10 10 10 12 12 12 4

(2)

0.054

t (3) 2 3 4 1 (4) 2 3 4

0.064

0.103

From p. 133 of "The effect of unequal group variances on the F-test for the homogeneity of group means," by G. Horsnell, Biometrika, 40(1953). Reproduced with the kind permission of the author and the editor.

Robustness of ANOVA and MANOVA test procedures

213

unity in the calculations). In this w a y we can j u d g e how, in the cases illustrated, the unrecognized presence of a single high group-variance will reduce the chance of detecting a divergent m e a n value. A n alternative a p p r o a c h is to c o m p a r e the powers for (i) a n d (ii) with the power of the F-test with noncentrality p a r a m e t e r :

q,=-,~2 =

- t= 1 62"k

(3.15)

with

02= Z r,o
t=l

For b o t h cases (i) a n d (ii) the a t were chosen for convenience to m a k e the ffl of (3.14) assume the exact values of 1, 2 a n d 2.5. F o r a given value of Y ~ N t a t 2 a n d hence for a fixed q,, (since a 2 = 1), q'2 will be constant between cases of (i) and (ii) within a cell of the table, b u t will alter with the N t in passing d o w n a column. T a k i n g first the actual p o w e r ( 1 - fl*) (as calculated f r o m the approximation b a s e d on the E d g e w o r t h series), it is seen that within a given cell the actual power is always less for case (ii) than for case (i), apart from one exception where the p o w e r is small. This m e a n s that for a worthwhile power, we are less likely to detect a divergent m e a n w h e n it occurs in a group with larger variance. W h e n N 4 is well a b o v e or b e l o w the average sample size 10, the actual level of significance a* is seriously affected. W h e n comparisons are m a d e with (1 - fl)l, which is the p o w e r of the F-test with noncentrality p a r a m e t e r ~ = q~, it is f o u n d that for a given set of the a t an increase in the variance of one group lowers the p o w e r of the F*-test, except in the case w h e n 4~1= 1 a n d the N t are {12,12,12,4}, where the actual level of significance a* is m u c h higher than the nominal level a =0.05. However, m o r e instructive c o m p a r i s o n s are m a d e with ( 1 - fl)2, which is the power of the F-test with noncentrality p a r a m e t e r q~--q~2. H e r e in a n u m b e r of eases (1 - fl)2 lies between (1 - fl*) for ease (i) and case (ii). F o r low values of power, say less t h a n 0.50, its value m a y be considerably influenced by a w r o n g start at q~= 0, i.e., by the difference between a a n d a*. But it is seen that as the p o w e r b e c o m e s large so that there is a worthwhile chance of establishing the significance of difference in means, ( 1 - f l ) 2 gives a very reasonable a p p r o x i m a t i o n to the actual p o w e r ( 1 - fl*), particularly for case (i). It m a y be concluded f r o m T a b l e 3 that where there is no very clear information as to h o w a n y heterogeneity in group variances is apportioned, it will be best to w o r k with equal sample sizes for each group. If, however, there are definite grounds for believing that the observations in one or

214

P. K, Ito

more groups have a variance above the average, we must avoid taking less than the average number of observations from these groups. This is because we do not want to run the risk of claiming a significant difference (when there is no real difference in means) considerably more often than has been allowed for on the basis of the significance level chosen. It wilt, indeed, be worthwhile taking a few more observations f r o m the groups whose variance we believe to be above the average, and there is no great danger of overdoing this. F o r there can be no objection to the risk of wrongly claiming significance being less than we have been allowed for, if the actual power makes a good recovery as ( 1 - fl)2 increases above 0.50. Box [4] derived certain theorems concerning the distribution of quadratic forms in multi-normally distributed variables and obtained the exact distribution of F* when H0(1,k ) is true in the form:
k 1

F* . . . . . k-1 Qw

N-k

Q~

N--k
k-1

t=l
k
t=l

Z o,.Z(l)
(3.16)

where the 0, are ( k - 1 ) characteristic roots of a k x k matrix whose t-th row, s-th column element is o,2(6t,- r,), with 6t, being the Kronecker delta and r s = N s / N , and Xz(u) is the central chi-square with p degrees of freedom. He expressed the actual level of significance a* of the F*-test as an infinite series in which each term contains a probability calculated from the F-distribution. He also approximated the central distribution of F* by bF(h', h), where b is the bias due to heterogeneity of variance from group to group, F(h',h) is distributed according to the F-distribution with h' and h degrees of freedom, and b, h' and h are determined by
k

b------

N-k E(Q~) k- 1 E(Qw)

N-k ,=l N ( k - 1) ~k n, 2
t=l

E (N- N,)",

Var(Q~)

t=lE (N-Nt) 02

t=l N, ot2 + N t = ~-" , (N-2Nt)o4

'

,2(
h= Var(Qw ) = ,=1

(3.171
,,/ t=,

Robustness of A N O VA and M A N O VA test procedures

215

By means of the above two results, Box evaluated exact and approximate values of the actual significance level a* of the F*-test when the nominal level c~--0.05 for a n u m b e r of combinations of the o,2 and the N t. The results are in good agreement with those obtained by Horsnell, which are given in Table 3. F o r the details of Box's work the reader is referred to [4] and [38].

3.3. Monte Carlo studies


Donaldson [10] investigated the robustness of the F-test for two nonnormal distributions: the exponential and the log-normal in the two cases of equal and unequal variances b y means of Monte Carlo methods. A computer was p r o g r a m m e d to sample n numbers from the distribution specified in each of k groups in the one-way classification A N O V A model, and the F ratio was computed. This operation was replicated 10,000 times, and the computer listed the frequency distribution of F. Thus, when H0(1, k) is true under the assumptions of normality and homoscedasticity, approximately 10,000 e~ of the observed F values exceeded F~,(k- 1,N-k). For the nonnormal distributions, the similarity between this empirical significance level and the nominal level c~ under normality indicated the robustness of F with respect to the Type-I error probability. The same procedures were used to compute power, except that the parent populations have unequal means ( a n d / o r variances). The same noncentrality parameter q, as used by Srivastava and Tiku was used to indicate the degree of inequality between means, where q~= ~/(62/k)= ,v/(n~.tx2t/ko2), a 2 being the c o m m o n variance when homoscedasticity is assumed and the average of variances when it is not assumed. The critical region was determined from the value of F ~ ( k - 1 , N - k ) under the null hypothesis, and the power of the test was approximated by counting the number of F ' s falling in the critical region. Table 4 (a) and (b) give some of his findings. In (a) are shown the observed significance levels and powers of the F- and F*-tests for three values of a = 0.10, 0.05 and 0.01 for each of the normal, exponential and lognormal distributions in the case of k = 4 groups with a c o m m o n variance. In the table /~/~ is the difference between successive means with /x= 10 in all cases. The observed significance level a ' in the case of the normal distribution and c~* in the case of nonnormal distributions are indicated in the table at ~ - - 0 . The difference between c~ and cf for the normal distribution is due to sampling error and linear interpolation, which is small in all cases. It m a y be observed that the nonnormat distributions lead to conservative levels of significance, i.e., the observed c~* values are always smaller than the normal-theory a level. Thus, if a test is designed with c~ level protection against Type-I error under the assumption of a normal distribution, even more protection against Type-! error

216

P. K. Ito

,iT 0 c5
0 0 0 0 0

~.o
0

~o~

(",,I

e~ 0 0 0 0 ~

ao
1"--

c5ooc5oc5

~ c 5 ~ o

c5c5oc5o

o0o
c5

Z
o

c5

"0-

co

Robustness of A N O VA and MA N O VA test procedures

217

z~
u~

-!
~ d ~ "~ ~ o~'~
0 '-4 "~ ,,_.-

218

P. K. Ito

exists if the distribution is of the nonnormal type specified here. As either n or k increases, the difference between a and a* decreases. Further, the size of ] a * - a [ is greatest under the condition of the lognormal distribution which has the highest skewness and kurtosis. All of these results agree with the previous theoretical findings. It is also observed in the table that over most of the investigated values of ~, the power in the normal case is lower by a substantial amount compared to the case of either the exponential or the lognormal distributions. Further, the power based on the lognormal is greater than it is when based on either the normal or the exponential. As the sample size increases, the power based on the nonnormal distributions approaches that based on the normal distribution. It m a y be observed that small values of n result in larger difference in power. In general the level of significance of the F*-test for the nonnorlnal distribution is smaller than that of the F-test for the normal distribution, but for even small values of ~ ( = 2), the situation with respect to power is reversed, and the power is greater for the nonnormal distribution until ~ gets quite large. These results are based on equal differences between successive means. As long as the groups variances are equal, however, o 2 is independent of the location of % and ~ is proportional to ~ a t 2. The results are therefore valid for variable differences between successive means. Donaldson also considered the case where the group variances are equal to the group means squared. U n d e r H0(1,k ) the group variances are equal, and the previous results apply. The power ( 1 - / 3 * ) of the F*-test together with ( 1 - / 3 ' ) of the F-test are shown in Table 4(b) for k = 2 , n = 1 6 ; k = 4 , n = 4 ; k = 4 , n = 16 at the nominal level a =0.10, 0.05 and 0.01. It is observed that the normal distribution leads to slightly more powerful tests for small values of ~, but as e0 increases, the power for the normal case falls below that of either the exponential or lognormal distributions (except in the case of k = 4, n = 4 for which large values of ep were not obtained). As in the case of equal variances, the lognormal distribution leads to the most powerful tests. Compared to the case of equal variances, however, the power shows less difference for small ~ and greater difference for large ~.
3.4. Modified test procedures

If the usual normal-theory test procedures are found to be sensitive to violations of the assumptions, there is a need to find some alternative or modified robust test procedures, of which significance levels are at least as close to the specified level as possible. Degrees of freedom modifications, intended to improve the correspondence between the distribution of the test statistic and a beta distribution, arose from the permutation approach, and were used by Box and Andersen [6] and others.

Robustness of ANOVA and MANOVA test procedures

219

Let the beta transform U of the F statistic with ~l and v2 degrees of freedom be defined by

U= QB/( Qa + Qw) = u , F / (u2 + ~,F),

(3.18)

where under the assumption of normality U is distributed according to the 1 1 beta distribution with parameters gPl and gu2 when H0(1,k ) is true. To study the distribution of U* in the nonnormal case, Box and Andersen obtained the first two moments of U* under the permutation distribution, respectively denoted by Ee(U* ) and Vare(U* ), assuming that the error terms are independently and identically distributed. The permutation distribution of U* may be approximated by equating these two moments to those of the beta distribution. When there are k groups of n observations each and N = kn, they showed that the permutation distribution of U* may 1 1 be approximated by a beta distribution with parameters 5pld and ~uad, where d= 1+ -N to order N I with c2 k4 k2 1)(N - 2 ) ( N - 3),
C2

(3.19)

k 4 = { N ( N + I)S 4 - 3 ( N - I ) S ~ ) / ( N k

k 2 = S 2 / ( N - 1),

S r = ~,
t=l

~, (xt~ - 2) r.
a=l

Thus, we have a modified ANOVA F*-test with critical region: {F*: F*>F~(u,a, vzd)},

(3.20)

which is robust in the sense that the significance level is approximately equal to the specified level a. Permutation approach also provides an additional method for assessing the consequences of departure from the assumptions in which the effects are shown in the convenient and readily comprehended form of a correction or modification in degrees of freedom in the standard tests. Thus, the distribution of U* under the general nonnormal distribution may be approximated by equating Ef(Ee(U*)) and Ef{Varp(U*)) to the first two moments of the beta distribution, respectively, where Ef denotes the

220

P. K. Ito

expectation under the general nonnormal distribution. By comparing the values of parameters of this approximating beta distribution with those of the normal-theory U, the effect of nonnormality may be expressed as a modification on the values of parameters of U, and hence on the numbers of degrees of freedom of F. Box and Andersen showed that the numbers of degrees of freedom of the normal-theory F must be multiplied by
8 = 1+ = 1 + --

~k4

(3.20

to approximate to order N - ~ the distribution of U* under the general nonnormality.

4.

Effects of nonnormality and/or heteroscedasticity on MANOVA tests

It is only recently that attempts have been started for theoretical investigations of the central and noncentral distributions of M A N O V A test criteria such as R*, T*, W*, V*, U* and S*, when the assumptions of normality a n d / o r homoscedasticity are violated, to study the robustness of the test procedures based on these criteria. There are some small sample studies of the problem, but in most of cases we shall have to content ourselves with asymptotic treatment of the problem for large samples or with Monte Carlo studies. As shown in Section 3, we have rather strong evidence, both theoretical and empirical, that the A N O V A F-test which is derived under the assumptions of normality and homogeneity of variance is in fact extremely robust under violation of these assumptions. The major exception to this statement occurs for small and unequal sample sizes. In a special M A N O V A case with two groups, the TZ-test has been found on the basis of some large sample theory and Monte Carlo studies to be rather robust against heterogeneity of var-covar matrices and especially against nonnormality Unfortunately, however, theoretical or empirical evidence is still in short supply for the study of robustness properties of M A N O V A tests in general. Most indications that are gathered from recent works are that at least some of M A N O V A tests will display robustness properties similar to those of the ANOVA F-test, although the higher the dimensionality, the less robust. 4.1. Theoretical investigations

Ito [19] obtained asymptotic expressions for the central and noncentral distributions of To 2. (and T 2. as a special case when k = 2) for large values of sample sizes. In what follows, when we say that the sample sizes are

Robustness of ANOVA and MANOVA test procedures

221

large or become infinite we mean that the Nt's, and hence N, are large or become infinite with rt= Nt/N held fixed for t = 1,2 ..... k. Let us assume (a) that the N t are so large that the elements of the St provide the exact values of the elements of the Nt and (b) that the statistic X0 z* obtained under assumption (a) is approximately distributed like a constant multiple of a central chi-square. Under (a), T~* is distributed like X02.= trQ~. ~ - l where I~ = 52t=l(rtZt). ,k The first two moments of Xg* are found to be
k

(4.1)

E(X2.) = t r
and Var(x2*)=2tr

Z {(1 - rt)Zt]~ ' + Nt~ta'fi2 -1 }


t=l

(4.2)

t=l k t=l

(l

. . - ,) 2 + l(p) + 2 ~ 2rt)(ZtN
t=l P 1

Nta,~t't~,-IZtSg-'

+ 4 E ( 1 - r , ) x ~b'2(a:~-l);(l~-')j,

+ t~" =l

N,

where (atl~ , "- 1 )i is the i-th element of 1 p row vector a'tl~- 1 and (l~- l)ij is the i-th row, j-th column element of l~-1, etc. When Ho(p, k) is true, they become

E(X2.) = t r
and

Z (1 - rt)Zgtl~-'
t=l

(4.4)

Var(x2*)=2tr

~ (l-2rt)(Z,-)
t=l

12

+l(p)

k (1-rf{ p + tE Z ~l Nt 1

(4.5)

Under (b), Xg* is approximately distributed like cx2(f), where c is a constant and x 2 ( f ) is a central chi-square with f degrees of freedom, and c and f are determined in such a way that the first two moments of X2. are equal to those of cx2(f), respectively. Thus, c and f are obtained by

222

P. K. lto

solving simultaneously the equations:

cI= e(xg*),
2c2f= Var(x~* ) i.e.,
=-

(4.6)

Var(x02.) 2E(xg*) ' (4.7)

i=

Var(x~*)

Hence the central and noncentral distributions of T 2. are approximated for large values of sample sizes by cx2(f), where e a n d f for the former are given by (4.7) on substitution of (4.4) and (4.5) for E(X 2.) and Var(x2*), respectively, while those for the latter are given by (4.7) on substitution of (4.2) and (4.3) for E(X 2.) and Var(x~*), respectively. Now it is well known that under the assumptions of normality and homoscedasticity, T 2 is distributed asymptotically as a central chi-square with p ( k - 1) degrees of freedom as the sample sizes become infinite when Ho(p,k ) is true. Therefore, the actual level of significance a* and power (1-/3*) of the To2*-test may be approximated for large values of N as follows:

a* = P( Tg* > r~,~(p, k - l, N - k)lgo(p,

k))
(4.8)

--p(x2(f) > X~(p(k - 1))/clHo(p,k) ),

where To2,~(p,k- 1 , N - k ) is the upper 100a% point of the normal-theory T02-distribution and X~(p(k- 1)) is that of a central chi-square distribution with p ( k - 1) degrees of freedom, c and f are given by (4.7) together with (4.4) and (4.5), and
1 -

fl* = P(rg* > r 2 ( p , k - I , N -

k)[H,(p,k))
(4.9)

--P(x2(f) > X~(p(k - 1))/c]Hl(p,k)), Ho( p, k) is true, (4.5) is expressed as


Var(x02*)=2tr ~] (1
t=l

where c and f are given by (4.7) together with (4.2) and (4.3). When

,) 2 +l(p) +O(N-') (4.1o) -:r,)(z,z-

for large values of the sample sizes. That is to say, as long as the Art's are large, approximate values of a* are not much affected from violation of

Robustness of A N O V A and MANOVA test procedures

223

the assumption of normality. When Ho(p,k ) is not true, and if lvtl/2olt = O(1) for all t, and if
k t=l

and

~P2(N) = tr ~] Ntatot't'~-IXt~-1---~2 o
t=l

as N ~ o o , where tpi0 and ~20 are constants, then (4.2) and (4.3) become, as N--->oo,
k

E(Xg*) = t r E (1 - rt)Xt~.-'+ qqo,


t=l

(4.1 1)
1 2 }

and Var(xg*)=2tr ~ (1-2rt)(XtZ-)


t=l k "

+I(p)

+4~b20,

(4.12)

respectively. Therefore, as long as the N t are large, approximate values of (1 fl*) are not much affected from violation of the assumption of normality. Thus, it may be said that for sufficiently large sample sizes, the T02-test is quite robust under violation of the assumption of normality. However, it is very difficult to investigate theoretically how large "sufficiently large" is, and we shall have to relY on Monte Carlo studies to see this, some of which results are presented in Section 4.2. If the observations are assumed to follow normal distributions with var-covar matrices different from group to group, then all ~,.(j] and x/(j]m vanish in (4.3). This expression together with (4.2) was obtained by Ito and Schull [20] when they studied the robustness of the T02-test under violation of the assumption of homoscedasticity. By means of these results they evaluated approximate actual levels of significance and powers of the T02*-test at the nominal level a =0.05 to show the effect of inequality of var-covar matrices (i) in case of k--2, when c(Nl~2 1) are equal for p = 1,2, 3, 4 and c('Zl~ ~- 1) are distinct f o r p = 2, and (ii) in cases of k = 3 and 5 for some combinations of the IEt and the rt for p = 1,2, 3, 4. In evaluating actual powers of the test, the noncentrality parameter
-

(x, x2t-'
was used when k = 2, where the difference in mean vectors is concentrated in one canonical dimension, and two kinds of concentrated structure of

224

P. K. Ito

noncentrality, H[l)(p, k) and H(12)(p,k), were considered when k = 3 and 5, where H[(p,k): alvsa2vaa3 . . . . . ak----0 and H[2)(p,k): 0 = a I . . . . . a~-2g=a~-1 =/=ak. F r o m their numerical results it may be concluded that in the case of two groups of nearly equal sizes, the effects of inequality of ~1 to 2;2 on the significance level and power of the T~*-test are not pro~ nounced as long as c(]~l~2-1) remain within the range (0.5,2) if both samples artlqt~,et, large. It is also noted in the case of k samples that if the groups are oI equal size, moderate inequality of var-covar matrices does not affect the T0Z-test seriously as long as the samples are very large, but when they are of unequal size, quite large effects occur on the significance level and power of the test. These results are to be compared with those obtained by Box [4] in the case of the ANOVA F-test, although his results are exact in the sense of being based on the small sample theory, while the results of Ito and Schull are approximate because they are derived on the basis of the asymptotic theory. It is noted, however, that both results point to the same direction of discrepancies in probability and are of the same order of magnitude. Ito [19] also obtained asymptotic expressions for the central and noncentral distributions of Tff* for large values of sample sizes in a similar way. The first two moments of the X~* statistic which is obtained on substituting l~t for S t, t = 1,2 .... ,k, in Tv 2. of (2.17), are found to be

E(X2v*) =p(k and

1) + tr ~
t=l

Ntot(~)(a(,t))']Et I
k

(4.13)

Var(x~*)=2p(k-1)+4tr
k P

~
t=l

Nto~(~)(a(~))'X; 1 NtZtlA-IZtl)jl
t l)(] (4.14)

+ 4 X ~] g ~ I ( ( " ( * t ) ) ' ~ t l ) i ( ] ~ t - 1 - -

t=l 1 k P + Z N N,-1/~}j-/)m(,~--~t1 -- N t X t l h - i x t=l 1 )< (~2t-- 1 -- N t Z t

1A-~X; 1)lm,

where it is assumed that the M A N O V A model is, instead of (2.1), ext. pressed as follows: xt~ = / ~ , + a(~ / + eta, (4.15)

A kt = l A t- It is noted that under the with g"t=~"klZXt~*A ,~(t)=O, At==(~.t/Nt)- i , ~ l = ~. ~ assumptions of normality and homoscedasticity both T~ and To 2 are

Robustness of A N O V A and M A N O V A test procedures

225

asymptotically distributed as a central chi-square with p ( k - 1 ) degrees of freedom when Ho(p, k) is true. Hence, comparison of (4.13) and (4.14) with (4.2) and (4.3) shows that while the T~*-test and T2*-test are affected by violation of the assumption of normality asymptotically in a similar way, the former is asymptotically less affected than the latter by violation of the assumption of homoscedasticity.

4.2. Monte Carlo studies


Hopkins and Clay [16] reported the results of their Monte Carlo studies on the effects of inequality of var-covar matrices and of kurtosis on the central distribution of T 2. when p = 2 . 1,000 pairs of samples of size N 1 and N 2 were taken from bivariate normal populations N(0,o2I) and N(O, o~I), respectively, where 02/01 = 1, 1.6 and 3.2, and also from bivariate symmetrical leptokurtic populations with kurtosis 3.2 and 6.2 by mixture using 80% of N(0,I) and 20% of N(0,o21) with o =2.5 and 3.7. In both cases, the frequencies with which calculated values of the 1,000 T 2. exceeded specified upper percentage points of the central normal-theory T2-distribution were recorded. Some of their results are presented in Table 5 (a) and (b). Table 5(a) suggests that the central distribution of T 2. for pairs of bivariate normal samples with N~,N2>~ 10 is rather robust in respect of inequality of var-covar matrices, but that as in the univariate case, this robustness does not extend to unequal sample sizes. From Table 5(b), it is observed that leptokurtosis has no substantial effect on the upper tail frequencies listed for any case with both N1,N2>~ 10. The last two columns of the table suggest that for smaller N the central normal-theory level of significance a =0.05 may provide slightly conservative tests of significance of differences in mean vectors from bivariate symmetrical leptokurtic populations. This is in conformity with Gayen's results for univariate symmetrical leptokurtic Edgeworth series distributions (1949). Holloway and Dunn [15] approximated the central and non-central distributions of T 2. by means of drawing 5,000 or 10,000 pairs of samples of size N 1 and N 2 from multivariate normal distributions for p = 1,2,3, 5, 7 and 10 to study the effect of inequality of var-covar matrices on the level of significance and power of the T2*-test. For the nominal level a =0.05 and 0.01 the actual level a* and power (1 - f l ~ ) were calculated for various departures from homoscedasticity using both equal and unequal sample sizes. From the results of their studies it may be said that under inequality of var-covar matrices one has a test whose actual level of significance a* may be very different from the nominal level and that a* tends to be too large rather than too small if the departure from equality is pronounced. For such "too large" significance levels, the power ( 1 - f l * ) for small

226

P. K. Ito

Table 5 (a) Observed relative frequencies of exceedance of specified null normal percentage points by T 2. for 1,000 pairs of samples of size N 1 a n d N 2 from bivariate N(0,o211) a n d N(0,o~I) Parameters N~ 5 10 20 5 10 10 20 5 10 20 5 10 10 20 N2 5 10 20 10 20 5 10 5 10 20 10 20 5 10 o2/ol 1.6 1.6 1.6 1.6 1.6 1.6 1.6 3.2 3.2 3.2 3.2 3.2 3.2 3.2 0.75 pt. 0.754 0.767 0.741 0.686 0.697 0.804 0.816 0.777 0.757 0.769 0.639 0.593 0.866 0.859 Relative frequencies of exceedance 0.50 pt. 0.506 0.512 0.493 0.405 0.395 0.626 0.607 0.554 0.519 0.519 0.357 0.282 0.718 0.685 0.25 pt. 0.263 0.259 0.237 0.174 0.163 0.376 0.340 0.328 0.262 0.268 0.145 0.095 0.551 0.476 0.10 pt. 0.110 0.096 0.097 0.058 0.047 0.181 0.175 0.159 0.121 0.122 0.043 0.032 0.338 0.311 0.05 pt. 0.055 0.052 0.052 0.026 0.019 0.110 0.094 0.083 0.070 0.068 0.015 0.010 0.242 0.214

(b) Observed relative frequencies of exceedance of specified homoscedastic normal percentage points by T 2. for 1,000 pairs of samples of size N 1 a n d N 2 from bivariate symmetrical leptokurtic populations Parameters Nl 5 10 20 5 10 5 10 20 5 10 N2 5 10 20 10 20 5 10 20 10 20 Kurtosis 3.2 3.2 3.2 3.2 3.2 6.2 6.2 6.2 6.2 6.2 0.75 pt. 0.775 0.776 0.769 0.780 0.753 0.814 0.789 0.771 0.806 0.795 Relative frequencies of exceedance 0.50 pt. 0.515 0.514 0.515 0.526 0.518 0.540 0.531 0.535 0.526 0.547 0.25 pt. 0.228 0.243 0.257 0.249 0.286 0.220 0.237 0.266 0.219 0.256 0.10 pt. 0.074 0.089 0.107 0.079 0.113 0.081 0.084 0.095 0.067 0.082 0.05 pt. 0.039 0.037 0.059 0.046 0.063 0.035 0.032 0.041 0.019 0.037

F r o m p. 1050 and p. 1052 of "Some empirical distributions of bivariate T 2 a n d h o m o scedasticity criterion M under unequal variance a n d leptokurtosis," by J. W. Hopkins and P. P. F. Clay, J. Amer. Statist. Assoc., 58(1963). Reproduced with the kind permission of the authors and the editor.

Robustness of A N O V A and MANOVA test procedures

227

departures from the null hypothesis is higher than would be expected (here the power is low anyway); for large departures from the null hypothesis it is lower than would be expected. These tendencies increase with the number of variates, with the size of departures from the assumption of homoscedasticity, and with decrease in sample size. It should be noted as of special importance that power is often considerably reduced by deparo tures which leave the level of significance satisfactory. Equality of sample sizes is advisable for moderate departures for maintaining the level of significance close to the nominal level, but does not help in maintaining power. With unequal sample sizes one may have a test with unreasonably large level of significance and a somewhat higher value of the power, or a test with a very low significance level and a very low power. Chase and Bulgren [8] presented some results of Monte Carlo studies on the robustness of the central T 2 for the bivariate one-sample problem involving samples from skewed and correlated populations. The distributions sampled were (i) the bivariate normal (as a check on the procedures), (ii) the bivariate uniform, (iii) the bivariate exponential, (iv) the bivariate gamma, (v) the bivariate lognormal, and (vi) the bivariate double exponential distributions with sample sizes of 5, 10 and 20. Their general conclusion is that highly skewed distributions resulted in too many extreme values of T 2. while other distributions gave conservative results. Korin [24] reported results of his Monte Carlo studies of the effect of heteroscedasticity on the levels of significance of the T-, W- and R-tests. They are presented in Table 6 where N 1= N 2 . . . . N~ = n. Departures from equality of var-covar matrices are specified by diagonal matrices of two different forms, indicated by A(d) and B(d). The symbol A(d) indicates matrices of the type {I,I, dI} for k = 3 and {I,I,I,I,I, dI} for k = 6 , while B(d) indicates forms (I, d l , 2 d l } for k = 3 and {I,I,I,I, dl,2dI} for k--6. The procedure followed consisted of setting the nominal level of significance a at 0.05 under the assumption of homoscedasticity and of drawing at least 1,000 samples for each parameter set, computing the values of the three statistics from c(Q~Q~v-1). Samples were also taken with no violation of the assumption in order to observe whether the expected proportion of the computed statistics would fall in the specified critical regions. Close agreement between expected and empirical results occurred. It can be noted from their results that heterogeneity of var-covar matrices produces somewhat too many significant results in the W*-test, even more in the T*-test and still more in the R*-test. Although Ito and Schull [20] concluded that if the sample sizes are large, T is not seriously affected by violation of the assumption of equality of var-covar matrices, it appears that for small samples, even when all are of the same size, the same conclusion is not appropriate.

228

P. K. Ito

Table 6 Observed significance levels of the T*-, W*- and R*-tests under violation of the assumption of homoscedasticity when the nominal level= 0.05

(p, k, n)
(2, 3, 5)

Var-covar matrix form A(1.5) B(1.5) A(10.0) A(1.5) B(1.5) A(10.0) A(I.5) B(1.5) A(10.0) B(10.0) B(1.5) A(10.0) B(10.0) A(1.5) B(1.5) A(10.0) A(I.5) B(1.5) A(10.0)

Observed significance levels T*-test 0.06 0.08 0.12 0.05 0.06 0.09 0.05 0.06 0.14 0.17 0.07 0.14 0.13 0.07 0.07 0.14 0.05 0.06 0.22 W*-test 0.06 0.07 0.12 0.05 0.05 0.08 0.05 0.06 0.13 0.15 0.07 0.13 0.12 0.06 0.06 0.12 0.04 0.06 0.18 R*-test 0.06 0.08 0.13 0.05 0.07 0.10 0.05 0.08 0.17 0.19 0.08 0.17 0.16 0.07 0.08 0.20 0.06 0.08 0.31

(2,3,10)

(2, 6, 5)

(2,6,10)

(4,3,10)

(4, 6, 7)

From p. 216 of "Some comments on the homoscedasticity criterion M and the multivariate analysis of variance tests T 2, W and R," by B. P. Korin, Biometrika, 59(1972). Reproduced with the kind permission of the author and the editor.

O l s o n [31] reported the results of his very c o m p r e h e n s i v e M o n t e Carlo studies o n a c o m p a r a t i v e r o b u s t n e s s of six M A N O V A tests, i.e., R- , T~, W-, V-, U- a n d S-tests b o t h in terms of significance level a n d power u n d e r c e r t a i n violations of n o r m a l i t y a n d h o m o g e n e i t y of var-covar matrices. I n w h a t follows, we shall show s o m e of his general c o n c l u s i o n s (for the details the reader is referred to his original paper). D i m e n s i o n a l i t y p, n u m b e r of groups k a n d n u m b e r of o b s e r v a t i o n s per g r o u p n have some b e a r i n g o n the r o b u s t n e s s of M A N O V A tests~ F o r example, o n e generally will n o t do worse b y m a k i n g the d i m e n s i o n a l i t y p smaller insofar as it is u n d e r control. Similarly, one generally will n o t do worse by r e d u c i n g the n u m b e r of groups k insofar as it is a flexible p a r a m e t e r . Surprisingly e n o u g h , r o b u s t n e s s properties are n o t always opti-

Robustness of ANOVA and MANOVA test procedures

229

mized by increasing the group size n. Larger groups are generally an advantage for R*, T*, W* and V*, but not for U* and S*. Small groups are also preferable with respect to robustness when the V*-test is used if the homogeneity assumption is violated. However, any robustness advan. tage in smaller groups must always be balanced against the corresponding loss in power to detect a given group-mean difference. Above all, effort should be made to maintain groups of equal size. Departures from the assumptions of M A N O V A have substantially dif-ferent effects on the rival test statistics, but recommendation of one of the six criteria depends in part on the relative weight one attaches to the level of significance and the power. Olson's view is that very high Type-I error probability makes a test dangerous; low power merely makes it less useful. Accordingly, the R*-test, which produces excessive rejections of the null hypothesis under both kurtosis and var-covar heterogeneity, is rejected. For protection against kurtosis, the choice will be from among T*, W* and V*, any of which could be acceptable. The V*-test is generally better than the others in terms of significance level, but T* and W* are sometimes more powerful. Criteria U* and S* are appreciably less powerful than T*, W* and V* in concentrated noncentrality structure under kurtosis. Moreover, U* and S* have liberal Type-I error probabilities when p and k are both greater than about 5. For protection against heterogeneity of var-covar matrices, the T*-test and W*-test should be avoided, as they tend to behave rather like the R*-test in this case. The V*-test stands up best to violation of homogeneity of var-covar matrices, although its significance level is somewhat high. For the range of p and k included in the present study, U* and S* are generally conservative and less powerful. With these observations on the results of his Monte Carlo studies, Olson concluded that for general protection against departures from normality and from homoscedasticity in MANOVA, the V*-test is recommended as the most robust of the M A N O V A tests, with adequate power against a variety of alternatives.
4. 3. Modified test procedures

Following the permutation approach of Box and Andersen [6] and Box and Watson [7], Mardia [29] studied the effect of nonnormality on the actual levels of significance of M A N O V A tests, finding that in the case of M A N O V A this approach is applicable only to the V statistic of (2.14), which is of a Mahalanobis distance type. It is expressed as
k

v=trQB(Qs+Qw)
=D2/(N - 1),

- 1 = ~] N t ( x t - ' x ) ' Q - l ( ' A t - x )


t=l

(4.16)

230

P. K. lto

where Q = Q B + Q w and D 2 is the generalized Mahalanobis distance statistic. Mardia obtained the first two permutation moments of V* under nonnormality as follows:

Ee(V*)=p(k- 1 ) / ( U - 1)
and

(4.17) (4.18)

Vare(V*)=VarN(V ) 1+ 2 N ( N - 1) Cx'Cr '


where the normal-theory variance, VarN(V), of V is given by VarN(V) = 2p(k - I ) ( N - k)(N - p - 1) ( N + 1 ) ( N - 1)2(N-2) and

Cx= ( k - 1 ) ( N - 3 ) ( N - k) { N ( N + 1 ) R - 2 ( K - 1 ) ( N - k)}, Cr= p ( W - 3 ) ( N - p with


k k2

N-1

N-1

1) { ( N + 1)bp--(N-1)p(p +2)}

R=~'Nt-'
t=l k N,

N'
{(Xta - x ) ' Q - l ( x t - ~ ) } 2.

bp=N ~_,
t=l a=l

To approximate to the permutation distribution of V* it is noted that the normal-theory V divided by p may be approximated by the beta distribu1 1 tion with parameters 5v 1 and ~v2, where vl=p(k-1) and v 2 = p ( N - k ). Now a beta distribution is fitted to the permutation distribution of V*/p 1 1 by estimating the values of parameters ~ v~' and ~ v~ so that the first two moments of the beta distribution are equal to Ee(V*/p) and Vare(V*/p), respectively, where the former is obtained from (4.17) and the latter is approximated from (4.18) as follows:

Vare( V* /p)= VarN( V )'( I + c)


- - V a r B ( V ) . ( 1 + c) (4.19)

where VarB(V/p) is the variance of the beta distribution with parameters I vI and 7 vz, which approximates to the distribution of V/p under normality, and is given by VarB(V/p) = 2(k - I)(N - k) ( N - 1 ) 2 { p ( N - 1) +2}

Robustness of A N O V A and M A N O V A test procedures

231

and
C-~

N-3

2N(N-

1)

Cx'Cy"

Now for a beta distribution with mean and variance, /t and 0 2, respec1 1 tively, and parameters g v~ and ~ P2, we have p, =2/~(/~ - / z 2 ~2 = ~,(1 - ~ ) / ~ . By substituting Ee(V*/p ) and Vare(V*/p ) for /z and a 2, respectively, a I , 1 , beta distribution is obtained with modified parameters ~ v~ and ~P2, which approximates to the permutation distribution of V*/p, where

02)/02,

v~ = pl.d--p(k- 1)-d, ~ = pfd=p(U- k).d


with

(4.20) or d--~= 1+

d= p(U- 1 ) - 2 c p(U- 1)(1 +c)

c{p(U-1)+2} { p(U- 1 ) - 2 c } "

To terms of order N - 1, we have d = 1 + Cx. Cv/2N. For M A N O V A with equal groups, Nt=N/k, t= 1,2 ..... k, we have (N-3)Cx/(N- 1)-- - 2 , and hence d-l--" 1 - ~

1{

P(N-1)+2 }.Cr ' p(N_I)+2Cr/N


Hence the V*-test for

or to order N -l, d = 1 + region:

Cv/N.

Ho(p, k) with

critical

V*: B,, -~q,-~p~

<--<1 P

(4.21)

is a robust test whose significance level is approximately equal to a, where 1 , 1 , B,~(~v~ ,5v2) is the upper 100a% point of the beta distribution with 1 ~ 1 parameter 5vl and 7p~. Now let f be the nonnormal distribution of the errors. Then from (4.17) and (4.18), the unconditional moments of V* are obtained:

E( V*)= Ef{ Ep( V*)) =p(k- I ) / ( N - 1)


and

(4.22)

Var(V*) = EAVar A V*))


=Var~v(V )

( l+

2 N N-3 (N-1)

Cx.Vr} ~

(4.23)

232

P. K. Ito

where Fr=Ey(Cv ). If f is normal, then Var(V*) is equal to VarN(V ) so that EN(Cv)=0. Using these results, the central distribution of V*/p under general nonnormality of the errors is approximated by the beta distribution with parameters given by (4.20), where d is replaced by

6=- p ( N - 1 ) - 2 E f ( c ) p ( N - 1)( 1 + Ef(c))

(4.24)

with Ey(c)=(N-3)Cx'Fr/2N(N-1). Thus, when all the groups are of equal size, the corrective factor 8 for the unconditional distribution of V*/p is 1 +Fy/N to terms of order N -1, which is negligible for large samples. Consequently, the significance level of the V*-test is not much affected when all the groups are of equal size as long as the sample sizes are large. This result is a multivariate generalization of Box and Andersen's result (3.21) for the ANOVA F*-test. To assess numerically the effect for moderately unequal groups, Mardia proceeded as follows. The normaltheory significance level c~ of the V-test is approximated by means of the I 1 beta distribution with parameters ~ v~ and g v2, i.e.,

V llHo(p,k))"

(4.25)

Hence the actual level of significance a* of the V*-test may be approximated by


1 a $ .- - P ( B ~ ( 51 v i, 5v2) <

V*/p < llHo(P,k)),

(4.26)

where V*/p follows approximately the beta distribution with parameters ~v l ~ * and ~ v*2, which are given by (4.20) together with (4.24). Table 7 gives approximate values of a* for N = 20, k = 2 and a = 0.05. It is observed from the table that the actual level of significance of the V*-test when applied to moderately nonnormal data is not likely to differ by more than 2% at the nominal level of 0.05. It is found that a* is greater than a when 8 < 1 and it is less than a when 8 > 1. Further, the higher the value of p, the larger is the discrepancy. Yao [45] studied an approximate degrees of freedom solution to the multivariate Behrens-Fisher problem, which is to test H0(p,2 ) under violation of the assumption of equality of var-covar matrices. Extending the work of Welch [42, 43], he proposed to use a test criterion:
Sl -i

Robustness of ANOVA and MANOVA test procedures

233

Table 7 Approximate actual significance levels 100a*% of the V*-test under violation of the assumption of normality for ct = 0.05, k = 2, N = 20, r I = N I / N and F r o ~ ( N - 3 ) F y / ( N - 1) P
r I

Fy0 0.1 0.2 0.3 0.4 0.5 0.1 0.2 0.3 0.4 0.5 0A 0.2 0.3 0.4 0.5 0.1 0.2 0.3 0.4 0.5 0.1 0.2 0.3 0.4 0.5

- 1.00 4.00 4.91 5.16 5.26 5.28 3.75 4.88 5.21 5.34 5.37 3.63 4.87 5.23 5.34 5.42 3.49 4.85 5.26 5.43 5.47 3.36 4.84 5.29 5.45 5.53

-0.75 4.26 4.93 5.12 5.19 5.21 4.07 4.91 5.16 5.25 5.28 3.98 4.90 5.17 5.28 5.31 3.87 4.89 5.20 5.32 5.36 3.76 4.88 5.22 5.36 5.40

-0.50 4.52 4.95 5.08 5.13 5.14 4.39 4.94 5.10 5.17 5.19 4.32 4.93 5.12 5.19 5.21 4.25 4.92 5.13 5.21 5.24 4.17 4.92 5.15 5.24 5.26

-0.25 4.77 4.98 5.04 5.07 5.07 4.70 4.97 5.05 5.09 5.09 4.66 4.97 5.06 5.09 5.10 4.62 4.96 5.07 5.11 5.12 4.59 4.96 5.07 5.12 5.13

0.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00

0.25 5.22 5.02 4.96 4.93 4.93 5.29 5.03 4.95 4.91 4.91 5.33 5.03 4.94 4.90 4.89 5.37 5.04 4.93 4.89 4.88 5.42 5.04 4.93 4.88 4.87

0.50 5.44 5.05 4.92 4.87 4.85 5.58 5.06 4.89 4.83 4.81 5.65 5.07 4.88 4.81 4.79 5.74 5.07 4.87 4.78 4.76 5.83 5.08 4.85 4.76 4.74

0.75 5.64 5.07 4.88 4.80 4.78 5.86 5.09 4.84 4.74 4.71 5.97 5.10 4.82 4.71 4.68 6.11 5.11 4.80 4.68 4.64 6.25 5.12 4.78 4.64 4.60

1.00 5.83 5.09 4.84 4.73 4.70 6.13 5.12 4.79 4.65 4.62 6.28 5.13 4.76 4.61 4.57 6.47 5.15 4.73 4.57 4.52 6.66 5.17 4.71 4.52 4.47

From p. 116 of "The effect of nonnormality on some multivariate tests and robustness to nonnormality in the linear model," by K. V. Mardia, Biometrika, 58(1971). Reproduced with the kind permission of the author and the editor.

with critical region:

T~ > Td(p,fr))
of approximate level of significance o~. T~(p,fr ) is the upper 100a% point of the central Hotelling's T2-distribution with dimensionality p and degrees of freedom fr, where f r is given by 1 = 1 y'V-lVlV-ly fT 171 T2

1 + ~2

y'V IVzV-T2

ly):

(4.28)

234

P. K. lto

with y = x l - x 2 ; V t = S t / N t , t = I , 2 ; V = V ~ + V 2. This approach is to be compared with James' asymptotic series approach, where the same test statistic Tf of (4.27) is used. This statistic is a special case of (2.17) when k = 2. James [21] proposed to use for testing Ho(p, 2) under violation of the assumption of equality of var-covar matrices a critical region of approxi-mate significance level a:

{ v:: v[ > hi(v,, v:;.)},


where

(4.29)

P( v: > hi(v,, v:;-)1Ho(p, 2)) = . + O(N,-J-').

It was found that

ho(V,, v~; ,~) = x~(p)


and

(4.30)

hl(VpV2; a) = X.~(p) 1+ ~ p
with

{ l(k, k2d(p))}
+ p(p+~) ,

(4.31)

k,= E (trV- 1V,) 2 /n,,


t=l 2

k2 = x~, { ( t r V _ W t ) 2 + 2 t r V _ l V t V _ l V t } / n t .
t=l

Comparative Monte Carlo studies for according to the approximate degrees to James' asymptotic series approach sets of (E~, E2), which suggest a slight

the significance levels of the T2-test of freedom approach and according were carried out for some selected superiority for the former.

References
[1] Anderson, T. W. (1958). An Introduction to Multivariate StatisticalAnalysis. Wiley, New York. [2] Bartlett, M. S. (1935). The effect of non-normality on the t-distribution. Proc. Cambridge Philos. Soc. 31, 223-231. [3] Bartlett, M. S. (1939). A note on tests of significance in multivariate analysis. Proc. Cambridge Philos. Soc. 35, 180-185. [4] Box, G. E. P. (1954). Some theorems on quadratic forms applied in the study of analysis of variance problems. I. Effects of inequality of variance in the one-way classification. Ann. Math. Statist. 25, 290-302. [5] Box, G. E. P. (1954). Some theorems on quadratic forms applied in the study of analysis of variance problems. II. Effects of inequality of variance and of correlation between errors in the two-way classification. Ann. Math. Statist. 25, 484--498.

Robustness of A N O V A and M A N O V A test procedures

235

[6] Box, G. E. P. and Andersen, S. L (1955). Permutation theory in the derivation of robust criteria and the study of departures from assumption. J. Roy. Statist. Soc. B 17, 1-34. [7] Box, G. E. P. and Watson, G. S. (1962). Robustness to non-normality of regression tests. Biometrika 49, 93-106. [8] Chase, G. R. and Bulgren, W. G. (1971). A Monte Carlo investigation of the robustness of T 2. J. Amer. Statist. Assoc. 66, 499-502. [9] David, F. N. and Johnson, N. L. (1951). The effect of non-normality on the power function of the F-test in the analysis of variance. Biometrika 38, 43-57. [10] Donaldson, T. S. (1968). Robustness of the F-test to errors of both kinds and the correlation between the numerator and denominator of the F-ratio. J. Amer. Statist. Assoc. 63, 600-676. [11] Gayen, A. K. (1949). The distribution of 'Student' t in random samples of any size drawn from non-normal universes. Biometrika 36, 353-369. [12] Gayen, A. K. (1950). The distribution of the variance ratio in random samples of any size drawn from non-normal universes. Biometrika 37, 236-255. [13] Geary, R. C. (1936). The distribution of 'Student's' ratio from non-normal samples. ,L Roy. Statist. Soc. Suppl. 3, 178-184. [14] Geary, R. C. (1947). Testing for normality. Biometrika 34, 209-242. [15] Holloway, L. N. and Dunn, O. J. (1967). The robustness of HotellJng's T 2. J. Amer. Statist. Assoc. 62, 126-136. [16] Hopkins, J. W. and Clay, P. P. F. (1963). Some empirical distributions of bivariate T 2 and homoscedasticity criterion M under unequal variance and leptokurtosis. J. Amer. Statist. Assoc. 58, 1048-1053. [17] Horsnell, G. (1953). The effect of unequal group variances on the F-test for the homogeneity of group means. Biometrika 40, 128-136. [18] Hotelling, H. (1951). A generalized T-test and measure of multivariate dispersion. In: J. Neyman ed., Proc. Second Berkeley Syrup. Math. Statist. Prob., 23-41. University of California Press, Berkeley CA. [19] Ito, K. (1969). On the effect of heteroscedasticity and nonnormality upon some multivariate test procedures. Multivariate Analysis, Vol. 2, 87-120. Academic Press, New York. [20] Ito, K. and Schull, W. J. (1964). On the robustness of the 7"02 test in multivariate analysis of variance when variance-covariance matrices are not equal. Biometrika 51, 71-82. [21] James, G. S. (1954). Tests of linear hypotheses in univariate and multivariate analysis when the ratios of the population variances are unknown. Biometrika 41, 19-43. [22] Kendall, M. G. and Stuart, A. (1963). The Advanced Theory of Statistics, Vol. 1. Griffin, London. [23] Kendall, M. G. and Stuart, A. (1966). The Advanced Theory of Statistics, Vol. 3 (Chapter 37). Griffin, London. [24] Korin, B. P. (1972). Some comments on the homoscedasticity criterion M and the multivariate analysis of variance tests T 2, W and R. Biometrika 59, 215-216. [25] Krishnaiah, P. R. (1965). Multiple comparison tests in multi-response experiments. Sankhya A 27, 31-36. [26] Krishnaiah, P. R. and Waikar, V. B. (t971). Simultaneous tests for equality of latent roots against certain alternatives-I. Ann. Inst. Statist. Math. 23, 451-468. [27] Krishnaiah, P. R. and Waikar, V. B. (1972). Simultaneous tests for equality of latent roots against certain alternatives-II. Ann. Inst. Statist. Math. 24, 81-85. [28] Lawley, D. N. (1938). A generalization of Fisher's z-test. Biornetrika 30, 180-187; correction, ibid. 467-469.

236

P. K. I t ,

[29] Mardia, K. V. (1971). The effect of norm.finality on some multivariate tests and robustness to nonnormality in the linear model. Biometrika 58, 105-12L [30] Nanda, D. N. (1950). Distribution of the sum of roots of a determinantal equation under a certain condition. Ann. Math. Statist. 21, 432-439. [31] . I s , n , C. L. (1974). Comparative robustness of six tests in multivariate analysis of variance. J. Amer. Statist. Assoc. 69, 894-9080 [32] Pearson, E. S. (1931). The analysis of variance in cases of non-normal variation. Biometrika 23, 114-133. [33] Pillai, K. C. S. (1955). Some new test criteria in multivariate analysis. Ann. Math. Statist. 26, 117-121. [34] Plackett, R. L. (1960). Principles of Regression Analysis (Chapter 5). Oxford University Press, London, New York. [35] Roy, J. (1958). Step-down procedure in multivariate analysis. Ann. Math. Statist. 29, 1177-1187. [36] Roy, S. N. (1957). Some Aspects of Multivariate Analysis, Wiley, New York. [37] Roy, S. N., Gnanadesikan, R. and Srivastava, J. N. (1971). Analysis and Design of Certain Quantitative Multiresponse Experiments. Pergamon Press, Oxford. [38] Scheff~, H. (1959). The Analysis of Variance. Wiley, New York. [39] Srivastava, A. B. L. (1959). Effects of non-normality on the power of the analysis of variance test. Biometrika 46, 114-122. [40] Tiku, M. L. (1964). Approximating the general non-normal variance-ratio sampling distributions. Biometrika 51, 83-95. [41] Tiku, M. L. (1971). Power function of the F-test under non-normal situations. J. Amer. Statist. Assoc. 66, 913-916. [42] Welch, B. L. (1936). Specification of rules for rejecting too variable a product, with particular reference to an electric lamp problem. J. Roy. Statist. S,c. Suppl. 3, 29-48. [43] Welch, B. L. (1947). The generalization of Student's problem when several populations are involved. Biometrika 34, 28-35. [44] Wilks, S. S. (1932). Certain generalizations in the analysis of variance. Biometrika 24, 471-494. [45] Yao, Y. (1965). An Approximate degrees of freedom solution to the multivariate Behrens-Fisher problem. Biometrika 52, 139-147.

P. R. Krishnaiah, ed., Handbook of Statistics, VoL 1 North-Holland Publishing Company (1980) 237-278 f

Analysis of Variance and Problems under Time Series Models


D a v i d R. Brillinger j

Dedicated to the memory of Henry Seheffd

"...what may be considered the major purposes of the analysis of variance: to provide a simple summary of the variation in the experimental data, and to indicate the stabihty of means and other meaningful quantities extracted from the data." Green and Tukey (1960)

1.

Introductory remarks

The concern of this work is the analysis of the variation present in data having the character that the basic responses measured are time series. Such measurements are occasionally referred to as; longitudinal data, serial data, repeated measurements, sequential observations and response curve data. A common wish is to use such data to measure and understand the change, if any, produced by an alteration of conditions (perhaps induced by the experimenter, perhaps outside of his control). On occasion the dependence on time may be an unavoidable nuisance and the wish is to adjust somehow for its effect. Serious problems can arise if conventional analyses, ignoring dependence on time, are employed when real dependence on time is present. The topics covered in the paper include; growth curves, survival curves, field experiments, several response series, a single response series, finite dimensional parametric model and the nonstationary case.
1This research was supported in part by National Science Foundation Grant MCS76-06117 and MCS76-10238. 237

238

David R. Brillinger

Notation that will be made use of in the paper includes the following. Matrices and vectors will be denoted by boldface characters. I will denote the identity matrix. 1 will denote the row vector of all l's. An index with further structure will be written in brackets, n =(/m), t = (rs). If U is a matrix-valued r a n d o m variable, its corresponding matrix of means will be denoted ~v. If U and V are matrix-valued r a n d o m variables and "vec" denotes the operator laying a matrix out as a column, then E [ v e c ( U ttv)][vec(V-lxv)]'=~vv, where " r " denotes transposition. The Kronecker product of the matrices U,V will be denoted U Q V . The analysis of variance is c o m m o n l y approached through a linear regression model of the following form (see for example R o y et al. (1971)), Y = XlOX2 + E, (1.1)

with: (i) Y an N X T matrix of observed responses whose N rows correspond to individual realizations of the experiment, the entries of the nth row being the T responses measured for that realization, (ii) X 1 an N M design matrix across individuals, (iii) X 2 an S T design matrix within individuals, (iv) O an M S matrix of unobserved coefficients and (v) E an N T matrix of stochastic errors. The design matrix X 1 is generally m a d e up of O's and l's. (The analysis of covariance is an exception.) The entries of X l, X 2 are generally taken as fixed but on occasion as stochastic. The entries of 0 are sometimes taken as fixed (fixed effects model) and sometimes as stochastic (random effects model). The concern is often to check whether certain entries of 0 or related parameters m a y be 0 and often to estimate parameters of interest and to attach measures of error to the estimates. Simple particular cases of the model (1.1), (i)-(iii), include; (1) one-way A N O V A specified by
Y{#,O = Or,, + e(t,,,),

(1.2)

l = 1 . . . . . Lm; m = l , . . . , M ; with N = Lx + " " + L M ; T = I; E=[e(lm)], E G = 0, ~ , = o2I and in the fixed effects case O=[0m] constant, while in the r a n d o m effects case E O = / , 0 1 , li~00= o21, N0~=0. (2) one-way M A N O V A specified by Y~tm), = Om, + e(l,,)t,

(1.3)

/ = 1 . . . . . Lm; m = l ..... M; t = l ..... T; with N = L~ + " " + L m ; E=[e(lm)t] , E E = 0, X** = %~ I and in the fixed effect case O = [Omt] constant, while in the r a n d o m effects case E O = # 0 l , X o o = % o I .

Analysis of variance and problems under time series models

239

Other particular cases of the model (1.1), (i)-(iii), come up in practice. This paper is concerned in part with such cases wherein the T columns of Y m a y be thought of as corresponding to measurements at successive times (or places.) The results of the paper are most complete in the case that the basic model is linear in lagged values of an observable series X t with stationary additive errors. The basic approach is via the Fourier transform of the given data.

2.

Growth curves

One practically important subject field in which the basic responses of an experiment are time functions, is that of classical growth curves (see Wishart (1938), Box (1950), R a o (1958) and Grizzle and Allen (1969) for example). The usual set-up here involves individuals under study with observations of size m a d e at a succession of times for each individual. The individuals differ in a variety of characteristics, including treatment applied. It is desired to understand the variation in size as a function of time and other characteristics. In the standard case, observations are made at the same times for each of the individuals. Measurements are correlated for the same individual, but uncorrelated for different individuals, generally. If Y,t denotes the size of individual n at time t, then (see Box (1950)), it is helpful to graph Ynt against t for each n. Wishart (1938) recommended that a general model, for example a quadratic, be fitted to each curve separately producing a set of estimates for each n and that these estimates then be analyzed as if they were the responses of the relevant experimental design. The model here might be written Ynt = am + flint + 7rot2+ ent, (2.1)

if individual n belongs to group (or block) m and if observations are made at times t = 1,2 ..... T. One might be interested in whether the tim and 7m are constant, for example. The function E Ynt = am + tim t + 7,, t2 is called the growth curve of the nth individual (associated with group m). The model is a particular case of (1.1) and an extension of (1.2), with X~ containing L 1 rows (1,0 ..... 0), L 2 rows (0,1,0 ..... 0)... and L M rows (0,0 ..... 1). The ruth row of O is ( % , f l m , Ym)" The tth column of X 2 is (1 tt2). In the analysis it is usually assumed that the units are uncorrelated with ~ 2 = %~I. Covariates might be included in the model.

240

David

R. Brillinger

A key assumption, usually required, is that the measurements are made at the same times for each of the individuals. This allows the use of orthogonal polynomials, in t, a n d the further assumption that the series ent, t = I, 2 .... have the same special form covariance matrix for all n a n d this m a y be estimated from the data (e.g. Grizzle and Allen (1969)). The model is typically fitted by weighted least squares or M A N O C O V A . Box (1950) suggested a series of A N O V A ' s performed on the successive differences

L 2 - L~, L 3 - rn2.....
Rao (1958) suggests carrying out a principal component analysis of the multivariate observations (Ynt, t = 1,2 ..... T ) , n = 1. . . . . N a n d testing for differences between the dominant components. This m a y be viewed as contemplating a model
Ynt

=/1 + O/(1)/~(l)n r't +

-L ~ ( K ) f 4 ( K ) " " " T t~ n h't

+ gnt~

and carrying out M A N O V A on the data (&~l)..... &~K~). Church (1966) and Snee (1972) also propose procedures based on eigenvectors. The assumption of the same n u m b e r of observations for each individual is relaxed in Srivastava and M c D o n a l d (1971). Fearn (1971) presents a Bayesian approach and indicates that it can cope with observations for different individuals taken at different times. Chapter 9 in Daniel and W o o d (1971) presents an example where nonlinear regression is used to fit nonlinear functions in t to each of a number of separate realizations. Tests for certain hypotheses of interest are indicated.

3.

Field experiments

In agricultural experiments, treatments are commonly applied to plots within the same fields and to these fields over a number of years. This practice can easily lead to correlation between observations on neighboring plots due to fertility gradients and between successive observations on the same plot due to residual effects, for example. These difficulties are usually overcome by carrying out randomized experiments, however sometimes an analysis taking account of correlation is necessary to obtain higher precision or because the study was not randomized. In contrast to the situation of the previous section where the dependence on "time" was the essence of the situation, here the dependence has the role of a nuisance.

Analysis of variance and problems under time series models


A m o d e l for a single occasion experiment might be
Y(rs) ~" E X(rs)mOm'~ E(rs), m

241

(3.1)

r = 1..... R; s = 1. . . . . S where X(~s)m = 1 if treatment m is applied to the plot in row r, c o l u m n s of the field a n d X(m m = 0 otherwise a n d where %s), r = 0 , _ 1,..., s = 0, +__1.... is a stationary spatial series. I n the case of a r a n d o m i z e d experiment, the X would be stochastic. F o r an experiment repeated a n u m b e r of times, T, o n the same N (uncorrelated) plots a model might be

Y,t = E X , mtOm + e,,,


gn

(3.2)

n = 1..... N; t = 1. . . . . T where Xnm t = 1 if treatment m is applied to plot n on occasion t and Xnmt----0 otherwise a n d where the {ent, t - - 1 , 2 . . . . . T} are uncorrelated time series. I n the standard approaches, the e of (3.1) a n d (3.2) are taken to be uncorrelated zero mean, constant variance deviates. As suggested above, it is sometimes necessary to h a n d l e the effects of time a n d space. O n e a p p r o a c h is to assume that Ee(m (or Eent ) are s m o o t h or periodically varying functions of (r,s) (or t) a n d to model these systematic c o m p o n e n t s by low order polynomials (perhaps periodic.) See for example Cox (1951, 1952) a n d Kiefer (1961). O n e can proceed here b y A N O C O V , see Cox (1962). O n other occasions workers have proceeded by assuming that the e series m a y be modelled by low order autoregressive or moving average schemes (see Cox (1952, Williams (1952) a n d Tiao and T a n (1965)). I n Larsen (1969) a n d L j u n g a n d Box (1976) the e(~s) are such that cov{e(,,),e(/,,)}=0,

rv~r ",

but stationarily correlated for r = r'. In the case that the covariance structure of the e is known, up to a multiplier o 2, inference poses no new problems. It is a particular case of Aitken's generalized least squares. I n the case of the references above, low order autoregressive models are set d o w n for the e a n d estimation is carried out by m a x i m u m likelihood or generalized least squares with estimated covariance matrix. A n alternate procedure that has been e m p l o y e d is the use of the residuals of adjacent plots as a c o n c o m i t a n t variable (Papadakis (1937), Barlett (1938) and A t k i n s o n (1969)).

242

David R. Brillinger

An important distinction concerning experiments as modelled by (3.2) involves whether or not a given experimental unit receives the same treatment on each occasion or whether the treatment applied varies. The first type experiment is really one of growth curves, as considered in the previous section. The references given in this section relate to the second type experiment. Systematic designs, change-over designs, cyclic rotation designs have all been developed to handle various of the difficulties indicated in this section. See, for example, Cox (1951, 1952), Williams (1952) and Kiefer (1961). Duby et al. (1977) consider various experimental designs in a field experiment when the errors form a stationary spatial process. The efficiencies of different designs are compared.

4.

Responses that are covariance stationary time series

This section considers situations wherein repetitions of the basic experiment lead to independent covariance stationary time series. A succession of models of increasing complexity are investigated, in much the same manner that books on analysis of variance proceed. The basic treatments may have factorial structure. The cases of both fixed and random effects are considered. The procedures developed differ from those of the usual analysis of variance in that the statistics developed are functions of frequency, )~. One proceeds from models set down in the time domain to their implications for frequency domain statistics. We begin by defining and indicating approximate sampling properties of certain such statistics for time series that are stationary and whose well-separated values are only weakly dependent. This last property is referred to as mixing. Let X(t), t = 0, _ l,... denote a stationary time series such that Z""
UI

Z Icum{X(t + u,) ..... X ( t + Uk),X(t)}] <


Uk

k = 1,2,..., t = 0 , + 1..... EX(t) = cx, cov{X(t + u),X( t) } = Cxx(U ),

(4.1) (4.2) (4.3)

fxxQO=(2~r)-l~;~Cxx(U)exp{-iuh},
U

- ~ < 2 t < ce.

(4.4)

The parameters (4.2)-(4.4) are referred to as the mean, autocovariance

Analysis of variance and problems' under time series models

243

function and power spectrum respectively. Condition (4.1) is a form of mixing condition. Let us denote the discrete Fourier transform of a stretch of such a series by
T-1

X ^ ( k ) = ( 2 ~ T ) -~/2 ~
t=0

X(t)exp{-i2~rkt/T},

(4.5)

k = 0, 1..... T - 1 where T is the number of observations in the stretch. A fair amount is known about the sampling properties of these transforms in the case that the series X(.) is stationary and mixing (see for example Hannan (1970) and Brillinger (1974, 1975)). In particular for K distinct X(k) with 2~rk/ T - ?~, and 2~rk/ TveO, +__ ~r.....

X ^ ( k ) = ~k + a.s.(1)

(4.6)

as T---~~ , where gl ..... gK are independent N c(O,fxx(h)) variates and Oa.s.(1 ) denotes a variate tending to 0 almost surely. T h e complex variate normal, N c, is defined in the Appendix. The representation (4.6) follows from the usual convergence in distribution result b y a theorem of Skorokhod (1956), see Brillinger (1973). It follows, in particular, from (4.6) that

f~'x(X)=K-'~ IX^(k)12=fxx(X)xZK/2K+o,.~.(1),
k

(4.7)

where X2K denotes a chi-squared variate on 2 K degrees of freedom. Expression (4.7) here indicates a standard estimate of the power spectrum and the estimate's approximate sampling distribution. In practice a key issue is the choice of K. One wants it largeish in order that the estimate have the stability indicated by (4.7), however if fxx(h) changes too much over the domain of frequencies 2~rk/T, k = 1..... K substantial bias can occur for the estimate. Extensive discussion is given on this point in the texts Hannan (1970) and Brillinger (1975). In practice it turns out to be essential to prewhiten the data before estimating a spectral parameter on many occasions, that is to make the spectrum approximately constant. This remark should be kept in mind before proceeding to the calculation of the various statistics indicated in Section 4. The formulas for the case of a vector-valued series are direct analogs of (4.5) and (4.7). For example, the cross-spectrum

frx(h) = (2~r) -~ Z cov{ Y(u),X(O)} exp{ -iu?~}


U

244

David R. Brillinger

m a y be estimated by the statistic

J~x(x) = K - ' E
k

r^(k) X ^ ( k ) .

4.1.

Case of a common signal

Consider a situation in which stretches of series Y,(-) are observed and it is of interest whether the series contain a c o m m o n "signal". T h a t is the data

Yn(t),

t = 0 , 1 ..... T - l ,

n = l ..... N,

is available and the following form of model seems reasonable.

Model 4.1.1. Deterministic signal


For n = 1..... N; t - 0 , + 1. . . . .

to(t) = . . + s(t) + ~.(t),

(4.8)

where (i) the p~ are constants, (ii) the S(t) are constants and (iii) the en(.), n = 1..... N are independent realizations of a stationary time series with m e a n 0, covariance function % ( u ) and power spectrumf~(~). It follows from the assumptions here that the response corresponding to the nth series has expected value
E r . ( t ) = ~. + S ( t )

and

cov( r . ( t + u), L ( t ) } = cAu).


In a variety of situations, it will be of interest whether or not the signal S(t) is identically 0, i.e. whether or not there is some component in c o m m o n to the series. It will also be of interest to estimate characteristics of the signal and of the error series e(.). Making use of the definition (4.5) and the representation (4.6), the model (4.8) leads to

Y ~ ( k ) = S ^(k) + ~nk + ,.s.(1).

(4.9)

(The /~, term drops out because kva0.) This expression is seen to take, approximately, the form of the one-way classification model with fixed

Analysis of variance and problems under time series models

245

effects. The traditional results for that model suggest computing the statistics
N

Y~.(k)-~N-' ~ Y,~(k),
n=l

(4.10)

IY;(A)t 2,
k N

(4.11)
Y;(k)[ 2.
(4.12)

E
n=l

E IY(k) k

Using the representation (4.9) and the complex extension of t h e Fisher-Cochran Theorem given in the Appendix, the latter two sums of squares have the respective forms ~k [S ^(k) + ff+k[2+ Oa.s.(1) =f~(X)X~K(NZ

IS ^(k)I2/f~(X))/2N+Oa.s.(1),
(4.13)

E E [~nk--~+kl2-1-Oa.s.(1)=f~(X)X2K(N--l)/2-l-Oa.s.(1),
1 k

(4.14)

with the t w o X 2 statistically independent. The first X 2 is non-central. The expected values of the dominant terms in (4.13) and (4.14) are

Kf~(X)/N + ~ [S ^(k)[ 2 = Kf~(X)/N + Kf~s(h),


k

and K(N-l)f~(X), respectively. The above results are conveniently collected together in an Analysis of Power Table, see Table 4.1. One of these may be computed for each frequency, h, of interest. Often it will be convenient to graph the F-statistic as a function of X. Examples are given in Shumway (1970) and Brillinger (1973). The null distribution of the F-statistic, (when S(.)=0), is asymptotically F2K;2K(N_ 1)" It is also useful to graph the mean square statistics of Table 4.1 and their difference, which may be viewed as an estimate of Nf[s(X ). The logarithms of the mean square statistics will have approximately constant sampling fluctuations, as functions of X, and it may be useful to graph them. If one proceeded by analogy with what is done in multivariate analysis, see for example Roy et al. (1971), and wished a formal test statistic of the

246
Table 4.1 Sum of squares

David R. Brillinger

Degrees of freedom

Source
Mean
Residual Error Total

(SS)
NZ[ Y+(k)[ 2
N Z ~ I , ~ ( k ) - Y +(k)l 2 1 N Y [ Y,[(k)[ 2 1

(dr) 2K
2 K ( N - 1) 2KN

Source Mean Residual Error Total


N l

Mean square = 2 SS/df (MS)

F-statistic (*) (**) (*)/(**)

"Expected" mean square (EMS)

Nff+r+(X)
X f~.~ r+, r. - r+(A)/(N - 1)

f~Qt)+Ufffs(h )
f~(h)

hypothesis S(.)-~0, one might set down an overall test statistics such as
Q/2 Q/2

sup
q=l

F(2~rq/Q),

Z F(27rq/Q),
q=l

Q/2 I~ [ I +(N-1)F(2Trq/ Q) ],
q=l

for some Q, where F(X) denotes the F-statistic of Table 4.1. It seems however that the use of such test statistics provides too brutal a summary of the data collected. The statistic Y+(k) may be viewed as an estimate of the Fourier component S'(k). As such it has expected value S ^(k) and approximate variance f~(2~rk/T). This variance m a y be estimated by the residual error mean square of Table 4.1. The signal S(t) itself may be estimated, relatively, by Y + ( t ) = Z Yn(t)/N. T h i s latter statistic has expected value #+ + S(t) and variance G~(O)/N=ff~(a)da/N. An estimate of this variance may be based on the sum of the residual error sums of squares of Tables 4.1 covering the whole frequency domain.

Model 4.1.2. Stochastic signal


The assumptions are as in Model 4.1.1 except that (ii) is replaced by: (ii)' S(.) is a realization of a time series independent of the series en(. ) with

Analysis of variance and problems under time series mode&

247

E S ( t ) = O , cov{S(t + u ) , S ( t ) ) = Css(U), t,u=O, +_ 1. . . . . It follows from this assumption that E Y~(t) = I~n,

coy{ r,(t+ ,), r,(0} = Css(U) + q,(u),


fy.y.(x) =fss(X) +L(x),
cov{ Y , ( t + u), r,,(t) } = Css(U), Yr, r,,(X) =/ss(X). The coherence between pairs of the series is seen to be fss(h)/[fss(X)+ f,~(h)]. In a variety of situations it is of interest to ask whether Css(. ) or fss(') are identically 0. In the case of Gaussian series this corresponds to asking whether the series Y,(.) are mutually independent. Under the conditions of Model 4.1.2, the representation (4.9) continues to hold, however it may be carried one step further to
Yn(k) = Yk + fnk + O,.s.(1)

with the Yk independent Nc(O,fss(h)) variates, independent of the NC(0,f~(h)) ~'s. The sums of squares (4.11) and (4,12) now have the representations

~, I.~k+g+kl2+O~.s.(1)=(fss(X)+f~(X)/N)x2 /2+O,.s.(1), (4.15) Z I~,,k-~+,J2+.A1)=L(~)x~,,:~-,F2+as(1), (4.16)


with the X 2 o n c e again independent. (This too follows from the complex version of the Fisher-Cochran Theorem given in the Appendix.) The hypothesis S ( . ) - - 0 may be written fss(.)--0. From (4.15) and (4.16) it is clear that the hypothesis may be examined by the F-statistic of Table 4.1. From expressions (4.15), (4.16) the statistic has the representation

[ L(~) + Nfss(~) ]
f,~(X)

F=K; ~K<~- ,) + Oa.s.(I),

in this case. The remarks made about combining F-statistics in the deterministic signal case clearly apply to this case as well. An estimate of the power spectrum fss(?0, at frequency X, may be based on the entries of Table 4.1. The estimate is
X' f T ~,+r+(X)-N -I ~ r.-r+.r.-r+ ( a) .
1

248

David R. Brillinger

It will be distributed asymptotically as a linear combination of chi-squared variates following the representations (4.15), (4.16). 4. 2. One-way classification that an experiment is carried out in which M different treat* applied a number of times to some experimental units and stationary time series are then recorded for the units. A model be appropriate for this situation is described by,

Suppose merits are covariance that might

Model 4.2.1. Deterministic case For l = 1,...,Lm; m = 1. . . . . M ; N = Lj-~ . . . .

+ LM; t=0,_+ 1..... (4.18)

Elm(t) = ~lm +' S ( t ) -1- Rm(t ) -~-Elm(t),

where (i) the #Ira are constants, (ii) the S(t) are constants, (iii) the Rm(t ) are constants standardized by Y~mLmRm(t)=O and (iv) the elm(") are indepem dent realizations of a stationary series with m e a n 0, covariance function %(u) and power spectrum f~,(X). For this model

E Elm(t ) = ~lm d~ S(t) ~- gm(t), cov{ Ylm(t + U), rim(t)) = c . ( . ) , COV{ Elm(t+ hi), Yl,m,(U)} =0, (l',m')~(l,m).

The series Rm(" ) m a y be viewed as representing the effect of the mth treatment. A hypothesis of interest in a variety of situations is R m (t) = 0 The series manner of In terms write for k for all m, t. (4.19)

S(.) represents a "signal" c o m m o n to all the series in the Section 4.1. It m a y also be of interest to ask if it is identically 0. of the discrete Fourier transform introduced earlier, one m a y an integer with 2 ~ k / K - ~ , but vs0

r,g(k) = S "(k) ~ R ; ( k ) + E,m(k)

= S ^(/) -1- Rn~ (k) -}- ~'lmk-t- Oa.s.(l)~

(4.20)

l = 1 . . . . . L m; m = l . . . . . M; k = l . . . . , K with the ~'s independent NC(0,f~,(~)). The expression (4.20) is seen to take, approximately, the form of the model of a two-way hierarchical classification with fixed effects. To

Analysis of variance and problems under time series models

249

proceed with an analysis, one is led to compute

Y+,,,(k) = L~-~ ~ Ytm(k),


1

Y ++(k)= N - ' ~ ~ Ylm(k)= N - l ~ L,,,Y +,,,(k),


m 1 m

and the sums of squares

N~] [Y+ +(k)l 2,


k

(4;21)
(4.22)

Z ~ Lm[ Y+m(k) - Y+ +(k)] 2,


k m

Y, ~ [Y,m(k)- Yg~m(k)]2.
k m l

(4.23)

Using the representation (4.20) and the complex Fisher-Cochran Theorem expressions (4.21)-(4.23) are seen to have the representations N ~ TM, IS A(k) + ~'+ +k12+ Oa.s.(1)=
k

~ LmlRm(k)+~m+-~'++12+..~.(l) =
m

E E Z ];link-- ;+mkl2"kOa.s.(1)~-fee(X)X2K(N k m I

m)/2Oa.s.(1)"

Once again the results are conveniently collected in an ANOPOW Table. This is given as Table 4.2. This table is to be computed for a variety of frequencies X. The hypothesis (4.19) may be examined by noting that the null distribution of the F-statistic (**)/(***) of Table 4.2 is asymptotically F2K(M-1);2K(N--M)" It is helpful to graph this statistic as a function of ~ and to further indicate the null significance level on the graph. (See Brillinger (1973).) In the case that S ( . ) = 0 , the asymptotic distribution of the statistic (*)/(***) of Table 4.2 is F2r;2K(N_M) for each X.

250 Table 4.2 Source Mean Treatments Residual error Total Source Mean Treatments Residual error Total Source Mean Treatments Residual error Total Sum of squares

David R. Brillinger

Degrees of freedom 2K 2 K ( M - 1)

N~IIY~. + (k)l2 ZY.~Lm] Y+m(k)- Y+ +(k)l 2

Z~S~lz"[Yl~(k ) - Y+m(k)i 2 2 K ( N - M ) ZY~Zl~l Yt,,( k)l2 2KN


Mean square = 2 SS/df

Nf~'. +r+ +(~) (*) (M--I)-Iy'MLmfT+.-- Y++,Y+m- Y++(h) (**) (N-M)-tYqMYqli~f~_r,,rl_r+.Qt)


(***)

F-statistic "Expected" mean square (.)/(***) f~(h) + Nf~R(X) cr (**)/(***) f~(X)+(M- D j - lV,~L ~L ,nJS~S~ f~

T h e F o u r i e r c o m p o n e n t Rm(k), of the t r e a t m e n t effect, m a y b e estim a t e d b y Y 2 m ( k ) - Y~_ +(k). T h e a s y m p t o t i c v a r i a n c e of this e s t i m a t e is


1)L.T

A n estimate of this m a y b e b a s e d o n the residual error m e a n s q u a r e of T a b l e 4.2. T h e restriction 2 , , L m R m ( t ) = 0 is the " u s u a l " constraint. I n o t h e r situations one might c h o o s e to r e p l a c e this b y s o m e t h i n g m o r e r e l e v a n t to the c o n t e x t at h a n d as is d o n e in o r d i n a r y A N O V A , see Scheff6 (1959) a n d Searle (1971). O n o c c a s i o n a stochastic effects m o d e l is m o r e r e l e v a n t to the s i t u a t i o n at h a n d . W e n o w set down,

Model 4.2.2. The Stochastic Case T h e a s s u m p t i o n s are as in M o d e l 4.2.1 e x c e p t that (ii) is r e p l a c e d b y ( i i ' ) ' S ( - ) is a r e a l i z a t i o n of a t i m e series with m e a n 0, c o v a r i a n c e f u n c t i o n Css(U ), p o w e r s p e c t r u m fss(~), (iii) is r e p l a c e d b y (iii)' the Rm(" ) are realizations of a time series with m e a n 0, c o v a r i a n c e f u n c t i o n CRR(U),

Analysis of variance and problems under time series models

251

power spectrum fRR(X) and (v) all of the series S(.), Rm(') and era(") are statistically independent. Under the assumptions of this model
E cov{ =

r,m(t + .), r,m(O} = Css(U) + + .(.), cov{ Ytm( t + u), Ytm,( t) } ----Css(U) + ORR(U), meem ', cov{ Ytm(t + u), Yrm,( t) } = Css(U), (l,m)=/=(l',m').
Similar expressions may be set down for the power and cross spectra. Now, the coherence corresponding to two units receiving the same treatment is

[/ss(X) +yR.(X) ] / [ fss(A) +Jp.R(~)+f~(X) ].


The coherence corresponding to units receiving different treatments is fRR(X)/[ fss(X) +fRR0 t) + f,~(X) ]. In the case that fss(X)-0, the coherence between two units receiving the same treatment is not different from that between any pairs of units. In the case that fRR(X),fss()t)= 0, the units all have 0 coherence. In this stochastic case, the representation (4.20) goes over to

V,~(k) = yk+ ,,,k+ ~,mk+ O,.s.(1)


with all the ,fs, p's, ~'s mutually independent with the "~'s NC(0,fss09), with the 1,'s NC(0,fRR(X)) and with the f's NC(0,f~(X)). One sees that one can proceed here by computing the statistics of the ANOPOW Table 4.2; however the distributional results are severely affected in the non-null cases by any lack of balance, that is inequality among the L m. The residual sum of squares will still be represented as ~] ~ ~2 ]~lmk-- ~+m~]2 + Oa.s.(1)
k m l

==f~(X)X~t~(U-M)/2+Oa.s.(1),
(4.24)

and expressions (4.21) and (4.22) the representations N ~ , 17k + V+k + ~+ +kl 2 + Oa.s.(1)'
k

~ LmlP~k--U+k+t+mk--t++kl2+O~.~.(1),
k m

252

David R. Brillinger

and the sums of squares are seen to be asymptotically independent. In the balanced case of Lm = L, m = 1..... M these may further be written as [ f~(h) + L~(h) + LMfvv( ) ]X~K/2 + oa.~(1), [ f~(h) + M,(X) ] X~K(M- ,) + Oa.~.(1)" (4.25) (4.26)

The final column of Table 4.2 becomes the column given in Table 4.3. "Expected" mean squares may be computed for the unbalanced case by proceeding in an analagous manner to that of the usual ANOVA case. See for example Scheff6 (1959) or Searle (1971). The chi-squares of (4.24)-(4.26) will all be statistically independent. The hypotheses of fgR(h) or fss(X)= 0 may be examined by the F-statistics of Table 4.2. Estimates of the power spectra fRR(h),Fss0t) may be constructed by subtracting the mean square statistics of Table 4.2 taking note of the "expected" mean squares of Table 4.3.
Table 4.3 "Expected" mean square fee(X) +

Lf~,z, Qt) + LMfw(~,) fe~(X)+ Mf~()~)

f.(x) At a minimal cost one can include transient signals in the models considered in this and the previous section. By a transient is meant a deterministic signal, Q(t), satisfying a finiteness condition like
oo

~2 Itl [Q(t)[ <


0

~.

(4.27)

Suppose the model (4.18) is replaced by

Ylm( t) = ~lm df Qlm( t) "t- S( I) %"Rm( t ) -I- elm(t),


where the Q#,(.) are transients satisfying (4.27). Proceeding to the Fourier transforms here leads to

Ylm(k) = Qlm~,+ S ^(k) + R ( k ) + elm(k),


and the Q~mx may be removed by a covariance type analysis. Further details are given in Brillinger (1973).

.4na~ysis of variance and problems under time series models

253

4.3.

Several general models

It seems sensible to mention several general models that include those discussed earlier as particular cases and yet are interesting in their own right. Specifically, consider to begin,

Model 4.3.1
For t = 0, __ 1.....

y(t)=/i+

~ 01(U)Xl(l- u)-t- ~02(u)X2(l--u)-l-8(t), u u

(4.28)

with Y(t),Xi(t),X2(t) observable, with Y(t),l~,e(t) all N-vectors, with 01(0,02( 0 N X P 1 and N X P 2 respectively, with Xl(t),Xz(t) PI and P2 vectors respectively, with XI(. ) and X2(.) fixed and bounded, with ]~tO~(u)],Y~lOz(u)l < Go, and finally with e(.) a zero mean stationary series satisfying (4.1). One question of interest relative to the model (4.28) is whether or not a given set of data suggests 02(.)___0. (4.29)

This question may be addressed by fitting first the full model (4.28), and then the restricted model

Y( I) =lid- ~ Ol(U)Xl(t - u) -1"-E(t)


u

(4.30)

and then asking whether the model (4.30) has done substantially worse. To this end define

O(u) = [8,(u)O2(u)],
X(t)

O(~,) = ~ e x p ( u

i~u}O(u),
(4.31)

= [ x,(t)

X2(t) ].

Given the data Y(t),Xl(t),XE(t), t = 0 , 1..... T - 1 equation (4.30) leads to Y~(k) = O(A)X ~(k) + ~g + Oa.s.(1), (4.32)

for K distinct, non-0, non-~r values 2~rk/T=~ where the ~'~ are independent complex normal variates with mean 0 and covariance matrix f~(X). Expression (4.32) corresponds directly with the general multivariate linear model. Collecting the Y^(k) and X^(k) into N X K and (PI+PE)K

254

DavM R. Brillinger

matrices

v=[w(k)],
the estimate YX'(XX ~)- 1 is suggested for O0~) for example, assuming the inverse matrix exists. The hypothesis (4.29) may be rewritten as 02(-)--=0 and the standard results for the multivariate linear model (see for example Rao (1965)) suggest setting down a M A N O P O W table for each X. Specifically one can set down Table 4.4. The residual error and reduced model mean squares will be distributed asymptotically as independent complex Wisharts under the null hypothesis and tests of the hypothesis (4.29), for each frequency )t, may be based on functions of these in the manner of the usual M A N O V A (see for example Roy et al. (1971)). In the case that N - - 1, the F-statistic Residual model M S / R e s i d u a l error MS will be distributed asymptotically as F2e~;2(i~_e_e9
Table 4.4 Source Hypothesis Reduced model Full model Residual error Total Sum of squares y ~ - ( X l . ~ ) - ixl ~., (.,)-(,) (,) Degrees of freedom Mean square = 2SS/df 2p 1 2P 2 2P 2(K- P)

in the null case and

Kffx,(A-)f~x,(A)- lf~ v(A-)/p1

FX(XX) - 1XY"
(***) - (**) yV~ (***)

(**)

/ ( K - P)
2K

Source Hypothesis Reducegl model Full model Residual error Total

F-statistic

"Expected" M.S.1

f~.(A-) + Kel(A-) fXrlX,(A-)el(A-) "rip 1 f,(A-)


f~(A-) + KO(A-)fxrxOt)O(X) / p
f,(A-)

1The first two E.M.S. are evaluated assuming

e2(A-)=0.

Analysis of variance and problems under time series models

255

will be stochastically larger under the alternative. The parameter O(X) may be estimated by O(X)=YX'(XX') -1 as indicated above, vecO(X) will be distributed asymptotically as NCx(vec O(X), f~(X)(XX')-l), (see for example Brillinger (1969).) The parameter O(u) may be estimated by building a sample analog of the inversion relationship
2,tr

O(u) =

f exp {iua )O(a)da,


0

in the manner of Chapter 6 in Brillinger (1975). On occasion it is more appropriate to consider the series XI(.),X2(. ) of (4.28) as matrix-valued and to investigate the following model (considered in Shumway (1970)) in some detail.

Model 4.3.2 For t = 0, ___1.....


Y(t)=t~+ ~Xl(t-u)O~(u)+

~X2(t-u)O2(u)+e(t )
U

(4.33)

with Y(t),Xl(t),Xz(t ) observable, with Y(t),lL, e(t) N-vectors, with Xl(t),Xz(t ) N P1 and N /}2 respectively, with 01(t),O2(t ) P1- and P2-vectors respectively, with XI(.),X2(- ) fixed satisfying ZIXl(U)l,Y, lXz(u)l < 00. Finally, with e(-) a zero mean stationary series satisfying (4.1). Models 4.1.1 and 4.1.2 are particular cases of (4.33) with P1 =0, P2 = 1, 02(0 = S(t) and with 02(- ) fixed or stochastic respectively. Once again a question of some interest is whether 02(-)_0? This may be examined as follows. The model (4.33) leads to the relationship

Y^ (k) = X l(k)O; (k) + X (k)02 (k) + ~'k + a.s.(1)

(4.34)

in terms of the discrete Fourier transforms, where we consider 2 ~ r k / T - X , but 2qrk/TyrO, ~r, with the Pk independent complex normal variates having mean 0 and covariance matrix f~(~). Rewriting (4.34) as

Y~ (k) = X ^ (k)O ^ (k) + ~ + Oa.s.(1)

(4.35)

and assuming the inverse in (4.36) exists, O^(k) may be sensibly estimated by /~(k) = ( X ~ - ~ ' X ^ ( k ) ) 1X----~'Y^(k). (4.36)

In the case that Oz(.)=-0, the estimate 61 (k) = ( X? (k------~ ~X~ ( k ) ) - l ~ ~Y^ (k)

256 Table 4.5 Source

David R. Brillinger

SS

Hypothesis Xk V^(k) "X,~(k)( X~(k) " X ~ ( k ) ) - l ~ Reduced model (**)- (*) Full model Resid~ml error Total Source

"Y'(k)

(*)
(**)

~-,k Y^(k) ~X^(k)( X~(k)~'X*(k))-I X~(~~(k)


(***)-(**)

~k Y^(k) TY^(k)
df MS

(***) F-statistic

Hypothesis 2KP 1 Reduced model 2KP2 Full model Residual error Total

~.~f~B.(~)/ P1 ~f~A.(Tt)/ P E~f~e.(h)/(N- P)

2KP 2K(N- P)
2KN

m a y be computed for Ol(k ). Suppose K frequencies 2~rk/T near X are considered. Then an A N O P O W table relating to fitting b o t h the general model (4.33) and the restricted model is provided in Table 4.5. In that table it is assumed that
e'(k) = V A( k ) - X A( k ) O " ( k )

corresponds to the residual series, that

A~(k)=X^(k)O^(k)
corresponds to the fitted series under the full model, and that RA(k) - - X ; ( k ) O ? ( k ) corresponds to the fitted model under the hypothesis. Define xQQ = E exp{ - i X u } X ( u ) .
t/

With this notation, Table 4.6 gives the expected mean squares in the case that 0(.) is fixed and that 0 2 ( - ) = 0 for the first and second "expected"

Analysis of variance and problems under time series models

257

mean squares. The hypothesis may be tested, at each frequency, by the F-statistic
Table 4.6 "Expected" mean square f.(~,) + e,- ' tr (f~o,(X) XI-~ "X,(h) ) fe~(x)

f.(x) + ? - ' tr { ~ ( x ) x(x) x(x) }


f~(~)

Reduced model M.S./Residual error M.S. of Table 4.5. In the null case it will be distributed asymptotically as F2KPz; 2K(N-- P)

(4.37)

and in the alternate case it wilt be stochastically larger. In the random effects case where 01(- ) and 02(- ) are independent zero mean stationary series satisfying (4.1) the "expected" mean squares become those of Table 4.7. This is the situation analagous to Models 4.1.2 and 4.2.2. In the unbalanced case, the statistic (4.37) will not have a simple approximating distribution.
Table 4.7 "Expected" mean square

(ii)-(i)

f.(~) + P2-'tr{ fo~o2(X) Xz(h)'[ I- X,Q,)(X,(~--~-)"X,(X))-] X:(X~:]X2(~)


f,,(X)+ P - ~tr{s%(X)x(X) "x(x)) f,(X)

At this point it is clear that if desired, one might set down a model combining Models 4.3.1 and 4.3.2, namely

v(0 =t, + Z Z x l 0 - u)O(v)X2(u- v) + ~(t)


lg 12

(4.38)

with Y(t),Xl(t),X2(t) observable, with Y(t),l~,e(t) N-vectors, with O(t) P Q, with Xl(t ) N P , with Xz(t ) Q x 1. Supposing the data

258

David R. Brillinger

Y(t), Xl(t), X2(t), t = 0 ..... T - 1 are available and taking the discrete Fourier transform of (4.38) leads to Y" (k) = x ? A(k)X (k) + + o,.s.(1)

a model having the form, (1.1), considered in some detail in Roy et al. (1971). The model m a y be considered directly. Alternately it m a y be reparametrized to the form of Model 4.3.1.

4.4.

Case of a single response series

There are m a n y interesting situations in which but a single dependent series is recorded on a single experimental unit, with that unit subjected to a course of treatments. For example in Hall et al. (1971) an experiment with a second-grade class in a poverty area school is described. The dependent series recorded is the daily n u m b e r of "talk outs". The class was initially untreated for a period to establish a baseline level, then for a period systematic praise and immediate reward were given for appropriate behavior. Next period, systematic praise was given with a promise of a reward later. This was followed by a period of no special treatment, as at the beginning of the study. In the final period there was praise for handraising and an ignoring of talking out. The authors of the study were interested in whether or not the inappropriate talking of the children was altered by the treatment. A graph of the daily n u m b e r of talk outs against day shows quite an interesting structure. Some of the papers considering this type of time series situation are: Glass, Willson and G o t t m a n (1975), Kazdin (1976), Box and Tiao (1975), Hibbs (1974, 1977), Box and Tiao (1965), Edgington (1967), and Campbell (1963). In this section the problem will be approached in the manner of Section 4.3. In a later section it will be approached through a model with a finite n u m b e r of parameters. Initially for a dependent response series Y(t), t = 0, _+ 1.... suppose that the experimental treatment protocol may be reasonably described by the real-valued series X(t), t = 0, _+ 1. . . . . A model that might be contemplated is

Y(t) = I~ + ~, O ( u ) X ( t - u) + e(t),
U

(4.39)

t = 0 , - + 1.... with the O(t) unknown parameters and with e(-) an error series. If the treatment is "causing" the response, then the function 0(.) will be realizable, vanishing for t ~ 0 . Possible expressions for the series

Analysis of variance and problems under time series models

259

X ( . ) include: (a)

X(t) = 1
=0
X(t) = 1

if unit treated at time t, otherwise. if unit treated at time o a n d 0 < t - o <w, otherwise. if unit treated at time o a n d 0 < t - o <w, otherwise. in a Case X(.) drug then

(b) (c)

=0
X ( t ) = (t - o ) / w

=0

Case (a) corresponds to a succession of impulses, perhaps applied regular fashion. Case (b) corresponds to step functions of duration w. (c) corresponds to r a m p functions of duration w. Other examples of may be considered, for example X(t) might be the dose of a administered, as treatment to an individual, at time t. In the case of (a) if the treatments occur at times trj, j = 1,2 ..... (4.39) becomes
Y ( t ) = Ix + ~ O(t - 9 ) + e(t).

(4.40)

The response in (4.40) is seen to be the sum of the impulse responses initiated at each time of treatment. In m a n y cases O(t) will be non-zero only for t > 0 and for a short duration. In these cases, provided the aj. are sufficiently separated, the effects of the individual responses may be seen in the record of Y(.). Examining for a treatment effect m a y be viewed as checking whether 0(.)--=0. General tests for this will be described shortly. In the case of (a) the "average evoked response"
N

Y(u+ oj)/N
j=l

(4.41)

is often computed and graphed as a function of u. (See for example Ruchkin (1965).) The cases (b) and (c) of the series X(-) m a y be handled either by proceeding directly with X ( . ) as given for the case, or by redefining 0(-) and using the X(-) of case (a). In case (b) for example set

o'(t)=
O<t--v<w

o(v).

In the case that the error series is stationary, the model (4.39) is a particular case of Model 4.3.1 and test procedures developed for that model m a y be employed. Table 4.8 is an A N O P O W table appropriate for

260 Table 4.8 Source SS

David R. Brillinger

df

Hypothesis
Residual error Total

IX Y'(k) X^(k) 12/XlX^(~)l 2 (,)


(**)-(*)

2
2 ( K - 1) 2K

XlY'(k)l m
Mean square = 2 S S / d f

(**)

Source Hypothesis

F-statistic (1) (1)/(2)

"Expected" mean square

Klfrrx(Ml2/fxrx(h)

f~.(;k) + KIO(X)I~fx~x(X)

Residual error
Total

K[frr(x)-fffx(X)J2/f[x(X)]/(K-

1) (2)

L,(~)

checking the hypothesis 0(~) = 0 where O() = Z exp{ -iu } O(u).


U

The F-statistic

( K - 1)lfl ( )lV[

- I fl ( )l

(4.42)

will be distributed asymptotically as F2;2(K l) under the hypothesis. The F-statistic (4.42) is a simple function of the coherence statistic

= Iffx(x) r2/IfL(x)f;dx)r 2

(4.43)

and checks for association between a series X(.) and a series Y(.) are often based on the latter. The sampling distribution of a coherence statistic is discussed in some detail in Brillinger (1975). If there is an apparent effect, it is perhaps of interest to ask if the effect is instantaneous, that is whether O(t) = 0 for t va0. It is generally of interest to estimate 0(.), and this may be done in the same manner as for Model 4.3.1. It is important to point out in connection with fitting the model (4.39) and the related computations of Table 4.8 and (4.43) that the operation of prefiltering the series X(.) and Y(-) to make the relationship between the series approximately instantaneous has been found to be especially important here. Prefiltering is discussed in Brillinger (1975). A direct method is described in Cleveland and Parzen (1975). Suppose for the moment that a single treatment is a p p l i e d at time o. Then expression (4.40) becomes

Y(0

0(t- 0)+

Analysis of variance and problems, under time series models

261

If O(t)= 0 for t < 0, then this m a y be written


r(t) = + t <o,

=lz+O(t-a)+e(t)

t> o.

(4.44)

Were o unknown, a n d a model of the form (4.44) set down, one might say that he has the problem of detecting change. References concerning this problem m a y be found in Hawkins (1977). The model here takes the form of (4.39) with

x ( t ) = 1 t=0,
=0 otherwise,

and 0(-)--~0(- - o ) . The unknown time point m a y be incorporated into the unknown function 0(.) and a test m a y be carried out once again in the manner of that of Model 4.3.1. Langmuir (1950) carried out a large scale experiment for which the model (4.40) might be considered appropriate. Observations on weather were made daily throughout the U.S. Every seventh day cloud seeding was carried out using ground based burners. The oj in (4.40) were--7j. In order to see if the seeding was having an effect, Langmuir examined various meteorological statistics for a seven day periodicity. He found some, but unfortunately there are natural p h e n o m e n a of period seven. The statistic (4.41) is a useful one in examining the results of cloud seeding experiments. N e y m a n (1977) presents graphs of the statistic, as a function of u corresponding to hourly m e a n rainfall, with separate graphs for the case in which seeding was actually carried out as opposed to the situation simply having been decided to be suitable. The choice of whether or not to seed a suitable situation was made randomly. The analysis of a n y test statistic may then be based on a randomization distribution and assumptions such as stationarity avoided. Given the model (4.39) a natural question that comes up is how one might best choose (design) the series X(.) if one has that possibility. It seems that a wide variety of advantages acrue if the series X(-) is taken to be Gaussian white noise. So far, for the tests of the hypotheses that have been discussed, it has been proposed that analysis of variance tables be computed separately at each of a number of frequencies )t, and it has been implied that a separate test statistic should be computed for each of the 7~. On occasion it m a y be reasonable to combine these separate test statistics into a single overall test statistic. See, for example, W a h b a (1968, 1971).

262

David R. Brillinger Finite parameter models

4. 5.

The models that have been considered so far, in this section, m a y be described as distribution free in the sense that specific functional assumptions have not been set down. There are a variety of situations in which it seems however, appropriate to set down a model involving an unknown finite dimensional parameter. To begin, suppose, in the manner of the previous section, that a single response series Y(-) is available, corresponding to the treatment series X(-). Consider the model

Y(t) = i~+ Z a(u; O ) X ( t - u) + e(t),


12

(4.45)

where 0 is a finite-dimensional parameter, where the f o r m of a(-) is known, except for the value of 0 and where e(.) is a zero m e a n stationary error series with power spectrum f~( ; 0) whose form is known except for the value of 0. As a specific example of such a situation, consider the specification

o"x(t-u)+4t),
u=0

where e(.) is a mixed autoregressive moving average (ARMA) process satisfying

e(t) + b l e ( t - 1) + - . -

+ bpe(t - p ) = % ( t ) + c 1 % ( t - 1) +... +Cq~)L(t-q), (4.46)

with % ( . ) a series of mean 0, variance o 2, uncorrelated variates, where the parameters appearing are such that the series is well-defined and where

O=(p,b 1..... bp,c i . . . . . Cq, O).


On m a n y occasions a vector form of (4.45) is more appropriate. We mention the state space model

S(t) = F S ( t - l) + G X ( t - l) + KgL(t),
Y(t) = H S ( t ) + Le(t), with the series involved vector-valued, with the series X(.) and Y(-) observable, with e(.) and % ( - ) independent white noise processes having covariance matrices I and with

O= (F, G, H, K, L).

Analysis of variance and problems under time series models

263

Another vector case is the so-called A R M A X model


P q

v(O+ Z
u=l u=O

a.X(t-.)+ Z .vc(t-.),
u=O

(4.47) where ~6(.) is a white noise process having covariance matrix 52, where the parameters are such that the series Y(-) is well defined and where O=(a o.... ,ar, b , ..... bp ..... e, ..... eq, N). The state space model is considered in Akaike (1976), Gupta and Mehra (1974) for example. The A R M A X model is considered in Hannan (1976), and Nicholls (1976). It may be the case that the model set down initially is not of the form (4.45); however through the operations of differencing and simple filtering it may be brought to that form. Box and Tiao (1975) discuss a variety of cases of this character. It was remarked in the previous section that the series X(-) might have several (equivalent) forms, through the redefinition of the 0(.) of (4.39). The specific definition of X(-) need not be crucial. With the formulation of the present section, the definition of the series X(.) becomes crucial. We next indicate one method of fitting the model (4.45) given the data X(t), Y(t), t = 0 ..... T - 1 . Taking the Fourier transform one has

Y ^ ( k ) - A(Xk; O)X ^(k) + e^(k),


with A(X; O)= E exp{ -iXu}a(u; O)
U

and where Xk=2qrk/T. For distinct values of k, the e^(k) are approximately independent Nc(0,J~(Xk)) variates. These remarks suggest setting down the "Gaussian" log likelihood function given, up to a constant, by
1 T--1

- -~ ~ I Y^(k)--A(Xk; O)XA(k)12/f~e(hk; O)
k=l 1 T--1

2 ~ lgf~e(Xt'; 0)
k~l

(4.48)

Traditional statistical inference suggests the consideration of the value of 0

264

David R. Brillinger

that maximizes (4.48) as an estimate of 0, or equivalently that minimizes

Q"(o) = -7= E I Y"(k) - A (X/,; O)X ~(k)lZ/L~(hk; O) k=l 27r T-1 + T E 1ogf~(~'k;0)"
k=l

2~

T-

(4.49)

This type of estimation procedure was introduced and discussed by Whittle (1954, 1961). If 00 denotes the true value of 0, then under regularity conditions, Q r(0) tends in probability to
2,rg

Q(0)=

f [ f~(X;O)/f,~(X;Oo)+logf~(X;O) ] clX
0

(4.50)

an expression that is minimized when 0 = 00. A condition for identifiability would include Q(O)>Q(Oo) for 0va00 . Under regularity conditions 0, the estimate minimizing (4.49), will be consistent and asymptotically N(00,~- ll~r~t,- l) where qs = plim 0 2 Q T ( O ) o 00" 0

Oo

0Q"(0o)
0o

= U(0,zcr) + p([~rl)"

Various minimization algorithms should be considered for obtaining t~. We mention Newton-Raphson, Gauss-Newton and the method of scoring. These generally require sensible starting values. On occasion it is possible to introduce o2=exp (2~r)-1 f log2~f~(o~)da as one of the parameters and to consider the estimate minimizing only the first term of (4.49). More explicit formulas have been given for the covariance matrix of the asymptotic distribution in certain cases, see Nicholls (1976), Anderson (1978), Dzhaparidze and Yaglom (1974), Akaike (1973). The efficiency of the estimate has been considered in Hannan (1969) and Nicholls (1976). Davies (1973) develops further properties in the

Analysis of variance and problems under time series models

265

case of Gaussian Y(-), however it is not necessary that the series be Gaussian for the above considerations to be appropriate. Other candidates for minimization (or maximization) are considered in the references given above. For example Box and Tiao (1975) consider time, rather than frequency domain, based criteria. Some Monte Carlo studies have been carried out, but it is not yet clear which criterion is best in which sort of situation. For completeness the vector versions of the previous expressions are now set down.

Y(t)=p+ ~
U

a(u; 0 ) X ( t - u) +e(t),

(4.45)'

2~r r - 1 T ~] tr{ [ Y ^ ( k ) - A ( X k ; O)X^(k)]


k=l

[
2qT

,r

--1

27r } +T

T-I

logdet f~(X~; 0),


k=l

(4.46)'

f [tr{f~(X;o)-lf~(X;Oo)}+logdetf~(X;O)ldX.
0

(4.50)'

The conditions for identifiability can be quite complicated in this case, see Dunsmuir and H a n n a n (1976) and Deistler et al. (1978). The previous sections of this paper have indicated certain procedures for testing hypotheses of some interest. Consider next the testing of the hypothesis that some n coordinates of 0 take on specific values, versus the hypothesis that all coordinates are free. Denote 0 with the specified values inserted by 0'. Let ~J and 0' denote respectively, the estimates obtained by overall minimization of (4.49) and by minimization under the hypothesis. One way of proceeding here is to use the asymptotic distribution of indicated above to set down an approximate confidence region for the n coordinates of 0 of interest and to see whether or not the specified values fall within the region. An alternate procedure is to set down an " A N O V A " table summarizing the results of fitting the model in general, and under the hypothesis, and to see whether dropping the hypothesis has a substantial effect on the criterion, Q r, of fit. Table 4.9 provides such a table. The hypothesis mean square statistic will be asymptotically (as T ~ o o ) , distributed like under the null hypothesis and so a test may be based on it. The above remarks apply whether Y(.) is real or vector valued.

x~/n

266

David R. Brillinger

On occasion it may be the case that one is unwilling to make assumptions concerning the form of the error spectrum, f~(79- In this case one can consider estimating 0 by minimizing

Qr(O) = -~- ~] ] Y ~ ( k ) - A ( X ~ ;
k=l

2~r T--1

O)XA(k)IZ/L~(X~)

where f~[(~) is an estimate of f~(h), perhaps given by

=f(y(x)-Iffx( )lVfL,(x).
Such estimates are considered in Akisik (1975).
Table 4.9 Source Hypothesis Full model Total SS ( * * ) - (*) 2Qr({~ ') 2 Q r(~) (*) (**) df n T- n T MS = S S / d f [(**)- (*)]/n EMS 1

5.

Further considerations

5.1. Regression models


On occasion one is led to set down linear regression models with the errors autocorrelated. For example consider the following model for an individual response,

Y(t) = 0X(t) + e(t),

(5.1)

t--0, + 1.... with X ( . ) a given vector-valued function, with 0 an unknown parameter of interest, and with e(.) a zero m e a n stationary error series. Model 2.1 of Section 2 is an example of such a case. The estimate of 0 of (5.1) m a y be determined individually for each response a n d these values subjected to a (multivariate) analysis of variance for example, if the responses have been collected in accordance with an experimental design. One method of estimating 0 is via least squares, i.e. taking the value that minimizes
T-1

I Y ( t ) - 0X(t)l 2,
t=0

Analysis of variance and problerns under time series models

267

or equivalently
T-I

~] I YA(k)-OX^(h)l 2.
k-O

(5.2)

Denote this estimate by 0. It turns out that under weak regularity conditions 0 is consistent, asymptotically normal, and asymptotically efficient surprisingly often. In indicating the asymptotic distribution here it is convenient to allow X(t) to (possibly) depend on T and we write Xr(t). Suppose
T-I

~] xT(t +
t~O T--I

u)Xr(t)'/T---~mxx(U),
(5.3)

sup Ixr(t)l/V T-,O,


tffiO

as T--~c. Typically there will exist a matrix-valued function of bounded variation Gxx(X) such that

mxx(U ) = f
--or

exp(iua}

dGxx(a ).

Now under a further assumption of mixing concerning the series e(.), the estimate 0 will be asymptotically normal with mean 0 and covariance matrix

-~mxxtU)
-77

L~(aldGxx(a)mxx(O) '.

(5.4)

Conditions of the form of (5.3) are known as Grenander's conditions after Grenander (1954). The divisor of T in (5.3) may sometimes be replaced by some other function of T and useful similar results derived. References to this type of result include; Grenander (1954), Rosenblatt (1959), Anderson (1972) and H a n n a n (1973). As a particular, simple, case suppose that

r ( 0 = 0 + ~(t),
then 0 is the simple mean
T--I

(5.5)

O= ~, Y(t)/T
t=0

(5.6)

268

David R. Brillinger

and is asymptotically normal with mean 0 and variance

(5.7)
In other circumstances a component of X(t) may be t, another may be t 2, a further one may be 1 if a treatment is applied at time t and 0 otherwise. In order to construct an (approximate) confidence region for 0 one should expect to have to estimate the matrix (5.4). It is to be remarked that this matrix will not generally be the traditional

o2(XX~)- l~%(O)mxx(O)- ' / T

(5.8)

corresponding to independent errors. Were the error spectrum known one could consider the asymptotically efficient estimate obtained by minimizing the expression
T-1

lYe(k) --OX^( k)12/f~(~k)


k=O

(5.9)

instead of (5.2). Under regularity conditions the asymptotic variance of this estimate may be shown to be 27r
-q7

f~(a)- dGxx(a

))'

(5.10)

In practice one would generally need to replace f~,(.) in (5.9) by an estimate. In principal one can use the above asymptotic distributions to test hypotheses concerning 0. In a number of important circumstances, the matrix function Gxx (.) consists of step functions and the matrices (5.4), (5.10) may be estimated fairly directly and test statistics constructed. The above development proceeded by modeling individual experimental responses. On occasion it may be appropriate to set down a full vector model. An example of such a model is Y(t) = XlOX2(t ) +

e(t)

(5.11)

where Y(t),X2(t ) are vectors, where X 1 is a (design) matrix, where 0 is a matrix-valued parameter and where e(-) is a vector whose components are

Analysis of variance and problems under time series models

269

realizations of independent zero mean stationary time series. If one sets Y = [Y(O)... Y ( T - 1)],

x = Ix2(0).-- x2(r- 1)],


then, for example, the least squares estimate of 0 will be 6 = (X~X1) - IX~YX(XXZ) - 1 and it may be shown to be asymptotically normal under regularity conditions.
5.2. The effect o f autocorrelation

In the context of the present paper, the main use of the results of the previous section may be to serve as a warning against using expressions developed for the case of independent errors (such as (5.8)) in situations where autocorrelation is present. It is seen that the least squares estimate is consistent, generally, however expressions (5.4) and (5.8) are not typically equal. Indeed for the case of (5.5) and (5.6) expression (5.7) shows that the variance of the usual estimate may be either greater or less than that in the case of independence depending on the value of f~(0). This occurrence creates a predicament when one wishes to test an hypothesis. The procedures developed in Section 4 do remain valid here, however having been designed for models allowing all lagged values of the independent series X(-) to enter, undoubtedly they will be inefficient. Some detailed studies have been carried out concerning the effect of continuing to use the standard procedures despite the presence of statistical dependence in the errors. Box (1954) demonstrated that serial correlation within the rows of a two-way array has little effect on the significance level of the usual F-test for equality of column effects, but has a large effect on the F-test for equality of row effects. This is not surprising, for from what has been said just above, serial correlation in time (row) can affect the variance of time (row) means drastically. Triggs (1975) carried out detailed Monte Carlo investigations of the effects of serial correlation on traditional procedures for autocorrelation models other than those investigated in Box (1954). Cochran (1947) and Scheff6 (1959) comment on the effects of departures from underlying assumptions in the case o f statistical dependence and ANOVA models. The effects of serial dependence on certain chi-squared statistics are discussed in Brillinger (1974), Chanda (1975) and Gasser (1975). The workers in sample survey have been concerned with the effects of serial correlation when a systematic sample is drawn, see for example Cochran (1946), Quenouille (1949) and Williams (1956).

270
5.3. Randomization

David R. Brillinger

In certain circumstances the problems of autocorrelation and nonstationarity can be avoided by introducing randomization. A case in point is that of cloud seeding experiments (see for example N e y m a n (1977)) wherein days are judged as to suitability for seeding, then randomization is employed to decide whether or not to seed a given suitable day and the response recorded is the amount of rainfall. In this sort of situation valid significance tests (and confidence intervals) can be constructed by refer ring the test statistic to the randomization distribution obtained by evaluating the statistic for all relevant combinations of the observations. Specific assumptions need no longer be set down here and in a certain sense it is as if the error variates are statistically independent. Cox and Kempthorne (1963) discuss a randomization test for comparing survival curves. On occasion positive serial correlation of adjacent days (units) has been combatted by alternating treatment and control, as in a change-over or Student-sandwich plan; however it is easy to think of correlation structures that these designs do not deal with appropriately.
5. 4. Extensions

A number of papers have appeared on the problem of the statistical design of experiments for the case of a linear model with serially correlated errors. We mention Williams (1952) as an important early paper. Kiefer (1961) discusses the efficiency of certain systematic designs. Jenkins and Chanmugan (1962) are concerned with the comparison of designs to estimate a simple slope parameter when the errors correspond to the values of a discrete time stationary series. Duncan and Jones (1966) remark that for a model like (5.1) one wants to choose X(-) to have power at frequencies where the error power is lowest. This is apparent from the asymptotic covariance matrix (5.4). Berenblut and Webb (1974) consider optimal design criteria for randomized block and Latin square arrangements with simple autocorrelation models. Sacks and Ylvisaker (1966, 1968, 1970a, 1970b) in a series of papers investigate the problem of optimal selection of time points at which to make measurements, for a linear model with autocorrelated errors. Bickel and Herzberg (1977) consider the problem under alternate limit conditions. Bellhouse (1977) develops some conditions for a sampling design, of a population distributed in the plane, to be optimal assuming a stationary spatial correlation model. D u b y et al. (1977) compare some particular field designs when the errors are spatially correlated. A basic notion in all of these papers is to compare designs via some measure of the size of the covariance matrix of the estimates. The asymptotic forms for these matrices, indicated in the paper, are useful here. As

Analysis of variance and problems under time series models

271

Bickel and Herzberg (1977) note, the limiting procedure adopted is essential to the investigation. The discussions of the paper have so far been concerned with the case in which the individual experimental responses are ordinary time series. There are however alternate interesting situations in which the basic responses are point processes, that is sequences of times corresponding to the occurrence of an event of interest, for example the times of death of patients subjected to some form of treatment and having given individual characteristics. As was the case with ordinary time series, the experiment may lead to the measurement of several responses (under different conditions) or to the measurement of a single response. No totally accepted analog of the linear model employed in the analysis of variance and discussed in this paper has come to hand. An approach that is suggestive in many situations is to proceed by modelling the conditional intensity of occurrence

Prob{ point in the time interval (t,t+ h)]~t)/h~y(t,~t),

(5.12)

h small, where ~t is an event referring to the conditions obtaining when a point occurs. Cox (1972a, b) has indicated parametric models for (-) of the form

exp( OX(t) } Yo(t),


and he has described some inference procedures including some for the estimation of 0 and of testing of hypotheses. An alternate form of model, of the character of (4.39), takes the form for (5.12)

t~+ f O(u)X(t- u) du.


This particular form of model for the response has the property that the asymptotic procedures indicated in Section 4 remain relevant provided that one simply alternates the definition of the Fourier transform of the response to
N--1

Y ^ ( k ) = ( 2 q r T ) -1/2 E
j=0

exp{-iXr/}

where the '9 denote the times of events in the observation interval (0, T]. (See Brillinger (1972, 1974).)

272

D a v i d R . Brillinger

Acknowledgements
I would like to thank Professors G. V. Glass, R. H. Shumway and Dr. Agnes M. Herzberg for pointing out certain of the references mentioned in this paper.

Appendix
An r vector-valued variate Y is said to have the complex normal distribution with mean/~ and covariance matrix Z, (denoted NC(/~, )), if the variate

Re
ImY 1 is distributed as

N2r I m / t j ' [ I m Z e

ReE

An r r matrix-valued variate W is said to have the complex Wishart distribution WC(n, ~) if it has the representation
tl

w= ZY Y;
j~l

where the Yj are independent NC(0, l~) variates. These sorts of variates are discussed in Brillinger (1975). A useful result for dealing with sums of squares is provided by the following THEOREM (complex extension of the Fisher-Cochran theorem). distributed as N C ~ , ozI) and let Y*Y=Y~AlY+ - + Y'AKY ,
where A k is Hermitian of rank n k. A necessary and sufficient condition for the quadratic forms Y'AkY to be distributed independently with Vg'AkY distrib2 ~ uted as o 2 Xznk(~ I~/ 2 ) / 2 is that
nl-~,.. --~ n K .~- n .

Let Y be

(See Brillinger (1973).)

Analysis of variance and problems under time series models

273

References
Akaike, H. (1973). Maximum likelihood identification of Gaussian autoregressive moving average models. Biornetrika 60, 255-265. Akaike, H. (1976). Canonical correlation analysis of time series and the use of an information criterion. System Identification: Advances and Case Studies, Academic Press, New York, 27-96. Akisik, V. A. (1975). On the estimation of parametric transfer functions. Ph.D. Thesis, Univer-~ sity of California, Berkeley. Anderson, T. W. (1972). Efficient estimation of regression coefficients in time series. In: L. LeCam, J. Neyman and E. L. Scott eds., Proc. Sixth Berkeley Syrup. Math. Statist. Prob., University of California Press, Berkeley. Anderson, T. W. (1978). Maximum likelihood estimation for vector autoregressive moving average models. In: Proc. I M S Ames Meeting on Times Series, to appear. Atkinson, A. C. (1969). The use of residuals as a concomitant variable. Biometrika 56, 33-41. Bartlett, M. S. (1938). The approximate recovery of information from replicated field experiments with large blocks, or. Agric. Sci. 28, 418-427. Bellhouse, D. R. (1977). Some optimal designs for sampling in two dimensions. Biometrika 64, 605-611. Bement, T. R. and Milliken, G. A. (1977). Recovery of interblock information for designs with groups of correlated blocks. J. Amer. Statist. Assoc. 72, 157-159. Berenblut, I. I. (1970). Treatment sequences balanced for the linear component of residual effects. Biometrics 26, 154-156. Berenblut, I. I. and Webb, G. J. (1974). Experimental design in the presence of autocorrelated errors. Biometrika 61, 427-437. Bickel, P. J. and Herzberg, A. M. (1977). Robustness of design against autocorrelation in time I: Asymptotic theory, optimality for location and linear regression. To be published. Borpujari, A. S. (1977). An empirical Bayes approach for estimating the mean of N stationary time series. J. Amer. Statist. Assoc. 72, 397-402. Box, G. E. P. (1950). Problems in the analysis of growth and wear curves. Biometrics 6, 362-389. Box, G. E. P. (1954). Some theorems on quadratic forms applied in the study of analysis of variance problems II: Effects of inequality of variance and of correlation between errors in the two-way classification. Ann. Math. Statist. 25, 484-498. Box, G. E. P. and Hay, W. A. (1953). A statistical design for the efficient removal of trend occurring in a comparative experiment with an application in biological assay. Biometrics 9, 304-319. Box, G. E. P. and Tiao, G. C. (1965). A change in level of a non-stationary time series. Biometrika 52, 181-192. Box, G. E. P. and Tiao, G. C. (1975). Intervention analysis with applications to economic and environmental problems. J. Amer. Statist. Assoc. 70, 70-79. Brillinger, D. R. (1969). A search for a relationship between monthly sunspot numbers and certain climatic series. Bull. Internat. Statist. Inst. 43, 293-306. Brillinger, D. R. (1972). The spectral analysis of stationary interval functions. In: L. M. LeCam, J. Neyman, E. L. Scott eds., Proc. Sixth Berk. Symp. Math. Statist. Prob. University of California Press, Berkeley, 483-513. Brillinger, D. R. (1973). The analysis of time series collected in an experimental design. In: P. R. Krishnaiah ed., Multivariate Analysis 111, Academic Press, New York, 241-256. Brillinger, D. R. (1974). Cross-spectral analysis of processes with stationary increments including the stationary G~ G/oo queue. Ann. Probability 2, 815-827.

274

David R. Brillinger

Brillinger, D. R. (1974). The asymptotic distribution of the Whittaker periodogram and a related chi-squared statistic for stationary processes. Biometrika 61, 419-422. Brillinger, D. R. (1974). Fourier analysis of stationary processes. Proc. 1EEE 62, 1628-1643. Brillinger, D. R. (1975). Time Series: Data Analysis and Theory. Holt, Rinehart and Winston, New York. Campbell, D. T. (1963). From description to experimentation: interpreting trends as quasiexperiments. In: W. P. Harris ed., Problems in Measuring Change, University of Wisconsin Press, Madison. Chanda, K. C. (1975). Chi-square goodness-of-fit tests for strong mixing stationary processes. Report ARL TR 75-0016 Aerospace Res. Labs. Wright-Patterson A.F.B. Church, A. (1966). Analysis of data when the response is a curve. Technometrics 8, 229-246. Cleveland, W. S. and Pavzen, E. (1975). The estimation of coherence, frequency response, and envelope delay. Technometrics 17, 167-172. Cochran, W. G. (1946). Relative accuracy of systematic and stratified random samples for a certain class of populations. Ann. Math. Statist. 17, 164-177. Cochran, W. G. (1947). Some consequences when the assumptions for the analysis of variance are not satisfied. Biometrics 3, 22-38. Cole, J. W. L. and Grizzle, J. E. (1966). Applications of multivariate analysis of variance to repeated measurements experiments. Biometrics 22, 810-828. Cox, C. P. (1962). The relation between covariance and individual curvature analyses of experiments with background trends. Biometrics 18, 12-21. Cox, D. F. and Kempthorne, O. (1963). Randomization tests for comparing smvival curves. Biometrics 19, 307-317. Cox, D. R. (1951). Some systematic experimental designs. Biometrika 38, 312-323. Cox, D. R. (1952). Some recent work on systematic experimental designs. J. Roy. Statist. Soc. Ser. B 14, 211-219. Cox, D. R. (1972a). Regression models and life-tables. J. Roy. Statist. Soc. Ser. B 34, 187-220. Cox, D. R. (1972b). The statistical analysis of dependencies in point processes. In: P. A. W. Lewis ed., Stochastic Point Processes, Wiley, New York, 55-56. Daniel, C. and Wood, F. S. (1971). Fitting Equations to Data. Wiley, New York. Daniels, H. E. (1938). The effects of departures from ideal conditions on the t and z tests of significance. Proc. Cambridge Phil, Soc. 34, 321-328. Davies, R. B. (1973). Asymptotic inference in stationary Gaussian time series. Adv. AppL Prob. 5, 469-497. Deisfler, M., Dunsmuir, W. and Hannan, E. J. (1978). Vector linear time series models corrections and extensions. Adv. Appl. Prob., to appear. Duby, C., Guyon, X. and Prum, B. (1977). The precision of different experimental designs for a random field. Biometrika 64, 59-66. Duncan, D. B. and Jones, R. H. (1966). Multiple regression with stationary errors. J. Amer. Statist. Assoc. 61, 917-928. Dunsmuir, W. and Hannan, E. J. (1976). Vector linear time series models. Adv. AppL Prob. 8, 339-364. Dzhaparidze, K. O. and Yaglom, A. M. (1973). Asymptotically efficient estimation of the spectrum parameters of stationary stochastic processes. Proc. Prague Symp. Asymp. Statist. Vol. 1, Charles University Press, Prague, 55-105. Edgington, E. S. (1967). Statistical inference from N z 1 experiments. J. Psychology 65, 195-199. Elston, R. C. and Grizzle, J. E. (1962). Estimation of time-response craves and their confidence bands. Biometrics 18, 148-159. Fearn, T. (1975). A Bayesian approach to growth curves. Biometrika 25, 357-381.

Analysis of variance and problems under time series models

275

Fearn, T. (1977). A two-stage model for growth curves which leads to Rao's covariance adjusted estimators. Biometrika 64, 141-143. Frederiksen, C. H. (1974). Models for the analysis of alternate sources of growth in correlated stochastic variables. Psychometrika 39, 223-245. Gallant, A. R., Gerig, T. M. and Evans, J. W. (1974). Time series realizations obtained according to an experimental design. J. Amer. Statist. Assoc. 69, 639-645. Gasser, T. (1975). Goodness-of-fit tests for correlated data. Biometrika 62, 563-576. Geisser, S. (1970). Bayesian analysis of growth curves. Sankhy8 Ser. A 32, 53-64. Ghosh, M., Grizzle, J. E. and Sen, P. K. (1973). Nonparametric methods in longitudinal growth studies. J. Amer. Statist. Assoc. 68, 29-36. Gill, J. L. and Hafs, H. D. (1971). Analysis of repeated measurements of animals. J. Animal Sci. 33, 331-336. Glaeser, L. J. and Olkin, I. (1972). Estinmtion for a regression model with an unknown covariance matrix..In: L. LeCam, J. Neyman and E. L. Scott, eds., Proc. Sixth Berkeley Symp. Math. Statist. Prob. University of California Press, Berkeley, 541-568. Glass, G. V., Willson, V. L. and Gottman, J. M. (1975). Design and Analysis of Time Series Experiments. Colorado Associated University Press, Boulder. Green, B. F. and Tukey, J. W. (1960). Complex analyses of variance: general problems. Psychometrika 25, 127-152. Grenander, U. (1954). On the estimation of regression coefficients in the case of an autocorrelated disturbance. Ann. Math. Statist. 25, 252-272. Grizzle, J. E. and Allen, D. M. (1969). Analysis of growth and dose response curves. Biometrics 25, 357-381. Gupta, N. K. and Mehra, R. K. (1974). Computational aspects of maximum likelihood estimation and reduction in sensitivity function calculations. IEEE Trans. Automatic Control AC 19. Hall, R. V. et al. (1971). The teacher as observer and experimenter in the modification of disputing and talk-out behaviors. J. Appl. Behav. Anal. 7, 635-638. Hall, W. B. (1973). Repeated measurements experiments. In: V. J. Bofinger and J. L. Wheeler, eds., Developments in FieM Experiment Design and Ann&sis, Alden Press, Oxford, 33-41. Hall, W. B. and Williams, E. R. (1973). Cyclic superimposed designs. Biometrika 60, 47-53. Hannan, E. J. (1970). Multiple Time Series. Wiley, New York. Hannah, E. J. (1973). Central limit theorems for time series regression. Z. Wahrschein. 26, 157-170. Hannah, E. J. (1976). The identification and parametrization of ARMAX and state space forms. Econometrica 44, 713-723. Harville, D. A. (1977). Maximum likelihood approaches to variance component estimation and to related problems. J. Amer. Statist. Assoc. 72, 320-337. Hawkins, D. M. (1977). Testing a sequence of observations for a shift in location. J. Amer. Statist. Assoc. 72, 180-186. Hibbs, D. A. (1974). Problems of statistical estimation and causal inference in time series regression models. In: H. L. Cootner, ed., Sociological Methodology 1974, Jossey-Bass, San Francisco. Hibbs, D. A. (1977). On analyzing the effects of policy interventions: Box-Jenkins and Box-Tiao vs. structural equation models. In: D. R. Heis, ed., Sociological Methodology 1977, Jossey-Bass, San Francisco, Ch. 4. Hills, M. (1968). A note on the analysis of growth curves. Biometrics 24, 192-196. Huynh, H. and Feldt, L. S. (1970). Conditions under which mean square ratios in repeated measures designs have exact F-distributions. J. Amer. Statist. Assoc. 65, 1582-1589.

276

David R. Brillinger

Jenkins, G. M. and Chanmugan, J. (1962). The estimation of slope when the errors are autocorrelated. J. Roy. Statist. Soc. Ser. B 24, 199-214. Jones, R. H., Crowelt, D. H. and Kapuniai, L. E. (1969). A method for detecting change in a time series applied to newborn EEG. Eleetroenceph. clin. Neurophysiol. 27, 436-440. Joreskog, K. G. (1973). Analysis of covariance structures. In: P. R. Krishnaiah, ed., Multivariate Analysis 11I, Academic Press, New York, 263-285. Kazdin, A. E. (1976). Statistical analyses for single-case experimental designs. In: M. Hersen and D. Barlow, eds., Single Case Experimental Designs: Strategies for Studying Behavior Change, Pergamon, New York, Ch. 8. Khatri, C. G. (1966). A note on a MANOVA model applied to problems in growth curve. Ann. Inst. Statist. Math. 18, 75-86. Kiefer, J. (1961). Optimum experimental designs V, with applications to systematic and rotatable designs. In: Proe. Fourth Berkeley Syrup. Math. Stat. Prob. Vol. I. California, Berkeley, 381-405. Kitagawa, T. (1951)~ Analysis of variance applied to function spaces. Mere. Faculty Sei. Kyusyu Univ. Ser. A 6, 41-53. Koch, G. G. et al. (1977). A general methodology for the analysis of experiments with repeated measurement of categorical data. Biometrics 33, 133-158. Krishnaiah, P. R. (1969). Simultaneous test procedures under general MANOVA models. In: P. R. Krishnaiah, ed., Multivariate Analysis H, Academic Press, New York, 121-143. Krishnaiah, P. R. (1975). Simultaneous tests for multiple comparisons of growth curves when errors are autocorrelated. Bull. lnternat. Statist. Inst. 46 (4) 62-66. Larsen, W. A. (1969). The analysis of variance for the two-way classification fixed-effects model with observations within a row serially correlated. Biometrika 56, 509-515. Langmuir, I. (1950). A seven day periodicity in weather in United States during April, 1950. Bull. Amer. Meteorol. Soc. 31, 386-387. Ljung, G. M. and Box, G. E. P. (1976). Analysis of variance with autocorrelated observations. Tech. Report No. 478, Dept. of Statistics, University of Wisconsin. Marten, K. and Marler, P. (1977). Sound transmission and its significance for animal vocalization I, II. Behav. Ecol. Sociobiol. 2, 271-290 and 291-302. Mehra, R. K. (1971). Identification of stochastic linear dynamic systems. AIAA Journal 9, 28-31. Morrison, D. F. (1970). The optimal spacing of repeated measurements. Biometrics 26, 281-290. Neyman, J. (1929). The Theoretical Basis of Different Methods of Testing Cereals H. Buszczynski and Sons, Warsaw. Nicholls, D. F. (1976). The efficient estimation of vector linear time series models. Biometrika 63, 381-390. Neyman, J. (1977). A statistician's view of weather modification technology (a review). Proc. Nat. Acad. Sci. 74, 4714-4721. Papadakis, J. S. (1937). Methode statistique pour des experiences sur champ. Bull. lnst. Amel. Plantes a Salonique 23. Patterson, H. D. (1952). The construction of balanced designs for experiments involving sequences of treatments. Biometrika 39, 32-48. Patterson, H. D. (1964). Theory of cyclic rotation experiments. J. Roy. Statist. Soc. Ser. B 26,
1-45.

Paterson, H. D. and Henderson, R. (1973). Recovery of information in the analysis of serial factorial experiments. Proc. 39th Session Internat. Statist. Inst. 14 (1), 508-514. Pothoff, R. F. and Roy, S. N. (1964). A generalized multivariate analysis of variance model useful especially for growth curve problems. Biometrika 51, 313-326. Quenouille, M. H. (1949). Problems in plane sampling. Ann. Math. Statist. 20, 355-375. Quenouille, M. H. (1949). On a method of trend elimination. Biometrika 36, 75-91.

Analysis of variance and problems under time series models

277

Rao, C. R. (1958). Some statistical models for comparison of growth curves. Biometrics 14, 1-17. Rao, C. R. (1965). Linear Statistical Inference and Its Applications. Wiley, New York. Rao, C. R. (1965). The theory of least squares when the parameters are stochastic and its application to the analysis of growth curves. Biometrika 52, 447-458. Rosenblatt, M. (1959). Statistical analysis of stochastic processes with stationary residuals. I n : U. Grenander, ed., Probability and Statistics, Wiley, New York, 246-275. Roy, S. N., Gnanadesikan, R. and Srivastava, J. N. (1971). Analysis and Design of Certain Quantitative Multiresponse Experiments. Pergamon, Oxford. Sacks, J. and Ylvisaker, D. (1966). Designs for regression problems with correlated errors. Ann. Math. Statist. 37, 66-89. Sacks, J. and Ylvisaker, D. (1968). Designs for regression problems with correlated errors; many parameters. Ann. Math. Statist. 39, 49-69. Sacks, J. and Ylvisaker, D. (1970a). Statistical designs and integral approximation. In: Time Series and Stochastic Processes: Convexity and Combinatorics, Canadian Math. Congress, Montreal, 115-136. Sacks, J. and Ylvisaker, D. (1970b). Designs for regression problems with correlated errors III. Ann. Math. Statist. 41, 2057-2074. Scheff~, H. (1959). The Analysis of Variance. Wiley, New York. Searle, S. R. (1971). Linear Models. Wiley, New York. Seber, G. A. F. (1977). Linear Regression Analysis. Wiley, New York. Shumway, R. H. (1970). Applied regression and analysis of variance for stationary time series. J. Amer. Statist. Assoc. 65, 1527-1546. Shumway, R. H. (1971). On detecting a signal in N stationarily correlated noise series. Technometrics 13, 499-519. Shumway, R. H. and Dean, W. C. (1968). Best linear unbiased estimation for multivariate stationary processes. Technometrics 10, 523-534. Skorokhod, A. V. (1956). Limit theorems for stochastic processes. Theory Prob. Appl. 1, 261-290. Snee, R. D. (1972). On the analysis of response curve data. Technometrics 14, 47-62. Srivastava, J. N. and McDonald, L. L. (1971). Analysis of growth curves under the hierarchical models and spline regression I. Aerospace Research Laboratories Report ARL 71-0023, Wright-Patterson AFB, Ohio. Tarone, R. E. (1975). Tests for trend in life table analysis. Biometrika 62, 679-682. Tiao, G. C. and Tan, W. Y. (1966). Bayesian analysis of random-effects models in the analysis of variance II. Effect of autocorrelated errors. Biometrika 53, 477-495. Triggs, C. M. (1975). Effects of serial correlation on linear models. Ph.D. Thesis, University of Auckland, New Zealand. Tukey, J. W. (1961). Discussion emphasizing the connection between analysis of variance and spectrum analysis. Technometrics 3, 1-29. Wahba, G. (1968). On the distribution of some statistics useful in the analysis of jointly stationary time series. Ann. Math. Statbt. 38, 1849-1862. Wahba, G. (1971). Some tests of independence for stationary multivariate time series. J. Roy. Statist. Soc. B 33, 153-166. Wallenstein, S. and Fisher, A. C. (1977). The analysis of the two-period repeated measurements crossover design with application to clinical trials. Biometrics 33, 261-269. Whittle, P. (1954). Some recent contributions to the theory of stationary processes. Appendix 2 In: H. Wold, A Study in the Analysis of Stationary Time Series, Almqvist and Wiksells, Uppsala, app. 2. Whittle, P. (1961). Gaussian estimation in stationary time series. Bull. Internat. Statist. Inst. 35, 109-129. Williams, R. M. (1952). 
Experimental designs for serially correlated observations. Biometrika

P. R. Krishnaiah, ed., H a n d b o o k o f Statistics, Vol. t North-Holland Publishing Company (1980) 279-320

Tests of U n i v a r i a t e a n d M u l t i v a r i a t e N o r m a l i t y

K. V. Mardia

1.

Introduction

G e a r y [37] writes: ' " N o r m a l i t y is a myth; there never was, and never will be a normal distribution." This is an overstatement f r o m the practical point of view, but it represents a safer united mental attitude than any in fashion during the past two decades.' The same remarks apply today just as much as they did in 1947. The normal distribution has always been the most widely used distribution, since if it may be assumed for a population, it gives rise to a rich set of mathematical consequences. However just as important as normal theory is, so is it important to know when one is departing from this idyllic state of universal normality. We shall adopt the following notation. Let X l , X 2 . . . . . x n be an independent and identically distributed r a n d o m sample of size n, from a population with probability density function f ( x ) , and distribution function (d.f.) F ( x ) . Let @ be the d.f. of x distributed normally. Our null hypothesis is Ho: F ( x ) = d~(x).

Let Xtl),X(2),...,Xtn ) be defined to be the n order statistics of the sample so


that x(1) <<.xtz) < <xtn ). Let /~ and 02 be the mean and variance respectively of a normal population with sample counterparts 2 and S z and define

m r ----

Pl

IN, X ir,

mr=

--

(Xi--X)

/"/ i = 1

to be the rth sample moments about the origin and mean respectively. Let
279

280

K. V~ Mardia

s 2 = n S Z / ( n - 1) where S 2= m 2. Further define


= E(x'), = E(x -

2.
2.L

Tests based on descriptive measures


S k e w n e s s a n d kurtosis

A basic means of distinguishing between the shape of two distributions has historically been taken to be the consideration of the skewness and kurtosis coefficients. These are defined as
/~i~ ]3/~2 2 3

and

]32 ~"~~4/~2,

where/~i is the ith moment about the mean. For the normal distribution fll = 0 and ]32= 3, and therefore if we define 71=V']31 and

T2=]32--3,

under the null hypothesis of normality 71=72=0. The sample skewness and kurtosis coefficients are defined by
~/bl=m3/S 3

and

b2=m4/S

4.

Note that V' b~ can take negative values in this notation.


2.2. A test based on cumulants

Fisher [33] proposed the use of cumulants to test for departures from normality. Define a series of statistics { k} such that the mean value of kp is the pth cumulative moment function of the population. The simplest measure for departure from normality is then
3' = k 3 / k 3/2.

The relationship between ~/b I

and

k 3 / k 3/2 is given by - 2).

7 = ( n ( n - 1) } 1/2V' b J ( n

By forming recurrence relations, the moments of 7 can be obtained. Let yj

Tests of univariate and multivariate normality

281

be 7 calculated from j observations. Hence expressing ~'n in terms of 7n- 1, Fisher derives the exact values of the first three even moments of the ratio ~,. Fisher also proposed the following statistics based on higher cumulants,

(~= k4/k 2 and

c=ks/k~/2.

The relationship between 6 and b 2 is given by {b2 = (n+l)(n-l) (n - 2)(n - 3) Also


n l/2(n - 1) 3/2 m5 (n __-2--~---3)-~--4){ (n + 5) ~ m -10(n-l)~/b

3(n-l) n+-l-

)"

I }.

Thus . / a n d 6 lead to the same tests as in Section 2.1.

2.3. Null distributions of

~/ b I

and be

Pearson [63], using the results of Fisher [33] and Wishart [86], obtains the first four moments of the sampling distributions of Vb~ and b 2 up to terms in n -3, under the null hypothesis of normality. We have for ~/b~ E ( V b,) =/3,( V 61) ~- 0 ,

( V b l ) = V ( 6 / n ) ( 1-3n + 6n 2
and

n 315 + . . . )

B 2 ( ~ / b 0 = 3 + 36 _ 86___44 + 12,09__.____66 ..... n n2 n3 For b2, we have

E(b2)= 3 ( n - 1)

(n+l)

o(b2) = X/(24/n) 1 - ~ fi,(b2) = 2_~1n6 (1 - - -29 -+ n


B2( b2) = 3 + - -

271 + -8n 2 519 n2 7637 --+ n3

2319 - 16n 3

+...

... )

540

20,196 n2

470,412 n3

282

K. V, Mardia

Thus ~/b 1 is a symmetrical leptokurtic curve which fairly rapidly tends to normal as n increases, however b 2 is ve12 skewed for n = 100 and is hardly normal for n = 1000. Approximations to the distributions of x/b1 and b 2 are therefore found by representing the distributions by Pearson curves. Hence 5% and 10% significance points are given for various values of n. Recent distribution work is discussed in Section 2.7. Tables for the percentage points of 1/bl and b2 are reproduced in P e a r s o n and ttartley [65, Vol. 1, Tables 34 B and C]. Mulholland [62] investigates the distribution of -x/b~ under /4o, for samples of size less than 25. Define fn to be the probability density function (p.d.f.) of x/b 1 under H 0. Following Mulholland [61, 62], the leading singular terms of f, can be obtained at the singularities and using ultraspherical polynomials and Gram-Charlier expansions, approximate critical values of X/b~ are obtained for values of n from 4 to 25.

2.4.

Tests based on mean deviation

Geary [35] proposed the ratio of mean deviation to standard deviation, estimated from the sample. If the mean is known and assumed equal to zero, we have simply

E Ixil

w"= ~/n ( X x i ) 2.~1/2


and under the null hypothesis, Geary derives all the moments of w,. Under normality, the ratio tends asymptotically to (2/~@/2. An alternative statistic was derived for the general case by Helmert's transformation on the x, the transformation being given by yi=(Xl'k-X2-} - ' ' " q"X i - i x , + l ) / ( i ( i + y =(xl+...

1)} 1/2, i=

1,2 ..... n--1,

+ x n ) / n 1/2.

Thus if the x are from a normal population, then the y are also normal.

E(Yi)=O,

i= l ..... n - l ,

and under the null hypothesis of normality the Yi are mutually independent for i = 1 , . . . , n - 1 . The transformation is orthogonal and the test statistic becomes
n--I

wn_,= ~] ]xl+ . . . . + x i - i x i + l [ / [ { n ( n - 1 ) i ( i + l ) } I / 2 S ] .
i=1

Tests o f univariate a n d multivariate normalitY

283

The transformation however, depends on the order that the original sample took, as the transformation is not symmetric in the y. The only way to resolve the problem is to take permutations of all the original observations and calculate wn for each permutation, taking the mean or median value of the w's as the test statistic. Geary [35] also suggested

a= E

Ixi--

xl/Snl/2

as the obvious extension to the general case for unknown /x and o z, the moments of which, are derived under H 0 in [36]. We have

E(a) = ~/(2/~r) +
0.045070 var(a) = - - //

0.199471
n //2

0.024934 + - n2

0.031168
n3

0.008182
n4

+--.

0.124648

0.084859
//3

-~

0.006323
//4

+""

1.7618(2.36818,8646)

~/fll(a)=

~/~
n

n
7"628 tn

n2
}

+""

/32(a)=3 +5.441 {1

Under H 0, a is asymptotically N {(2 / ~r)1/2, 0.045070/n }. Note that Dumonceaux, Antle and Haas [28] show that a, when used as a one-sided test, is equivalent to the likelihood ratio test for normality against a double exponential distribution. Further, it is equivalent asymptotically, to the most powerful test for this case. For another justification see Section 4.3. It should be noted that under H 0 corr(b 2, a) = - { 12(~r - 3)} -1/2. Tables of a are reproduced in Pearson and Hartley [65, Vol. 1, Table 34A].

2.5.

A class of statistics

Geary [37] considered the following classes of test statistics,


1
a(c) : - ~

/=1 [xi--~l ~,

c>~l

g(d)= ~sd[

~IXi--Yl dXi ~ X

5~ Ix,-~l ~],d>~l.
Xi <X J

284

K. K

Mardia

Special cases are b2=a(4 ) and # b I =g(3), which are of course the standard sample estimates of/32 and -v//31. Further a ( 1 ) = a . In showing the asymptotic normality of a(e), Geary approximated it by a~(c) which is defined by
n

a,(c) = ~

E Ixi-~ Ixlc
i=1

and derived the asymptotic distribution of al(c ). However, the quantities al(c ) and a(c) do not always have the same limiting distribution except when the underlying distribution is symmetric [34]. Testing for normality against the family of alternative distributions given by

f ( x ) = X/(2~r-~

1 [1 + ~

d;]ex2j2

Geary concludes, for large samples b2= a(4) is the most powerful test statistic in the family defined by a(c) and similarly for ~/b 1=g(3) in the family defined by g(d).

2.6.

A test based on the range of the sample

David, Hartley and Pearson [24] presented a test for homogeneity and normality of data based on the ratio of range w to "standard deviation" s from a sample of n points. If the sample x~,x z..... x n is ordered as X ( l ) " " " < X ( n ) then the statistic is defined as u = w / s , where w = X(n ) - X ( l ) Now, ( x i - ~ ) / s , i = 1 ..... n is independent of s and hence E ( u r ) =

E(w')/E(s').
Taking samples of varying sizes the first four values of E ( w r) and E(s ~) were calculated and thus the first four moments of u. Pearson curves with the appropriate first four moments were used and 100a% points of the distribution of u were approximated. Tables of u are reproduced in [65, Vol. 1, Table 29].

2. 7.

Omnibus tests based on moments

Recently there has been much work done in producing an omnibus test of normality, combining ~/b 1 and b 2 in one test statistic. D'Agostino and Pearson [21] suggested the statistic

X2( ~/ bl)-l-- X 2(b2)


for testing for departures from normality, where X 2 ( # b 0 and X2(b2) are

Tests of univariate and multivariate normality

285

standardised normal equivalent deviates, the sum being distributed as X22. However this test requires the assumption of independence between 1/bl and b2 and hence, this not being the case, the statistic is not recommended, a fact which the authors themselves were to point out. Bowman and Shenton [11] suggest in passing, the statistic

(1/b,)2/o~ + ( b2-

3)2/022,

where o2=6/n, o2=24/n, are the asymptotic variances of 1/b~ and b 2 respectively. The statistic is distributed as X2 under H 0. Also mentioned is the suggestion by Cox and Hinldey [17, pp. 121-i25], max([ 1/bii/Ol, 1b2 - 3[/02). Bowman and Shenton [11] propose the use of contours in the (1/bl,b2) plane, by approximating 1/b I and b 2 by the Johnson family of curves [49, p. 22] which are denoted by Su and S s. Note Pearson [63] in which 1/bl and bz are approximated by Pearson curves. Hence we have,

(i) (ii)

Xs(1/bl)= 81 s i n h - 1(!/bl/)h), n/> 25, Xs(b2) 62sinh-I{(b2-~)/)~2}, n < 2 5 , Xs(b2)=yz+Szlog ( ~2+~ _b2-~b2 ),
= 72+

where 61, 82, y/, ~1, ~k2 and ~ are constants and the test statistic becomes

rs= Xg(1/ bl) + X~( b2).


Since Xs(1/bl) and Xs(b2) a r e uncorrelated and nearly independent, Ys is approximately a X2 variable. In [76] Shenton and Bowman attack the problem of finding the joint density of 1/bl and b2 under the hypothesis of normality. X/bl and b 2 are uncorrelated but not independent, and in fact Fisher [33] shows the correlation between bl and b 2 to be

54n(n 2-9)
(n - 2)(n + 5)(n + 7)2( fl2(V 61) -- 1)

) 1/2

Shenton and Bowman [76] propose the joint density of 1/bl and b 2 t o possess the properties that 1 / b l has a marginal distribution based on Johnson's S u and b 2 has a gamma density conditional on b 2 > 1 + b v

286

K. K Mardia

A Monte Carlo study was carried out, with tile conclusions that tile model gave results in good agreement with theoretical values, and results which were of a better shape than those from Ys 2, presumably because of the correlation between b~ and b z which the I,-2 model ignores.

2.8.

R-test

Pearson et al. [64] propose the following test. Let +x/b~(~*) be the upper and lower 100a*% points of x/b I and Lb2(a*) and ubz(t~*) be the lower and upper 100a*% points of b 2. Hence we can define a rectangle in the ~/bl, b a plane with vertices { - ~ / b l ( a * ) , u b z ( ~ * ) } , { X~bl(~*), t:bz(a*)}, { - k~ bl(a*), Lb2(a*)}, and (-x/bl(a*), Lb2(a*)} Hence if x/b~ and b 2 were independent, the probability of the point (X/hi,b2) falling outside the rectangle would be

a=4(~* -- (0/*)2}.
The actual values of ~* are obtained from the Johnson curve approximation. However since V' bl and b 2 are not independent the test will be conservative as it stands. To find the correction needed to compensate for this false assumption of independence, large scale simulations were undertaken. These showed that the discrepancy between the independent values and the true simulated values, fell off with increasing n. Tables of corrected values are given for n =20, 50 and 100 for ~ = 0 . 0 5 and 0.10.

3. 3.1.

Shapiro-Wilk's W-test and its modifications Shapiro-Wilk's W-test

With the appearance of the statistic W proposed by Shapiro and Wilk [74], a renewal of interest was aroused in the whole area of testing for normality. The method depends on order statistics as follows. Define c to be the vector of expected values of the n order statistics from a N(0, 1) distribution, and let V be the corresponding expected covariance matrix. For a normal distribution with m e a n # and variance 02, it is known that E(x), where x is the vector of order statistics, is given by
E ( x ) = ~c0 + oc,

where c0 = 1 is the n x 1 vector of l's. The covariance matrix is given by o2V and hence the best linear unbiased estimate of (r from the G a u s s -

Tests of univariate and multivariate normality

287

Markoff Theory is given by


(~2 :~_ e t V - i x / c / V le.

An unbiased estimate of o 2 for any population with finite second moment is 6~ = Z ( x , - ) 2 1 ( n - 1) = s 2. A test of normality is thus to compare the ratio ( 6 / 6 ) 2 with 1, which is the statistic W apart from a constant multiplier. W is usually defined as W= (a'x) 2 = ( E aixi) 2

(~- l) ~
where
at etV - 1

E (x,- g) ~

(e,V-2e) l/2 Values of a can be found from tables given by Sarhan and Greenberg [70] for sampl.~s of sizes up to 20. For n >20, Shapiro and Wilk [74] suggest the following approximation to a.

ai":-2mici,
and

i = 2 , 3 ..... n - 1

a~= 2_ r{(n+l)}

"o- r(

.+02,J2

Hence the a coefficients are obtained for sample sizes of up to 50 and are given in Table 5 of [74]. Tables for W are reproduced in [65, Vol. 2, Tables 15-18].

3.2.

D'Agostino's D

A modification of the W-test was given by D'Agostino [20] which does not require any table of a weights, and is as follows. T

n2S
where

T=~{i ,=1 -1 2(n+l))

x(

288

K.V. Mardia

and T is a constant multiple of Downton's [27] unbiased estimate for the standard deviation of the normal distribution. Under H 0

E ( D ) = ( n - 1)F{(n- 1)/2} 1 2(2nrr)'/2F(n/2) ~-- 2X/g"


Asymptotically the standard deviation of D, k becomes A = ( 1 2 ~ / 3 - - 3 7 + 2 ~ r ) 1/2 = 0.02998598 247rn x/n Hence, an approximate standardised variable is y=A-l{D-(2X/rr) -'} and y is asymptotically N(0, 1) under H o. If H 0 is not true, the value of y will tend to differ from zero. A simulation study indicated that for alternative distributions with less kurtosis than the normal, y tends to be greater than zero and for distributions with greater kurtosis, y tends to be less than zero. D'Agostino [20] gives an improved approximation to D under H0, and tabulates percentage points for D obtained by using Cornish-Fisher expansions.

3.3.

Shapiro-Francia's W'

Shapiro and Francia [73] suggest that for large samples the ordered observations may be treated as if they were independent (i.e. vii=0, i ~ j ) and hence W can be replaced by

w ' - (b'x):
nS 2
where
C!

(c,c)l/2' ci being the expected value of the ith order statistic from a normal population. Values of c are given in [44] for sample sizes up to 400. That the test is consistent is shown in [71 ]. Pearson et al. [64] cast some doubt on the critical values of W' given by Shapiro and Francia [73], since for samples of size n =35, 50,51(2)99, the

Tests o f u n i v a r i a t e a n d m u l t i v a r i a t e n o r m a l i t y

289

simulated critical values were calculated from only 1000 samples. Pearson et al. [64] recalculate the lower percentage points of W', this time based on 50,000 simulations, for n--99, 100 and 125. A comparison of the two sets of values shows that the Shapiro-Francia values are somewhat higher, by amounts sufficient to show the test to be more powerful than it is.

3.4.

Weisberg-Bingham's fie'

A further modified W-statistic fie' was proposed by Weisberg and Bingham [85] through W' as

fie,=
where

nS 2

'e)

i 2 ---7-7-, ln+~J

i = 1,2,...,n

and ~ - l ( . ) is the inverse of the d.f. for N(0, 1). The statistic fie' is thus identical to the W' defined by Shapiro and Francia [73], with the substitution of ~ for c. This has the advantage over W and W' that no storage of constants is necessary, making it more suitable for machine computation, provided of course that a routine is available to evaluate ~ - l ( p ) . The approximations ci to c, suggested by Blom [10] are very close, especially in small samples, and because of this very close agreement between 5; and % the null distributions of fie, and W' are practically identical.

3.5.

A generalization of W

Puri and Rao [69] consider a generalization of the Shapiro-Wilk test. Consider the expected value of the i th order statistic to be
E ( x ( i ) ) = -/1 -~ -/ 2 e i -t-

"/3( C2- ~k) -t- -/4(e? --

~ e i ) -~ . . . ,

where X and/~ are chosen so as to provide orthogonal polynomials. When the underlying distribution is normal 71=~,-/2=0,]t3~--/4 ..... 0 (see Section 3.1). The Shapiro-Wilk's test essentially tests 72 = o, and Purl and Rao investigate whether any further information can be obtained by incorporating tests on Y3 and 74 with the W test. Under orthogonal polynomial representation, it is found that E(x) = 7161 + 72b2 + Y363 -t- Y464,

290 where
b l = C 0, b2 = C l ,

K. V. Mardia

b3=C 2

elW~co c;V-tc0 '

' V - lC 1 C3

b4= 3
c t i V - lCl '

and

c~=(c~..... c.0, j=0,1,2 .....


The best linear unbiased estimates of the y are given by

b~V- 1x
b~V- lbi ' with variance given by
O2
vG) = , ------i- "

blV

bi

Hence we can define the test statistics for 72 = o, 73 = T2= W 1 / 2 =


^

0,

~4 ~--0 as

3~2(biV-lb2) 6(b;V-ab2) l / a '

7 3 ~-

--~ (b;V-lh3)1/2,
^

T4 = - ~ (b~V-lh4) W2. From a Monte Carlo study against W alone, Puri and Rao that the use of W and T 3 jointly leads to a more efficient test normality against non-symmetric alternatives, and similarly and T4 jointly is more efficient in measuring departures kurtosis. [69] conclude procedure for the use of W from normal

4. 4.1.

Likelihood approach A shifted power transformation

One method of assessing normality is to use the Box and Cox [12] shifted power transformation,

x!~'~)= / {(x'+~)~- 1)/x, [ log(xi + ~),

x~0,x,>-~,
)~ = O, X i > - - ~.

Tests of univariate and multivariate normality

291

X1. . . . . Xn are independent N(/~, c~ 2) random variables if and only if X= 1. Hence we can test H 0 : ) t = 1 against H 1 : X # 1. Under H l, it can be shown (see [4, 12]) that the logarithm of the likelihood function, initially maximized with respect to/z and 02 is L .... (~,X) = - -2- log27r- ~- log62+ (X - 1)
n n

log(xi+~),

i=l
where
ni=l

and

n i=1

After substituting xi(a'x) in terms of x i, we can now maxim~e Lm~x(~,X) to obtain numerically the maximum likelihood estimates ~,h of ~,)t. The likelihood ratio test leads to the test criterion 2{Lm~(~,~)-Lm~x(~, 1)} which is asymptotically distributed as X~. Note that Lmax(~, 1) does not depend on ~ and therefore any value of ~ including ~, will maximize L(~, 1). Following the same procedure, a 100(1- a)% confidence region for and )t can be constructed for large n from

2{ Lmax(~,X ) -- Lmax(~,)k) } % X22(00,


where X2(a) is the upper 100a% point of X~. If the region contains X-- 1, then H 0 is accepted. The test of significance is of course a more stringent procedure. The above approximate procedures can be extended to cover linear models [12]. A n d r e w s [1] has proposed an exact procedure. Atkinson [6] gives an asymptotically normal test statistic derived from the slope of the likelihood function for these transformations. The Box and Cox test as well as the Atkinson test, are shown to be uniformly more powerful than the exact test of Andrews for a numerical example. This transformation also provides a transformation to normality if the hypothesis of normality is false. Kaskey et al. [50] also give a transformation to normality through the Pearson family of distributions. The Johnson family of distributions [48] would also be specifically useful for this purpose.

4. 2.

Shift and scale family

A likelihood ratio test is also proposed as a means for testing for normality by Dumonceaux, Antle and Haas [28]. They consider the following problem.

292

I. V. M a r d i a

Let xl,x 2..... x n be n independent observations from either fo(x;l~,o ) or fl(x; i~,o), where 1 ~x--tt

i=1,2,

and let f0(x;/~,a) and fl(x; ~,o) be the densities under H 0 and H 1 respectively. Dumonceaux, Antle and H a a s [28] define the test statistic R M L , the ratio of m a x i m u m likelihoods, such that

max I"I
RML =
/~,o i= 1 //

max ]-[ fo(xi; I~,o)


be' i= 1

The distribution of R M L is independent of the nuisance parameters. Three specific alternative distributions to the normal distribution are considered, (i) Cauchy, (ii) Exponential a n d (iii) Double Exponential. Critical values and powers of the test for the three cases are given. The test statistics simplify for cases (ii) and (iii) as follows. F o r H o : Normal against H n : Exponential, the test statistic is a function of D, where
t/

D= n S / ~] (x i-minx~).
i=l

For H 0 : N o r m a l against H ~ : D o u b l e exponential, the test statistic depends on a monotonic function of G, where G is defined such that

G = n S / ~ IXg- Xmed[.
i=l

This is similar to a except that the m e a n deviation is taken f r o m the median.

4.3. A Laplace-O2pe family


To motivate some of the tests, Hogg [47] considered the family of distributions of the form,

f(x;O)=(2F(1-+-O-1))-lexp(-[x[e),

-~<x<~,

0>0.

The distribution has mean zero. For 0 = 2, x is normal, whereas for 0 = l, x is double exponential and as 0--->~, x tends to the uniform distribution.

Tests of univariate and multivariate normuflty

293

The most powerful invariant test against scale for testing O= 01 against 0 - - 0 2, is found to be

n -1
i=1

Ix:2

<<K,
Ix/l,

n-I
i=1

where K is a constant chosen to give a size a test. Thus for testing against 0 2 = 4, we have the test statistic

01 =

m'41(m;) 2,
whereas for testing 01 = 1 against 02 = 2, we have the test statistic

rn'2/ E Ixil
The most powerful invariant test for scale as well as location turns out to be complicated [47], but the modification of the above statistics provide some justification for using b 2 and a. Hogg [47] recommends using the sample median in the m e a n deviation that appears in a, in place of Y.

4.4. Discriminating between models


Atkinson [5] discusses a method of discriminating between models. For example let fl(x;01) be the p.d.f, of N(/~, o) and f2(x; 02) be the p.d.f, of an alternative model. One can combine the two models by taking the p.d.f. proportional to

(f,(x;O,)) X{ A ( X ; 0 2)) 1 - - h
We are then interested in testing )~= 1. Atkinson provides a general asymptotic theory for such a test, but it seems that it has not been exploited for testing normality.

5.

Goodness-of-fit tests

5.1. X2 goodness-of-fit
The classical goodness-of-fit test is naturally the X2 test based on the X2 -statistic. Chernoff and Lehmann [13] show that when s parameters are estimated from the sample, X2 with k cells, is in fact asymptotically

294

K.V. Mardia

distributed as

xL.++ ,+<.v++.. +xy++,


where 0 < ~ 1 < . o . <h+ < 1, Yl ,Ys are independent N(0,1)variables. In this particular formulation however, we are not really better off, as the distribution of X2 depends on the unknown parameters ~, i = 1.... ,s. Watson [84] considers the X2 goodness-of-fit test, under the special case of testing for the fit to a normal distribution. Watson adopts the procedure of having fixed cell probabilities rather than fixed cell frequencies. Having estimated/~ and a 2 efficiently from the sample and decided on the n u m b e r of classes, k, construct the class boundaries, either (i) if k is even, such that is a boundary, with an equal number of intervals on each side, each of length some multiple of S or (ii) if k is odd, such that ~ is at the centre of an interval with an equal n u m b e r of class intervals on either side, again each of length some multiple of S, with S being the estimated standard deviation. Using this procedure, Watson shows that the distribution of X 2 for the case of normality, is of the same f o r m as that of Chernoff and L e h m a n n [13] with X2 distributed asymptotically as
. . . .

X2 3+a~ Y l2 -{-)~2Y2, 2
but here the formulation does not now depend on unknown parameters. In fact, we have

k X,= 1 - ,=,E {#2(i)/P,)


where

and

2t2= 1 - ~ i=1
1

k 1 E {4,~(i)/p,),

~(i)

= Z/~)(Zi) -- Z/_ l(~(Zi_ 1)'

(])(t) = - - e

t2/2

'

{Pi} are the desired probability contents of the intervals, and the zi are chosen to satisfy

pi = f(' 5.2.

,#(t) dt,

i= l,...,k.

Tests based on empirical distribution functions

Consider the following test statistics when q~ is N(0,1) and zi= {(x(i)- g ) / o ) , i = 1..... n. (i) The K o l m o g o r o v - S m i r n o v statistic K: K = m a x ( D +, D - ) ,

Tests of univariate and multivariate normality

295

where
D += max l~i~n

{(i/n)--zi} ,

D-=

l~i<~n

max [ z i - { ( i - 1 ) / n } ] .

(ii) C r a m e r - v o n Mises statistic Wz:


W a=

~ [zi-{(2i-1)/2n}]2+(l/12n).
i=l

(iii) The Kuiper statistic V:


V=D++D -.

(iv) The Watson statistic U2: U 2= W z - n Z where = -hi= 1

z i.

(v) The Anderson-Darling statistic A2:


A 2=~, [ ( 2 i - 1 ) { l o g z i + l o g ( 1 - z o + , - i ) } / n ] - n .
i=l

The real practical problem arises when/z and o are estimated. We know that/2 = Y and 0 2= S 2 are the maximum likelihood estimators of/~ and a 2 respectively. After replacing/2 and 6 in z i in place of/~, a in these tests, we denote K, if/2 .... as the values of K, W z .... etc. Note that y =(b(z) is uniform on (0, 1) but this probability integral transformation is influenced when parameters are estimated as investigated by David and Johnson [25]. In fact, from [25] it follows that the null distribution of K is different from that o f / ( but the null distribution o f / ~ does not depend on the nuisance parameters/z and o 2. However, there is a substantial difference between the null distributions of K and /(. This statement applies to other test statistics. (This fact was first observed for K by Lilliefors [51] through Monte Carlo percentage points.) Stephens [78] gives the (approximate) percentage points of these statistics for both cases and these are reproduced in [65, Vol. 2, Table 54]. A new development in this area is the construction of components of a statistic. Durbin et al. [30] show how to construct components ~,, 1,~,,2,... such that I/V f can be written in the following canonical form, namely

j=l

where the 2,,j are asymptotically independent N(0, 1) variables. 2,,~ can be

296

K, V. Mardia

represented as a linear combination of the trigonometric functions ~nj, such that

j=l

where (2 ) v'-fi ~ cos(~rj3)r) and fir=F(x(r);#,62).

Hence Durbin et al. [30] show how to test for departures from normality in the presence of unknown parameters. Asymptotic significance points are derived for the three statistics !iZ2,A~,U~. The points are tabulated in Table 2 of [30], and are similar to Stephens [78] for n ~ . De Wet and Venter [26] extend the Cramer-von Mises statistic Wf to

Q(~)= ~ [z i- (i/(n+ 1)} ]2~/(i/(n+ 1)},


i=l
where

l~(t) = [ t~((I)--l(t) } ]2.


For asymptotic theory see [41] and for approximate percentage points see Pettitt [66]. Pettitt [66] shows Q(~b) to be asymptotically equivalent under certain conditions to

j=3
where the ~n are estimates of the coefficients of the Edgeworth expansion. In particular

Z3n ~-'~(~/6 )( -- ~v/bl),

Z~4n= (5/~4 )(3 -- b2).

Thus Rn ^ 2 "~3X 1 2(x/bl)+~X 1 2(b2) so that it gives an omnibus test for departures from normality where the effect of b 2 is scaled down in comparison to the test of D'Agostino and Pearson [21]. Schafer et al. [72] and Dyer [31] consider a modified Kolmogorov E test,

E= ~ max(Di+,Di-)
i=1
where
D,- = e,-(iD, + =

i/n-4.

Tests of univariate and multivariate normality

297

Durbin [29] proposes a test statistic U-- max ( ;- - l~i~n n

j=l

/ &) , ,~

where ga=(n+ 2 - j ) ( c o . ) - c o _ l } ), j = 1..... n and O<c{o)<Co)< .. <ccn ) are obtained by ordering


C0=2 D CI = Z 2 - - Z 1. . . . .

cn= 1--2 . .

5. 3.

Transformations

Csorgo, Seshadri and Yalovsky [19] approach this problem of the Kolmogorov-Smirnov and related tests for unknown mean and variance from a different angle. The approach is by means of transformations on the original observations to give random Variables with known distributions. Let Xl,X z ..... x, be n ( > 4) observations from a normal distribution with unknown mean and variance. Define

Zi=(Xl+'''+xi--iXi+l)/'k/[i(i+l)],
and hence define

i=1

.....

n-1

y,=z#lzd,
and

YJ=(VJ)zJ+I/V[ z2+''"

+4]'

j = 2 ..... n--2.

Under H 0, the hypothesis of normality, the y,, i = 1..... n - 2, are independently distributed t variables with 1,2 ..... n - 2 degrees of freedom respectively. Thus, the problem has reduced to testing whether the ~(Yi) are independently distributed uniform variables on the interval (0, 1), and hence the standard Kolmogorov-Smirnov test etc. can be used. Alternatively define the y such that

y2=z +z, .....


where j+l---Put
j+l Sj+l= ZYi i=1

2 2 y j + , = z._2+z_,,

n--1 2

298

K . V . Mardia

and define
r

fir* = ~ , Y i / S j + , ,

r= l ..... j.

Under the hypothesis of normality, the ~b* behave as the order statistics o f j independent uniform random variables on the interval (0, 1). We can again use the Kotmogorov-Smirnov test or related tests. Alternatively, Pearson's probability product test gives
J

2Pf =--2
t.=l

log'q*

which is distributed as Xz with 2j degrees of freedom under the null hypothesis.


6.
6.1.

Miscellaneous tests
Entropy principle

The normal distribution maximizes entropy against any other distribution with the same variance o z. This maximum entropy is log{(2~re)l/2o}. Vasicek [82] applies this characteristic as a basis for a test of normality using the statistic Km,, such that

Km n _

2mS

{fI

i= 1

( x ( i + m ) __ x ( i - m ) )

}'"
,

where m is a positive integer, m < n / 2 and xo) x(i) = J x(0


Lx(~)

i < 1, 1 -<< i < n,


i>n.

Under the null hypothesis


P

gmn ~

~/(2~re)

as

n --~ oe , m--> oe , m / n--~O.

Put H ( f ) equal to the entropy of the alternative distribution which has variance a 2, then
Kin, --+ o - ' exp{ H ( f ) } < (2~re) 1/2.
P

Hence Kmn is consistent for such alternatives.

Tests of univariate and multivariate normality

299

There is however the problem of what the value for m should be. A Monte Carlo study was undertaken against a number of continuous alternatives. The simulations indicated that the most powerful tests were given for m = 2 when n = 10, m = 3 for n = 2 0 and m = 4 for n = 5 0 .

6.2.

U-statistics

Locke and Spurrier [52] propose tile use of U statistics for testing normality against non-symmetric alternatives, and consider statistics of the form U / s p, where U is chosen so as to be a location, invariant U-statistic and s 2 is the unbiased estimate of the variance, p ( > 0) is chosen to make U / s e scale invariant. Usually p is a positive integer. Locke and Spurrier consider a U-statistic of the form
(ft) -ln-2 n-1

up.= 3

i=1 j = i + l k = j + l

and d?p(Xl,X2, X3)'-'~(y 3 - y 2 ) p of XpX2, X 3. Thus the test statistics are

(y2--Yl) p, Y I,Y2,Y3 being

the order statistics

r,. = u , . / s,.
They have investigated T~, in greater detail and thus U1. = ( 3 ) - ' where ~ w,x(,), i=1

Under H o,

E[r,n]=0,
and var( T13) = al, var( TI4) -- ~ l (al+3a2), v a r ( r ~ , ) = 3!(n - 3 ) ! ( n ! ) - '

X ( al + 3 ( n - 3)a2 + l.5(n-- 3)(n--4)a3},

n >~5,

where a I = 1.03803994, a 2 = 0.23238211 and a 3 = 0.05938718. Approximate percentage points for T~,, are obtained.

300

K. V. Mardia

It is found that Tin and Tz. have good properties but T3n has no advantage over b 1. Note that if we construct U / s p with kernel
3

~(x,,~,x~)= E { ~ - ~ ( ~ , + ~ + ~ 3 ) )
i=l

we obtain b~, whereas if we take

~(x,, x2) = i x , - x21


we obtain the statistic D suggested by D'Agostino.
6.3. Combination of two test statistics

Spiegelhalter [77] considers obtaimng a test statistic for departure from normality for symmetric alternatives by combining two given test statistics for normality. Let T(~,hN) denote the most powerful location and scale invariant test for XN (normal) against a specified alternative ~0, where h denotes the shape of a symmetric distribution whose density is of the form

p(xilO, o,x) = ,,- ~o(Jx,- 0 l/olx),


0 and o being location and scale parameters respectively. Using the results of H~ijek and Sid~ik [43, p. 49], he obtains the most powerful test. In particular, for the uniform (hu) and double exponential distributions (XD), it becomes
T(Xu, I~N) = n - ' ( n -- 1)-' ( x ( , ) - x(,)) -("-O/p(xlXN),

T(;ko, XN)---2--("-O(n-2)! ~ j~l where


"4~ n-1

Wf'/p(xlXs),
jva 2 , - ~ + 1,
J= 2'2

j---~

+l--j,

wj= 4 v f - ' [ l + ( n - 1 ) ( x ( ( , / 2 ) + ' ) - x ( , / 2 ) ) v ; - ' ] '

n_n

+1,

= E Ix(i~-x(j)l,
and

p(xl~kN)=lr( n-- l~ln-,/2{~s~(n_ 1)} -(n-l)/2


These can be regarded as two given test statistics. Spiegelhalter also proposes
T = T(~.u, hN) +

T()kD,aN)

Tests of univariate and multivariate normality


as a suitable combined test statistic, justifying the combination tically and by a Bayesian approach. Note that the uniform and exponential can be considered as being a short and a long tailed tive to normality respectively. Spiegelhalter shows that T is asymptotically equivalent to the statistic

301

heurisdoublealternasimpler

T , = ( ( CnU)--(n-- l)-~- ( n-1/2a)--(n-- l) ) 1/(n-l)


where

c , = ~1 n - ~(n!) 1/~"-~).

Following the same theme, Locke and Spurrier [52] consider testing against alternatives that have both tails heavy or both tails light and proposed kernels of the form
k-I ~(Xl .....

Xk)= E Ci(X(i+l)--X(i)Y'
i=1

p>O

where c~..... ck_~ are constants such that ci= Ck_ i. F r o m trying a number of kernels, Locke and Spurrier decided on the statistics (a) TD, in which k = 2 , c I = 1 and p = 1 and which is a constant times D'Agostino's D statistic and (b) T*, with k = 4 , c I = c3=0, c2= 1 a n d p = 1.

6. 4.
as

Trimmed statistics

Hogg [47] suggested the statistic Q based on the trimmed sample defined Q= ( U(l/20) -/](1/20)) / (U(1/2) - L(1/2)), where U-(/3) = mean of the n/3 largest order statistics, /~(/3) = mean of the n/3 smallest order statistics. Tiku [81] gives another test. Suppose the sample of n observations is ordered and trimmed by removing the r~ smallest and the r 2 largest observations, to leave the censored sample

Xa~Xa+l~ "'" ~Xb,

a=r+l,

b=n-r

2.

The choice of r 1 and r 2 is determined by the skewness or otherwise of the alternative non-normal distribution, i.e. the assumption is that one has a priori knowledge. The following rules are proposed. (i) For a positively skewed distribution r 1= 0 and r 2 = ( 0 . 5 + 0.6n). (ii) Similarly for a negatively skewed distribution, r x= ( 0 . 5 + 0 . 6 n ) and
r2~---0.

302

1(. V. Mardia

(iii) For a symmetric distribution r 1= r 2 =(0.5 +0.3n). F r o m [79, 80] an efficient estimator % of the population standard deviation can be obtained f r o m the c e n s o r e d sample under the hypothesis of normality. Then the test statistic 7' is defined as

r = ( ' - l)A c (,A - 1)s


where qi = ri/n, i= 1,2; A = 1 - q l - q2. N o w 6C is defined by

6c= ( B + ( B a + 4 A C ) 1/2}/2A,
where

ql = r l / n,

q2 = r 2 / n,

A = 1 - ql -- q2,
- qlal)K,

B= q2a2Xb--qlalXa--(q2a2 b

C= I E xiZ + qzf12x~- q, fl,xZ, --(1-- q1- q2 + q2f12-- ql [31)K2,


n t.~ a ~ K--- 1 xi + q2 fl2Xb -- ql fllXa

n i=a 1 - - q l - - q 2 d - q 2 f l 2 - - q l f l l '

and al,fll and 0~2,~2 are chosen to give good fits to O(z)/O~(z)= Ol.l-[--fllZ and eO(z)/9(z) = a 2 + B2z respectively where 'I,(z) = 1 - ~(z). Asymptotically, the values of a l, a 2 and E1,82 are as follows
al=~(tl)/ql--flltl, a2=~(t2)/q2--f12t2,

ill = - - q ~ ( t l ) ( t l + ~ ( t l ) / q l } / q l ,

f12 = - - q ~ ( t z ) ( t 2 - - q ~ ( t z ) / q 2 ) / q 2 ,

where rb(q)= ql and 't~(t2)= q2. The critical region of the test consists of small values of T. Percentage points are given.

6.5.

The gap test

Andrews et al. [2] propose the gap test. Consider


gi =

X(i+l)--X(i),
c(i+ 1) -- e(i)

i=l,...,n--1

where c(j) is the expected value of the jth order statistic from a N(0, 1) distribution. Under the null hypothesis of normality, the g/will be independent exponential variables. Andrews et al. [3] suggest comparing the means

Tests of univariate and multivariate normality of adjacent cells of the g/. E.g. define
(n-- 1)/4 3(n-- 1)/4 n-- 1

303

Sl=

E gi, i= 1

gm=

E gi, i=(n-- 1)/4

Su =

E gi i= 3(n-- 1)/4

where the s u m m a t i o n s have nl, n2,n I observations respectively, where 2n 1+ n 2 -- n. Define gl = Sl/nl,g,~, -- Sm/n2,gu = S u / n l . U n d e r the null hypothesis g.1 and gu have m e a n a n d variance ~, o2/n~ respectively, while g,, has m e a n o and variance o2/n2. T h u s r l = g l / g m a n d r , = g u / g m are distrib u t e d as F(2nl,2n2) a n d hence for large n, r l a n d r~ are a p p r o x i m a t e l y n o r m a l with m e a n one a n d v a r i a n c e equal to ( 1 / n l + 1/n2). A n o m n i b u s test statistic is thus, with 2n I = n 2 q = -~- ( 3 ( r l - 1 ) 2 - 2(r, - 1 ) ( r , - 1) + 3(r u - 1) 2 ) where q is a p p r o x i m a t e l y distributed as a X2 variable.
n1

6. 6.

Probability plots

If we plot x(i ) against its quantile qi defined b y q i = ~ - l ( ~ i ) where ~i = i / ( n + 1), (or ~i = (i - )In), then under H o, we shall expect the points (x(i),qi), i = 1 . . . . . n, to lie on a straight line. This graphical m e t h o d is simplified by using n o r m a l probability p a p e r where (x(0, ~i), i -- 1. . . . . n, will a p p e a r directly as points scattered a r o u n d a straight line u n d e r H o. Other choices of ~i include i / n , a n d ( i - )In. However, these values of ~i are not necessarily the best values for a linear plot. I n d e e d Blom [10] has suggested

~i=(i--~)/(n+)
and Benard a n d B o s - L e v e n b a c h [9] have suggested ~i = (i -- 0 . 3 ) / ( n + 0.4). N o r m a l probability plots also yield estimates of /z a n d o 2 using the intercept and slope of the line x(i)=l~+~io. T h e best linear unbiased estimates of/~ a n d o 2 are given, if the plotting points are defined such that

= dP(b/(b'b)),
where

b' = c'lV- l/c'lV- lc I


in the notation of Section 3. Barnett [8] r e c o m m e n d s the G u p t a m e t h o d

304

K.V. Mardia

[42] of plotting ~ = Cb(ci), i = 1 . . . . . n, which is comparatively easy to compute, given tables of the ci. Filliben [32] proposed a specific test for normality based on probability plots. Instead of considering the mean to be the measure of location for the i th order statistic, consider instead the median. Define m(i) to be the theoretical median value for the i tb- order statistic under H 0, then the plot of x(o against m(o will be approximately linear. Hence Filliben [32] suggests the test statistic r, where r is the correlation between (x(o,m(o), i = 1..... n. Under H 0, r should be near to one. The importance of the graphical method should not be underestimated and it is always worthwhile to supplement a test procedure with a plot.

7.
7.1.

Power studies
M a i n studies

For ease of reference, we give a summary of the main statistics in Table 1. Shapiro et al. [75] launched the first major power study into the behaviour of the various tests for normality. The nine statistics they considered were (i) Shapiro-Wilk's W, (ii) ~/bl, (iii) b2, (iv) Kolmogorov-Smirnov's K, (v) Cramer-von Mises W z, (vi) AndersonDarling's A 2, (vii) Durbin's T, (viii) X2, and (ix) Studentized range, u. 12 families of alternative distributions were considered (a) Beta, (b) Binomial, (c) Chi-Squared, (d) Double Chi-Squared, (e) Johnson $8, (f) logistic, (g) log normal, (h) Non-central Chi-Squared, (i) Poisson, (j) Student t, (k) Tukey, and (1) Weibull. A number of different values of the parameters within each family were also considered, giving 45 alternative distributions. Shapiro et al. [75] draw a number of conclusions from their Monte Carlo study. (1) The Shapiro-Wilk's W provides a generally superior omnibus measure of non-normality. (2) The tests based on the empirical distribution function are not very powerful. But, see Stephens [78] below. (3) The studentized range u, has good properties against short-tailed symmetric distributions, but is hopeless against asymmetry. (4) A combination of ~/b 1 and b 2 is generally quite powerful, but is usually dominated by W. This predominance of W, stirred further investigators into action. Dyer [31] considered seven test statistics used for testing H 0. (i) Cramer-von

Tests of univariate and multivariate normality

305

~+
~q

306

K. V. ~lardia

Mises W 2, (ii)Anderson-Darling's A 2, (iii) Kolmogorov-Smimov's K, (iv) Watson's U 2, (v) Kuiper's V, (vi) Modified Kolmogorov E, and (vii) Shapiro-Wilk's W. These statistics were tested against four alternative distributions, namely, uniform, exponential, double exponential, and Cauchy. Two cases are considered, (a)/~ and 02 unknown and (b)just a 2 unknown. A Monte Carlo power study indicated that W and A 2 do generally better than the rest, A 2 being slightly superior for a double exponential or Cauchy, while W is better for a uniform or exponential. However the interesting results are those for the power between the cases with/z and o 2 unknown and the cases with just o 2 unknown. The power of the tests are always greater when one assumes /~ and 02 are unknown, and hence are estimated from the sample A similar study is described by Stephens [78]. The tests under investigation here are (i) Kolmogorov-Smirnov's K, (ii) Cramer-von Mises W 2, (iii) Kuiper's V, (iv) Watson's U 2, (v) Anderson-Darling's A 2, (vi) X2, (vii) Shapiro-Wilk's IV, (viii) D'Agostino's D, and (ix) Shapiro-Francia's W'. The range of alternatives considered are (a) uniform, (b) Cauchy, (c) exponential, (d) Laplace, (e) lognormal, (0 Weibull, (g) Tukey, (h) Student t, and (i) X2. Contrary to Shapiro and Wilk [74], and Shapiro, Wilk and Chen [75], (see above), where the empirical function statistics are shown in a very poor light, Stephens [78] shows these statistics to have powers "roughly comparable" to that of Shapiro-Wilk's W. In explanation of this discrepancy, Stephens [78] points out that in the previous studies, the wrong critical values were used for the empirical d.f. statistics since the mean and variances were assumed known. For a true comparison with W, the mean and variance should be calculated from the sample. Independently Stephens [78] comes to the same conclusion as Dyer [31] that when one is testing for normality, the mean and variance should never be assumed to be known, even if they are in fact known! The Monte Carlo study in [78] indicates that the test procedure using W for samples of size less than 50, and W' for samples greater than 50, performs slightly better than the leading contenders, A 2 and W 2, from the empirical d.f. statistics. Pearson et al. [64] have produced an extensive Monte Carlo power study for 8 test statistics against 58 non-normal alternative distributions. The test statistics considered can be split into two groups, omnibus tests and directional tests, where a directional test is one which is especially sensitive to an expected type of departure from normality. The omnibus test statistics are (i) Bowman and Shenton's y2, (ii) Pearson et al.'s R, (iii) Shapiro-Wilk's W, and (iv) D'Agostino's D. The directional tests are (i)

Tests of univariate and multivariate normality

307

X/bl, (ii) b2, (iii) right angle, and (iv) D (one-tailed). The right angle test is given by rejecting the null hypothesis of normality if 1/bl a n d / o r b 2 are outside the upper 100a*% limits x/bl(a*) a n d / o r b2(a* ) (see Section 2.8). The alternative family of distributions under consideration are (a) Beta~ (b) X2, (c) Student t, (d) Johnson's S s, (e) Johnson's Su, (f) lognormal, (g) Weibull, (h) logistic, (i) symmetrical Tukey, (j) Laplace, (k) scale-contaminated normal, and (1) location-contaminated normal. Let us divide the results into two sections under the headings 'Symmetric alternatives' and 'Skew alternatives'. Symmetric alternatives. (i) For platykurtic populations, y2 and R have pretty much equal power, with W being more powerful in general and D not doing particularly well. For the two relevant directional statistics, b 2 (lower tail) is much superior to upper-tailed D and is more powerful than the four omnibus statistics in every case. (ii) For leptokurtic populations, generally the omnibus statistics in descending order of power are Ys 2, R, D, W. For distributions with very long tails, however, D is the most powerful test. It would appear that if prior knowledge is available, the directional tests have greater power. Skew alternatives. For these cases, W is vastly superior to the other omnibus tests. However, if there is prior information that if the population is not normal, then it will be positively skewed, the Monte Carlo study suggests the use of either x/b~ upper tail, or the right angle test, both of which are more powerful than W for this case. Table 2 gives a summary of various other power studies.

7.2. Effect of ties and grouping


Pearson et al. [64] raise the problem of the effect which ties and the grouping of data have on the various test statistics proposed for testing the null hypothesis of normality. Let l be the ratio of standard deviation to rounding interval. Pearson et al. [64] consider the effect of grouping on ~/b 1 and W, for n = 2 0 , 5 0 and /=3,5,10, and also on D and W' for / - 3 , 5 , 8 , 10, based on 1000 simulated samples from a normal distribution in each case. The effect of grouping on ~/b~ can hardly be said to be significant, and similarly for D but to a lesser extent. For W with l = 3 and 1= 5, the effect is quite pronounced, but falls off for 1=10, whilst W' is extremely unsatisfactory for practically all the cases, suggesting that W' should be used with great caution in the presence of multiple ties. It should be noted that if we are given a grouped data or a large scale data, the tests based on order-statistics are hardly practical.

308

K. V. Mardia

t~

t~

0 eql: ~

~A 0 ..,~,

~,~ ~ o
e~

i
J .~

,-.i

el)

->

z~
&

~9 ->
0

Tests of univariate and multivariate normality


) 0

309

,,..2

.~
.'~ .u aN

'.~ 0 o.~ ~

~o

F
E

F
E

&
.c

&
.c

310

t(. V. Mardia
Tests of multivariate normality

8.

8.1.

Introduction

L e t x 1. . . . , x n be n observations of a random vector x with p components and let ~ and S be the sample mean vector and covariance matrix respectively, corresponding to the population statistics/~ and ~. Our null hypothesis is now that X is multivariate normal. One simple procedure is to test the marginal normality of each of the p components by using univariate procedures. Of course, marginal normality does not imply multivariate normality but the presence of non-normality is often reflected in the marginal distributions. The problem of combining these tests (or simultaneous tests) requires resolving. Naturally, however, one should recognize that although one may be able to detect non-normality through the marginal normality tests, tests which exploit the multivariate structure should be more sensitive. These tests are reviewed in greater depth by Gnanadesikan [39, pp. 161-195]. Consequently our review will be brief and to some extent supplementary. One of the most important distinctions between the various procedures is whether or not they are invariant under arbitrary non-singular linear transformations of x. Some of the tests which are not invariant can be described as coordinate dependent techniques. Although invariance considerations lead to sensitive tests, there are situations where the particular choice of component is important. We shall write

4=(Xi--x)tS-l(xi--X),

rij=(xi--x)'S-l(xj--X)

for Mahalanobis distance of x i from ~ and Mahalanobis angle between the vectors x i - ~ and x j - g.

8.2.

Univariate generalizations

8.2.1. Skewness and kurtosis The first formal test of multinormality was proposed by Mardia [56] through multivariate measures of skewness and kurtosis. Invariant combinations of the third and fourth order moments are found, which have the maximum effect on the distribution of Hotelling's T 2 under non-normality. Let S be the sample covariance matrix with S - ~ = (siJ). Then a measure of multivariate skewness can be defined as
=
"~ i=lj=l

blP

~.

Tests of univariate and multivariate normality

311

Asymptotically nbl,p/6 is distributed as X2 with p(p + l)(p + 2 ) / 6 degrees of freedom. Empirical significance points for p = 2, 3 and 4 are given in [57,

581.
A measure of multivariate kurtosis defined by Mardia [56] is given by
b2 p = ' hi=

~ r4.
1

Asymptotically b2,p is normally distributed with mean p(p +2) and variance 8 p ( p + 2 ) / n . Again, empirical significance points for p = 2 , 3 and 4 are given in [57, 58]. Note that the r population counterparts are

B,.=e{(x_.), x ,(y_.)}3,
where X and Y are independently and identically distributed. An algorithm to compute bl, p and b2,p is given in [59]. Expressions for these measures in terms of moments are given in [56]. Obviously, these tests are coordinatefree.

8.2.2

Union intersection principle

Malkovich and Afifi [55] propose generalizations of the univariate statistics X/bl and b 2, and Shapiro-Wilk's W, by use of Roy's union-intersection principle, to test for departures from multivariate normality. (i) Skewness and Kurtosis. Multivariate skewness is defined by Malkoo vich and Afifi [55] as

[var(c'x)] 2 and multivariate kurtosis is defined as

[
(a)

2 1

for some vector e. Using Roy's principle, we accept the null hypothesis if

b'{= rraxbl(c)<kb,,

where kb, is a constant and bl(c ) is the sample counterpart of ill(c). (b) (b~) 2 = max [ b2(e) - k] 2 -<< kb2,

312

K. V. Mardia

where b2(c) is the sample counterpart of/32(e), kb2 is a constant, and k-->3 as n--~ oe. (ii) S h a p i r o - W i l k . Malkovich and Afifi [55] propose the multivariate generalization of Shapiro-Wilk's statistic 14" as
t _ 2

w(e) =

The hypothesis of multivariate normality is accepted if W*= min W(c)/> k w,


c

where k w is a constant chosen to give a size a test. These tests are also coordinate free but involve a considerable amount of computing effort.
8.2.3 K o l m o g o r o v - S m i r n o v a n d Cramer--von Mises statistics

Note that Vii= r~ is asymptotically distributed as X2 under H 0. Defining S(V) to be the sample c.d.f, based on V we can apply any univariate 2 goodness of fit test to test the null hypothesis that V is Xp. Significance points for the C r a m e r - v o n Mises statistics (CM*) and Kolmogorov-Smirnov statistics (K*) for the above case by Monte Carlo methods are given in [54]. Subsequent work is given in [55]. Again these tests are coordinate free.
8. 2.4. Transformation methods

Consider the simple power transformation to normality discussed in Section 4.1, in the multivariate context. The linear transformation ~ ' = (Xl,X z..... Xp)=l' is the only transformation consistent with the null hypothesis of multivariate normality, as considered in [2]. Let X = (xij), i = 1,2 . . . . . n, j = 1. . . . . p, Xij > 0 be the observation matrix. For given ~, define x,.(j x) such that

[ logxij,

Xj=0.

The maximum likelihood estimates for/L and 22 are thus /2 = 1 X(X),1,


n

and
= (x
n

- 1#').

Tests of univariate and multivariate normaliO~

313

Hence

~loglZl +

1) ~ logx/j
i=1

and ~ is the value of X for which Lmax(~ ) is maximized. The null hypothesis of normality can then be tested by considering 2{ Lmax(~,) - Lmax(1)) which is asymptotically distributed as X~. The tests are coordinate dependent. 8. 3. Strict multivariate procedures

8.3.1. Directional normality Consider z i = S-1/2(x i - ~ ) . Andrews et al, [2] define d e such that

d,~
=

WiZi
Wi = IIz;ll ~

i= 1

WiZi
i=1

'

where Ilxll is the length of the vector x and a is an unspecified constant. For a = - 1, d e is a function of the orientation of the z/s while for a = 1, d~ is more influenced by those observations distant from the mean. In general terms, if a > 0, d~ will tend to point to clustering away from the mean, while for a < 0, d~ will point to clustering near the centre of gravity of the observations. Hence d*=Sl/2d~, for a given value of a, can be regarded as a univariate sample upon which any of the standard tests for departures from normality can be employed. However because of the data-dependence Of the approach (as well as coordinate dependence), the procedure should only be used as a guide since the formal significance levels are not directly applicable. 8.3.2. Radius and angles and graphical techniques 2 2~ X p2 and therefore we can plot r 2 against Xp Since under H 0, we have r,. as suggested by Cox [16] and Healy [45]. For the bivariate case, define 0i to be the angle z i makes with a prescribed line. The 0i are then approximately uniformly distributed on [0, 27r) under the null hypothesis a n d the r i and 0~ are approximately independent. For moderate sample sizes the dependence should be negligible.

314

32. K Mardia

The exact marginal distribution of r~ is a constant multiplier of a beta rather than a X2, but Gnanadesikan and Kettenring [40] show that even for moderate samples, n = 20, in the bivariate case, the difference in using X2 instead of beta is insignificant. (i) Bivariate Case ([3], [39, pp. 172-174]). Order the n squared radii r~,) < r~2) ~< . . . <~r{n) and plot against the corresponding expected values for the c.d.f, from a X~ distribution. Similarly for the 0* (0* = 0J27r), plot 0* against the expected values of the c.d.f, of a uniform distribution. Under the null hypothesis of bivariate normality, the two plots should be approximately linear. (ii) Higher Dimensions ([3], [39, pp. 172-174]). In this case, the appropriate X2 distribution is X2 for the squared radius plot. For one of the angles, the appropriate procedure is still a uniform probability plot, but the remaining ( p - 2 ) angles have p.d.f.'s proportioned to sinP-l-JOj, j = 1,2 ..... p - 2 and hence plot the n ordered values of the jth angle against the corresponding expected values of the c.d.f, from this distribution.

8.3.3. Nearest distance test


In this test of Andrews et al. [3], nearest neighbour distances for each point are transformed to standard normal deviates where under H 0 these distances are coordinate free. The procedure is as follows. 1. Compute the residuals Yi = S - 1 / 2 ( x , - ~), i = 1..... n. Define Z~ as the vector with the p-elements of dp(yu), Yij being the jth element of Yi. These Zi's are now transformed onto the unit hypercube. 2. A nearest neighbour distance in this hypercube may be calculated by using the metric

dij = d( Zi, Zj) = I~x { inin [ [ Zki - Z d, IIzk,.- z d -

11]}
ZD

and the volume of the set enclosed by a distance d from the point (Zj:

d(Z;,Zj) <d)

is V(d)= (2d) p. Under the null hypothesis, V(d) has an exponential distribution.Therefore for fixed i, if

d(i)=mind( i,j) < 1/2n '/p,


and if

i~j,

d(i,j)> 1/2n l/p, j < i ,


calculate

xv-7:-]5 ]

Tests of univariate and multivariate normafity

315

The w i are then independent of the Z i. The test of independence can be performed by regressing the w i on 1, Zil . . . . . Ze, Z~]..... Zip z.
8. 3.4. Maximum curvature

Cox and Small [18] suggest, as the most direct approach for obtaining an invariant procedure for testing multivariate normality, that one should find linear combinations of the original variables, such that two new variables are obtained which possess the property that one of these has m a x i m u m curvature when it is regressed on the other. Let the p 1 vector variable x have components with zero mean and covariance matrix l~. Define the linear combinations Y and W such that where Y=a'x and W=b'x

a ' l ~ a = b ' ~ b = 1,

and Y and W have zero mean and unit variance. Define ~/-- y / [ E ( W 4 ) - 1 -{E(W3)}z]1/z where y is the least squares regression coefficient of Y on W 2, adjusting for linear regression on W. "172 can be viewed as the proportion of the variance of Y accounted for by the quadratic component in the least squares regression of Y on W and W 2. Cox and Small [18] obtain an expression for ~q2(b) which is defined as the supremum of ~2(a, b) over a for fixed b. Invariably the solution has to be found numerically. Let ~2(b) be the sample value of ~2(b) and let its m a x i m u m be denoted by ~/m~" ^2 Under H 0, simulations indicate that for n >/50 and p ~<6, log~am~x is normally distributed with
iz=log(5pR/8n) 8.3.5. Student-t approach

and

o=log(O.53+ 3.87/p).

Let (xil,xi2), i = 1 . . . . . n be n observations f r o m a bivariate normal distribution under the null hypothesis. Cox and Small [18] define Qz, l to be the t statistic for testing the significance of the regression coefficient of X 2 on X~, in the case in which X 2 is regressed on X 1 and X]. Similarly define QI,2. Under the null hypothesis the joint distribution of (Q2,1, QI,2) is asymptotically bivariate normal with zero mean and unit variance. Let 10= corr(X1,X2) and hence Cox and Small [18] obtain the asymptotic correlation between Q2,1 and Q1,2 as corr(Q2,1, Q1,2) = 0( 2 - 312) Two test statistics are proposed.

316

K.V. Mardia

(i) max([Q2,1l, [Q~,21) which can be tested from tables of the bivariate normal distribution.

(ii)

[ Q2,, Q1,2] r ( 2 _ 3r2)

I 1 r 23r2 llE.21
1 Q,,2

where r is the sample estimate of P- The above quadratic form under H 0, is asymptotically X2 with 2 degrees of freedom. For the case when p > 2, these statistics can be easily extended and are given in [18]. The test is coordinate dependent.

8. 4.

Miscellaneous tests

In the univariate case, Durbin [29] has proposed a reduction of the composite null hypothesis of normality (versus non-normality) to a simple one using a randomized procedure. This technique is extended to the multivariate case by Wagle [83]. As is usual with randomized procedures a given set of data need not always yield the same decision. Hensler et al. [46] have proposed a test procedure which reduces the composite null hypothesis of multivariate normality to a simple null hypothesis, that reduces to testing uniformity of a set of observations between 0 and 1. However, the proposed test seems to involve arbitrary steps in formulating the transformation. Dahiya and Gurland [23] give a test of fit based on generalized minimum x2-technique. The special case of testing of fit of a bivariate normal is investigated.

8. 5.

Power studies

Malkovich and Afifi [55] compared the empirical power of b'~, b~, W*, CM* and K* when (i) the x i components are independent logN(0, 1), uniform (0, 1) and Student t with 4d.f. and (ii) when the xi are from mixtures of N(0, I) and N ~ , ~). For most of the alternatives in their study a t p = 2 , CM* and K* had nearly the same power and were generally no better than one of b~, b~" or W*. Recently, Giorgi and Fattorini [38] compared the empirical power for W*,(bl,p,b2,p), CM* and K* and directional criterion GN1 and GN2 obtained on applying Shapiro-Wilk's statistics to d I and d_ 1. The alternatives consist of taking x/as independently and identically distributed (i) X~ 2, t,--4, 10 and (ii) LN(0, 1). They conclude (i) powers increase smoothly with p for each test, (ii) W* shows the greatest power for these alternatives, (iii) for n >i 50, (b~,p,ba,p) is recommended and (iv) other tests are generally no better than W* (any n) and (bl.p,b2,v), (n large). It seems the power studies

Tests of univariate and multivariate normality

317

are not yet extensive biat the conclusions are similar to the univariate case~ viz. any new test proposed should be compared with W* or (bl,p,b2,p).

Acknowledgement I wish to express my deepest gratitude to Robert Edwards for his valuable help and comments.

References
[1] Andrews, D. F. (1971). A note on the selection of data transformations. Biometrika 58, 249-254. [2] Andrews, D. F., Gnanadesikan, R. and Warner, J. L. (1971). Transformations of multivariate data. Biometrics 27, 825-840. [3] Andrews, D. F., Gnanadesikan, R. and Wariaer, J. L. (1972). Methods for assessing multivariate normality. Bell Laboratories Memorandum. [4] Andrews, D. F., Gnanadesikan, R. and Warner, J. L. (1973). Methods for assessing multivariate normality. In: P. R. Krishnaiah ed., Multivariate Analysis 111, Academic Press, New York, 95-116. [5] Atkinson, A. C. (1970). A method for discriminating between models (with discussion). J. Roy. Statist. Soc. B 32, 323-353, [6] Atkinson, A. C. (1973). Testing transformations to normality. J. Roy. Statist. Soc. B 35, 473 -479. [7] Barnett, V. (1975). Probability plotting methods and order statistics. Appl. Statist. 24, 95 - 108. [8] Barnett, V. (1976). Convenient probability plotting positions for the normal distribution. Appl. Statist. 25, 47-50. [9] Benard, A. and Bos-Levenbach, E. C. (1953). Her uitzetten van waarnemingen op waarschijnlijkheidspapier. Statistica 7, 163-173. [10] Blom, G. (1958). Statistical Estimates and Transformed Beta-Variables. Wiley, New York. [11] Bowman, K. O. and Shenton, B. R. (1975). Omnibus test contours for departures from normality based on ~/b 1 and b 2. Biometrika 62, 243-250. [12] Box, G. E. P. and Cox, D. R. (1964). An analysis of transformations. J. Roy. Statist. Soc. B 26, 211-252. [13] Chernoff, H. and Lehmann, E. L. (1954). The use of maximum likelihood estimates on X 2 tests for goodness of fit. Ann. Math. Statist. 25, 579-586. [14] Chernoff, H. and Lieberman, G. J. (1954). Use of normal probability paper. J. Arr~ Statist. Assoc. 49, 778-785. [15] Chernoff, H. and Lieberman, G. J. (1956). The use of generalised probability paper for continuous distributions. Ann. Math. Statist. 27, 806-818. [16] Cox, D. R. (1968). Notes on some aspects of regression analysis. J. Roy. Statist. Soc. A 131, 265-279. [17] Cox, D. R. and Hinkley, D. V. (1974). Theoretical Statistics. Chapman and Hall, London.

318

K. V. Mardia

[18] Cox, D. R. and Small, N. J. H. (1978). Testing multivariate normality. Biometrika 65, 263-272. [19] Csorgo, M., Seshadri, V. and Yalovsky, M. (1973). Some exact tests for normality in the presence of unknown parameters. J. Roy. Statist. Soc. B 35, 507-522. [20] D'Agostino, R. B. (1971). An omnibus test of normality for moderate and large sample sizes. Biometrika 58, 341-348. [21] D'Agostino, R. B. and Pearson, E. S. (1973). Tests for departures from normahty. Empirical results for the distributions of b 2 and ~/b I. Biometrika 60, 613-622. [22] D'Agostino, R. B. and Rosman, B. (1974). The power of Geary's test of normality. Biometrika 61, 181-184. [23] Dahiya, R. C. and Gurland, J. (1973). A test of fit for bivariate distributions. J. Roy. Statist. Soc. B 35, 452-465. [24] David, H. A., Hartley, H. O. and Pearson, E. S. (1954). The distribution of the ratio, in a single normal sample of range to standard deviation. Biometrika 41, 482-493. [25] David, F. N. and Johnson, N. L. (1948). The probability integral transformation when parameters are estimated from the sample. Biometrika 35, 182-190. [26] De Wet, T. and Venter, J. H. (1973). Asymptotic distributions for quadratic forms with applications to tests of fit. Ann. Statist. 1, 380-387. [27] Downton, F. (1966). Linear estimates with polynomial coefficients. Biometrika 53, 129-141. [28] Dumonceaux, R., Antle, C. E., and Haas, G. (1973). Likelihood ratio test for discrimination between two models with unknown location and scale parameters (with discussion). Technometrics 15, 19-31. [29] Durbin, J. (1961). Some methods of constructing exact tests. Biometrika 48, 41-55. [30] Durbin, J., Knott, M. and Taylor, C. C. (1975). Components of Cramer-von Mises statistics II. J. Roy. Statist. Soc. B 37, 216-237. [31] Dyer, A. R. (1974). Comparisons of tests for normality with a cautionary note. Biometrika 61, 185-189. [32] Filliben, J. J. (1975). The probability plot correlation coefficient test for normality. Technometrics 17, 111-117. [33] Fisher, R. A. (1930). The moments of the distribution for normal samples of measures of departure from normality. Proe. Roy. Soe. A, 130, 16. [34] Gastwirth, J. L. and Owens, M. G. B. (1977). On classical tests of normality. Biometrika 64, 135-139. [35] Geary, R. C. (1935). The ratio of the mean deviation to the standard deviation as a test of normality. Biometrika 27, 310-332. [36] Geary, R. C. (1936). Moments of the ratio of mean deviation to the standard deviation for normal samples. Biometrika 28, 295-305. [37] Geary, R. C. (1947). Testing for normality. Biometrika 34, 209-242. [38] Giorgi, G. M. and Fattorini, L. (1976). An empirical study of some tests for multivariate normality. Quaderni dell'lnstituto di Statistica 20, 1-8. [39] Gnanadesikan, R. (1977). Methods for Statistical Data Analysis of Multivariate Observations. Wiley, New York. [40] Gnanadesikan, Ro and Kettenring, J. R. (1972). Robust estimates, residuals, and outlier detection with multiresponse data. Biometrics 28, 81-124. [41] Gregory, G. C. (1977). Large sample theory for U-statistics and tests of fit. Ann. Statist. 5, 110-123. [42] Gupta, A. K. (1952). Estimation of the mean and standard deviation of a normal population from a censored sample. Biometrika 39, 260-273. [43] Hhjek, J. and Sidak, Z. (1967). Theory of Rank Tests. Academic Press, New York. [44] Harter, H. L. (1961). Expected values of normal order statistics. Biornetrika 48, 151-165.

Tests of univariate and multivariate normality

319

[45] Healy, M. J. R. (1968). Multivariate normal plotting. Appl. Statist. 17, 157-161. [46] Hensler, G. L., Mehrotra, K. G. and Michalek, J. E. (1977). A goodness of fit test for multivariate normality. Comm. Statist. Theor. Meth. A 6, 33-41. [47] Hogg, R. V. (1972). More lights on the kurtosis and related statistics. J. Amo Statist. Assoc. 67, 422-424. [48] Johnson, N. L. (1949). Systems of frequency curves generated by methods of translation. Biometrika 36, 149-176. [49] Johnson, N, L. and Kotz, S. (1970). Distributions in Statistics. Houghton Mifflin, Boston. [50] Kaskey, G., Kolman, B., Krishnaiah, P. R. and Steinberg, L. (1961). Statistical techniques in transistor evaluation: transformations to normality. Technical report, Applied Mathematics Department, Remington Rand Univac. [51] Lilliefors, H. W. (1967). On the Kolmogorov-Smirnov test for normality with mean and variance unknown. J. Am. Statist. Assoc. 62, 399-402. [52] Locke, C. and Spurrier, J. D. (1976). The use of U-statistics for testing normality against non-symmetric alternatives. Biometrika 63, 143-147. [53] Locke, C. and Spurrier, J. D. (1977). The use of U-statistics for testing normality against alternatives with both tails heavy or both tails light. Biometrika 64, 638-640. [54] Malkovich, J. F. (1971). Tests for multivariate normality. Ph.D. thesis, University of California, Los Angeles. [55] Malkovich, J. F. and Afifi, A. A. (1973). On tests for multivariate normality. J. Am. Statist. Assoc. 68, 176-179. [56] Mardia, K. V. (1970). Measures of multivariate skewness and kurtosis with applications. Biometrika, 57, 519-530. [57] Mardia, K. V. (1974). Applications of some measures of multivariate skewness and kurtosis for testing normality and robustness studies. Sankhyd, A 36, 115-128. [58] Mardia, K. V. (1975). Assessment of multinormality and the robustness of HoteUing's T 2 test. J. Roy. Statist. Soc. C 24, 163-171. [59] Mardia, K. V. and Zemroch, P. J. (1975). Algorithm AS84. Measures of multivariate skewness and kurtosis. J. Roy. Statist. Soe. C 24, 262-265. [60] Mulholland, H. P. (1965). On the degree of smoothness and on singularities in distributions of statistical functions. Proc. Camb. Phil. Soc. 61, 721-739. [61] Mulholland, H. P. (1970). On singularities of sampling distributions, in particular for ratios of quadratic forms. Biometrika 57, 155-174. [62] Mulhotland, H. P. (1977). On the null distribution of x/b1 for samples of size at most 25, with tables. Biometrika 64, 401-409. [63] Pearson, E. S. (1930). A further development of tests for normality. Biometrika 22, 239. [64] Pearson, E. S., D'Agostino, R. B. and Bowman, K. O. (1977). Tests for departure from normality: Comparison of powers. Biometrika 64, 231-246. [65] Pearson, E. S. and Hartley, H. O. (1972). Biometrika Tables for Statisticians, Vols. 1 and 2. Cambridge University Press. [66] Pettitt, A. N. (1977). A Cramer-von Mises type goodness-of-fit statistic related to x/b1 and b 2. J. Roy. Statist. Soc. B, 39, 364-370. [67] Prescott, P. (1976). Comparison of tests for normality using stylized sensitivity surfaces. Biometrika 63, 285-289. [68] Prescott, P. (1976). On a test for normality based on sample entropy. J. Roy. Statist. Soc. B 38, 254-256. [69] Purl, M. L. and Rao, C. R. (1976). Augmenting Shapiro-Wilk Test for Normality. Contributions to Applied Statistics, Birkhauser (Crrossohaus), Berlin, 129-139. [70] Sarhan, A. E. and Greenberg, B. G. (1956). 
Estimation of location and scale parameters by order statistics from singly and doubly censored samples, Part I. Ann. Math. Statist. 27, 427-451.

320 [71] [72] [73] [74] [75] [76] [77] [78] [79] [80] [81] [82] [83] [84] [85] [86]

K. V. Mardia

SarkadS, K. (1975). The consistency of the Shapiro-Francia test. Biometrika 62, 445-450. Schafer, R. E., Finkelstein, J. M. and Collins, J. (1972). On a goodness of fit test for the exponential distribution with mean unknown. Biometrika 59, 222-223. Shapiro, S. S. and Francia, R. S. (1972). An approximate analysis of variance test for normality. J. Am. Statist. Assoc. 67, 215-216. Shapiro, S. S. and Wilk, M. B. (1965). An analysis of variance test for normality (complete samples). Biometrika 52, 591-611. Shapiro, S. S., Wilk, M. B. and Chen, H. J. (1968). A comparative study of various tests for normality. J. Amer. Statist. Assoc. 63, 1343-1372. Shenton, L. R. and Bowman, K. O. (1977). A bivariate model for the distribution of ~ b I and b 2. J. Amer. Statist. Assoc. 72, 206-211. Spiegellialter, D. J. (1977). A test for normality against symmetric alternatives. Bioo metrika 64, 415-418. Stephens, M. A. (1974). EDF statistics for goodness of fit and some comparisons. J. Amer. Statist. Assoc. 69, 730-737. Tiku, M. L. (1967). Estimating the mean and standard deviation from censored normal data. Biometrika 54, 155-165. Tiku, M. L. (1973). Testing group effects from type II censored normal samples in experimental design. Biometrics 29, 25-33. Tiku, M. L. (1974). A new statistic for testing for normality. Corn~ Statist. 3, 223-232. Vasicek, O. (1976). A test for normality based on sample entropy. J. Roy. Statist. Soc. B 38, 54-59. Wagle, B. (1968). Multivariate beta distribution and a test for multivariate normality. J. Roy. Statist. Soc. B 30, 511-516. Watson, G. S. (1957). The X2 goodness-of-fit test for normal distributions. Biometrika 44, 336-348. Weisberg, S. and Bingham, C. (1975). An approximate analysis of variance test for non-normality suitable for machine calculation. Technometrics 17, 133-134. Wishart, J. (1930). The derivation of certain high order sampling product moments from a normal population. Biometrika 22, 224-238.

P. R. Krishnaiah, ed., Handbook of Statistics, VoL 1 North-Holland Publishing Company (1980) 321-341

[
I V

Transformations to Normality*
G. K a s k e y B. K o l m a n P. R. Krishnaiah L. Steinberg

Introduction

Many parametric tests of hypotheses are based upon the assumption that the distribution underlying the data is univariate or multivariate normal. But there are several situations where the above assumptions are not valid and the tests are sensitive to departures from normality. In these situations, alternative parametric procedures may be proposed but the distribution problems associated with these procedures may be complicated. An alternative procedure is to transform the data such that the distribution underlying the transformed data is univariate or multivariate normal and draw inferences about the original parameters using the transformed model. Andrews (1971), Atkinson (1973), Box and Cox (1964), Draper and Cox (1969), Fraser (1967), Rao (1960), Tukey (1957) and other workers considered the problems associated with transformations to normality. In this paper, we consider the problems of transforming Pearson type distributions to normality by using an approach completely different from the approaches used by the above authors. In Section 2 of this paper, we give a brief description of Pearson type curves. For a detailed description of these curves, the reader is referred to Elderton and Johnson (1969). Several complicated distributions have been approximated with Pearson type distributions in the literature (e.g., see Krishnaiah, Lee and Chang (1976), Lee, Chang and Krishnaiah (1977) and Stephens and Solomon (1978)) with reasonable degree of accuracy. In Section 3, we reduce the problem of transforming the family of Pearson type curves to normality to the problem of solving a second order differential equation subject to certain initial conditions. A discussion of the evaluation of the initial conditions is given in Section 4. In Section 5, *The work in this paper was done under a contract with the Bureau of Ships when the authors were at the Remington Rand Univac, Philadelphia. Inquiries regarding this paper should be addressed to P. R. Krishnaiah. 321

322

G. Kaskey et aL

we discuss the evaluation of the second order differential equation using the Runge-Kutt a method. Finally, we illustrate the usefulness of the technique of transformation to normality by using some data on transistors. The material in the following sections is a condensed version of the material in a technical report by Kaskey, Kolman, Krishnaiah and Steinberg (1961).

2.

Pearson type curves

A wide class of unimodal density functions which includes many common distributions can be characterized as solutions of the differential equation

(Co+Clx+Czx2) d f ( x ) - - ( x + a ) f ( x )
dx

c~<x </3,

(2.1)

where f(x) is the probability density function. Any density function f(x) which satisfies (2.1) for certain values of C0, C l, C2, a, and o~ and B is said to be of Pearson type. For example, the well known gamma distribution and beta distribution belong to the family of Pearson type distributions. If we multiply equation (2.1) by x m, and integrate from a to B, we obtain B

f (CoXm+ClXm+I+Czxm+2) df(x) a dx # = f (X m +l _{_a x r e ) f ( X ) dx,

(2.2)

provided the moments/z~,_ 1,/t~, and/z~,+1 exist. Integrating equation (2.2) by parts leads to the formula
f(x)(Coxm ~ ~ ~.m+2~lB__ ..~ C l X m + 11= t ~2 ]let -

=mColam l + [ ( m + l ) C i + a ] l ~ + [ ( m + 2 ) C 2 + l ] ~ + v
(2.3) Hence, the vanishing of the quantity f(x)(Co+ Clx+ C2 x2) at the upper and lower limits would yield the linear relation

rnCol.tm_l+ [ ( m + 1 ) C l + a ]/L~+ [(m +2)C2+ l]/m+l = 0 (2.4)

Transformations to normality

323

between three consecutive moments of the distribution. Equation (2.1) will subsequently be shown to lead to solutions which actually possess this property. If the first four moments of the distribution are known, equation (2.4) gives rise to a system of linear equations which can be used to determine the values of C o, C l, C 2, and a. Substituting m = 0, 1,2, 3 into equation (2.4) yields the following system: C 1+ 2/~'l C
! 2 4 - a m_ - - / ' 1

C O+ 2~'1C 1+ 3~;C 2 +/L]a = -/~; 2/~'1Co + 3/z~C 1+ 4#3 C 2 +/z2a - -/~3 3/22Co + 4/23C 1+ 5/~4C2 +/~3 a - _ / ~ .
# ! t _ t _ t

(2.5)

In the cases in which the determinant of the coefficients approaches zero, it seems feasible to overcome it by a slight perturbation of the moments. If the Pearson parameters f o r f ( x ) are C 0, C l, C 2 and a, it follows readily that for those for Q ( x ) = f ( ( x - I ~ ) / o ) are

Co = o2Co-- C 1~-1- C 2tl 2,


C ; = o C 1 - 2 ~ C 2,

C~ = C 2,
a' = oa -

(2.6) Ix.

This can be seen as follows:

,j,(x
Q'(x)= Q(x)
=

/=

j(~_7_~)

~o+ ~, (~__~) + ~( _x_7_~ 1~


x + (oa - ~)
(2.'1)

Czx 2 + ( o C 1- 2/~C2)x + ( o 2 C 0 - C,/~0 + C 2/~2)

Thus the effect of a linear transformation upon the r a n d o m variable is easily calculated. Since most well-behaved distributions are quite amply described by their first four moments, and since any Pearson curve is uniquely determined by these moments, the following procedure can be employed to transform an empirical distribution into a Pearson-type curve.

324

G. Kaskey et aL

(1) Evaluate the raw moments N where N/~ = E~=l x k and x l ..... xu are the observed values. If the N values have been broken down into M class intervals of size 2H, the relative frequency in the ith interval being Yi, the moments are defined by
M

Iz'k= ~, [Xo+(i-1)H]kyi
i~l

where X 0 is the midpoint of the first class interval. These moments are then substituted into the linear system given by equation (2.5) and the values of C 0, C l, C 2 and a are then determined by matrix methods. This routine will henceforth be referred to as T - 1. Integration of equation (2.1) for various values of the parameters reveals that, from a computational standpoint, the solutions can be broken down into six distinct classes. The classes and their properties are presented in Fig. 1. The remainder of this section refers specifically to this figure. The class numbers are essentially those given by Pearson. It was found that, from a computer standpoint, Class 2 could be incorporated with one of the others. The seemingly haphazard arrangement of these numbers results from an attempt to cross-categorize the cases with reference to the quadratic Co+ Clx+ C2x 2. Cases 1, 6, 5 and 4 are those for which C2~0. Cases 1 and 6 arise from two distinct real roots, case 5 from two equal roots, and case 7, the normal, when C 2-- C~ = 0 and C o < 0. Note that the distributions with their appended constraints represent exactly those solutions of Pearson's equation for which the first four moments are finite. The function F(x) is the standard gamma function defined by the relations I'(x)= ~
0

tx-%-tdt

(2.8)

for x > 0 and F ( x + 1)= xF(x) for other x. The function fl(m, n) is the beta-function defined by
fl(m,n)=
or

tm-l(1

--

t)n-ldt

o r(m)r(n) r(m+n) "

(2.9)

fl(m,n)=

II

tt0 0

II
I +

Z II II II

II

II

II

II
O

II

10

g
IT FT
~A xcv

,,
AV

II

V <1

T
V

T
V
~le,I

T
A
I O
to

T
A
O O ~0

V
[o

~'~
n~

o
A

o
V 8 V

~
V/ ::1.

V
g I

g
~ ~

V ::L

V
8 I

" V g I

g V
k

g V

V g I

V g I

"O v

+ +

8
+

+ +
I
+

~-~ + +
1
+

+ .~. I

+
r~

U
.q

~-.,~

:::1.

~
I

~
I

5
I

:10
I

,:10
eD

I
.,<

I
,.<

r.)~

325

326

G. Kaskey et aL

A p r o g r a m is described which accepts the parameters a, C o, C 1, a n d 6"2, as input, and generates the class n u m b e r a n d the quantities )t,/~, o, a a n d e. If any of these quantities does not appear in the equation, the p r o g r a m assigns the value zero to it. The p r o g r a m evaluates the g a n n n a function in this m a n n e r : If x < 5, the recursion relation F(x + 1) = xI'(x) is used until an a r g u m e n t between 1 and 2 is obtained. F o r example, F(3.5)=(2.5)(1.5)F(1.5). I n this range the function is evaluated by the approximation F(x + 1 ) ~ 1 + alx + a2x2 -t- a3x3 + a4X4 + asx 5, where a I = - 0.5748646 a 2 = 0.9512363 a 3 = - 0.6998588 a 4 = 0.4245549 a s = - 1.1010678. The m a x i m u m error can be shown to be below 10 4. If x ) 5 , the function is evaluated by Stirling's approximation.

(2.1o)

logr(x +
+
1

1) ~1og27r - x + (x + ) l o g x
1 ---~ 360x 2 1

12x

1260x 3 '

(2.11)

The beta function is obtained f r o m the relationship /3(m, n) = r ( m ) r ( n ) r ( m + n) " The integral f~/2 (cost)-2(~+~)cosh(et)dt is evaluated with the aid of Simpson's rule a n d a decomposition of the interval (0,~r/2) into 200 parts. This yields an error of the order of 10 -3 . This p r o g r a m will be referred to as T - 2 . It is of importance to realize that the closeness of two distribution functions, fl and f2, is not d e p e n d e n t u p o n their having the same class number, but rather u p o n some appropriate metric such as oo 2 ~1/2

-oo

dt)

1/2

Transformations to normality

327

Two distributions whose equations may appear in form to be dissimilar may, when graphed, be very close together and yield almost identical normality transformations. A useful subroutine is one which accepts a, C 0, C 1, and C2 as input together with the quantities X0, h, and H n and yields the points on the resulting distribution function corresponding to the abscissas X o+ kh for k = 0 , 1,2 ..... [ X n IX n ' (2.12)

where [x] denotes the greatest integer less than or equal to x.

3.

Transformation of Pearson-type distributions

Let f(x) ( i = 1,2), be a density function such that it is positive in the interval (ai,fli) and zero outside this interval and let F,.(x) be the associated distribution function. Then, the function Z'--'I'(x) defined by
x Z

ff,(0dt= fA(0dt
t~l ~2

(3.1)

yields a continuous transformation of (al,fll) onto (a2,f12). Differentiating equation (3.1) with respect to x yields the result Z f'(x)=f2(Z) d dx" (3.2)

Hence, if the density functions are known instead of the cumulative distribution function, Z = ' I ' ( x ) can be obtained as the solution to the preceding differential equation subject to the initial condition Z(xo)= Zo, where
Xo O~1 Z0 1~2

f fl(t)dt-- f f2(t)dt.
Some feasible sets of initial conditions are these:
(a) Z(al) = o~2,

(b) (c)

Z(ill) -- fi2, Z(ml) = m2,

where m 1 and m 2 are the medians of fl and f2.

328

G. Kaskey et aL

Case (c) is always applicable, while (1) or (2) may become useless for numerical computation when either (al,/3t) or (o~2,fl2)is infinite. Hence, in what follows case (c) is always used. If fl(x ) and fz(Z ) are two Pearson-type distributions, the transformation Z = Z(x) which maps the cumulative probability function defined by the first into that defined by the second is given by the integral equation
fl(t)dt= f2(t)dt

-- 00

-- 00

(3.3)

Differentiating equation (3.3) with respect to x, we obtain dZ f~(x) =f2(Z) -dx" (3.4)

Although equation (3.4) can be used to obtain the transformation sought, the complexity of the transcendental functions they contain makes it desirable to seek other methods of solution. It will be shown that the mapping Z---Z(x) can be obtained as the solution to a second-order differential equation which contains only rational functions. From the definition of a Pearson-type curve it is known that for some values of the parameters a, Co, C 1, C2, a*, C~, C'{, C~, the functions fl(x) and f2(Z ) satisfy the differential equations dfl(x )
dx

(x + a ) f l ( x ) C O"l" C I X --I.-C2 x 2 ' ( Z + a * ) f2 C~ + C~ Z + C2Z 2

(3.5) (3.6)

df2(Z ) dZ

Moreover, differentiating equation (3.4) with respect to x yields dfl(x) =f2(Z) d~2 + df2(Z) [ d Z ] 2 dx ~ \-d-~x] " (3.7)

By employing equations (3.4), (3.5), and (3.6), we can eliminate fl and f2 from equation (3.7) and reduce it to the form d'Z+ (Z+a*) ( d Z ~ 2_ dx 2 C~.,I-C~Z-t-C~Z2~dx ] (x + a) dZ (3.8) This second-order equation will yield the desired transformation when

CO + ClX-[.- C2 x 2 d x

Transformations to normality solved with the initial conditions

329

Z(Xm)=Zm,
fl(Xrn) Z (Xm)-- f2(Zm) , ! where the quantities x,n and Z m In the important case f2(Z) =
1 __ e are

(3.9)

the medians of the distributions.

Z2/2,

(3.10)

the Pearson parameters assume the values a* = C~' = C~ = 0 and C~ = - 1. Equation (3.8) then becomes d2Z dx 2
-

Z[ d Z ~2 +
k dx }

(x + a)

dZ dx "

Co..[-Cix.]- C2x2

(3.11)

The initial conditions simplify to Z(Xm) = 0 Z"(Xm)='~f2~ fl(Xm). A program, T - 4 , will be described which evaluates these initial conditions for a given distribution function f l ( x ) . Once these starting values are ascertained, the graph of Z = Z ( x ) can be found by numerical integration. We call this program T - 5. It is worthwhile to note that if both distributions f~(x) and f 2 ( Z ) are normal, the transformation Z = Z ( x ) is linear. If
fl(x) =

l~e-(X

~1)2/2d

and f2(Z) = ~ 1 o2 then substitution into (3.3) yields 1 f e-(t-~)2/2O12dt= - 1 ; e-(t-~2)2/za22 dt. -01 -- oo 02 --oo

e-(Z--~2fl/2~

(3.12)

(3.13)

330

G. Kaskey et al.

Setting r = (t-/~1)/o~ in the left-hand integral, and ~ = ( t - - / ~ ) / o 2 in the right-hand integral, we obtain
( X "- 1~1)/O1 ( Z -- ~ 2 ) / 0 2

--C~

e-'2/2 d~"=

f
--oo

e-~2/2d~-.

(3.14)

Since e - ' 2 / 2 > 0 for all ~-, equation (3.14) implies that Z and x are linearly related through the equation x - t t l -- Z--/x2
O1 O2

(3.15)

Hence, it is clear that if the distribution fl(x) is almost normal, the transformation function Z = Z ( x ) will be almost linear. This fact can be used to test the validity of the assumption that sampled data come from a normal population.
4. Evaluation of initial conditions

The program T - 4 evaluates the median of any Pearson-type density function fl(x). It then employs this median, x m, to obtain the initial conditions

Z(xm) = 0
Z'(xrn) = ~ fl(Xm)
(4.1)

for (3.11), the differential equation defining the transformation of fl(x) to the standard normal form. T o expedite the explanation of the methods used, we introduce Fig. 2. All undefined terms on this figure have the same meaning as in Fig. 1. The following discussion refers to the figure and explains it. Class 7 is absent from the figure since it gives rise to a known linear transformation (see equation (3.15)). The first step in determining x m consists of reducing the various equations for xm to one of the three standard forms:
v

/*(v) = f tA(1- t)Bdt- k=0,


0 v

(4.2) (4.3)
(4.4)

F ( v ) = f tAe-'dt0 v

f*(V)=

f
-7r/2

cos2Ate"'dt-k=O,

+~

&+ ~a

+ + I

+ I

II , p.I II

II I H "x3

%
I t.il oo o

t--

a ,-.,~

> "-= ~a II I II t-. fl tli t-li tI::a @

t,i

I: c "a ~ Jr-,I II ~le,I II il

,'c

i~ =:k W

=~
I ,..,, ,< I ,2, ,.<

i :=l.

-I

,2,
,,< .~ /<

332

G. Kaskey et al.

where k is a constant depending upon the parameters of the distribution. F o r the sake of unity we shall write
V

fi(v)--

f g(t)dt- k,
b

where b and take different values depending upon which of the three forms is under consideration. The variable V is a simple function of x m. The value of Xm can easily be found once the correct value of V has been ascertained (see Fig. 2). The equation i f ( V ) = 0 carl be solved by the N e w t o n - R a p h s o n method. To set the iterative scheme

g(t)

if(V.)
Vn+ l ~" Vn -- f i , ( pn ~ = Vn -

fo v"g(t)dt-k
g( pn ) (4.5)

into operation, it is necessary that a method be found to evaluate


V

,(v)=

f g(t)dt.
b

Direct Maclaufin series expansion and term-by-term integration applied to equation (4.2) yields oo
X=O

( - v)X

(4.6)

The analogous result for equation (4.3) is


I ( V ) = g A +1

~<:o x!CA-+-~-+1 )

(-- V)X

(4.7)

A different method must be applied to equation (4.4). In this case we can write
V

s(v)= f
~r/2

cosZate~tdt =

f (1--sin2t)AeB'dt
-- ~'/2

- ~ / 2 x=o

f ~ (A)(-1)Xsin2XteB'dt

TransJbrrnations to normality
V

333

= ~ { A } ( - I ) ~ f sin2XteS'dt ~=o X -~,/2 = ~


x=o where
V

(A)(_

I)XHx,

(4.8)

f
-7r/2

sin 2~te e' dr.

(4.9)

Applying integration by parts to equation (4.9) yields the recursion relation

1 { l Be-a,,~2 ) Hx=)k2.t_(B)2 X ( ) t - ~ ) H x - i - - ~ sinzx-1V.G x


(4.10) where Gx = leSV[2)tcos V - s i n V] = Gx_ 1+ cos lie BV (4.11)

The starting values for this relation are readily found to be Go-sin Ve s v 4 and H oeB y - e - B,~/2 B (4.12)

Although the convergence of this method is slow it can be used if not too many significant figures are desired. A feasible alternate method in this case would be to use numerical integration to obtain I(V). In a programmed routine it is imperative that V o, the initial approximation to V, be such that the algorithm always iterates to the correct solution. To ensure convergence in the Newton-Raphson method, the following conditions are sufficient. If r is the desired root and V0 the initial approximation, then in the interval ( V0, r) (or (r, V0) if r < V0), (1) f * ' does not change sign; (2) f * " does not change sign; (3) f ' f * " does not change sign; (4) No other root of f * ( V ) = 0 is contained. Tracing through the derivation of equation (4.2) makes it evident that the desired root lies in the interval (0, 1), in which it is, moreover, the only

334 root. For this equation

G. Kaskey et aL

f*'( V)= VA(I-- V)B >O

in(0,1)

and
(4.13)

/*"(V)= VA-'(1 - V ) " - ' [ A - ( A +B) V],

It is clear that there can exist, at most, one Vo in (0,1) such that f*"(Vo) = 0; and that, if such a Vo exists, its value is given b y A V0- A + B" If A + B = 0 ~ A or if A + B (4.14) lies outside the interval (0,1), then f * " ( V )

cannot change sign in that interval. This condition is equivalent to A B <<,O. Letting V-+0 + in equation (4.13) we find that, in (0, 1), f * " has the sign of A if A ~ 0 , and the sign of B if A = 0 . If both A and B are 0, then f * " = 0 . On the basis of these facts the method shown in Fig. 3 has been programmed to obtain a starting value for the N e w t o n - R a p h s o n iteration. This method always ensures convergence. For equation (4.3),

f * ' ( v ) = tAe

',
(4.15)

f * " ( V ) = t A - ' e '(A - t),

and one and only one root is known to lie in the interval (0, ~ ) . In this case we obtain the flow diagram of Fig. 4 for computing an appropriate

Vo.

I~o~
YES (!) ~ NO

L v.=T l
Fig. 3. Method for obtaining starting value for Newton-Raphson iteration cases 1,6.

Transformationsto normality

335

INo
Fig. 4. Method for obtaining starting value for N e w t o n - R a p h s o n iteration cases 3, 5.

F o r the third and last s t a n d a r d form, equation (4.4), we have f * ' ( V ) = cos z~ te a` f * " ( V ) = cos (2A - 1) t exp(Bt)

(B cos t - 2A sin t).

(4.16)

Here the procedure is particularly simple. If A 4=0, we set V0 = tan_ 1 2B A" (4.17)

If A = 0, the correct value of V can be f o u n d in closed form. Integrating equation (4.4) and solving for V yields

V=-~

loge(KB+e -B~/2)

for B=/=0 (4.18) for B = 0

V = K - -2

5.

Integration of transformation equation


T o integrate the t r a n s f o r m a t i o n equation

d2Z (dZ)2 dx 2 -~x +

Co-I-ClX..]-C2 x2 ~ x

(dZ)=F(x,Z,Z')

(5.1)

p r o g r a m T - 5 employes the second-order R u n g e - K u t t a method. The inputs to this p r o g r a m are (a) a, C o, Cl, C2 (b) x., and Z'(x,,,) (c) The range, (L, U), in which the values of Z = Z(x) are desired, and (d) h, the integration interval.

336

G. Kaskey et al.

ZOzz<
1

o
z <

8
0

~5
0
o

71~
II

~
"~

7
[~

r)

<

Transformations to normafiCy

337

The R u n g e - K u t t a method is applicable to any differential equation of the form Z " = F ( x , Z , Z ' ) , subject to the initial condition Z ( x o ) = Zo, Z ' ( x o ) --Z~. The values of Z, = Z ( x o + nh) are computed by recursively evaluating the sequence of formulae
x,, = x n . l + h, k 1 = h f ( x n , Z n, Z~),
1
1 1 t

1
1

t
I

1
I

~ = hF(xo + h, Zn + hZ~ + ~ h~, Z~ + ~3), a z = h( G +

(5.2)

~(~, + ~2 + ~3)),

A Z ' = ~(k I + 2k 2 + 2 k 3 + k4) ,


Zn+l= Zn-I-mz,

Z'+I=Z/,+AZ'.

In our case, we set x 0 = x m. To simplify the plotting of the output, a slightly modified procedure has been adopted. The values L, h, and U should be chosen so that U - L = nh, that is, an integral multiple of h. Suppose U* is the smallest value of the sequence [L + nh], which is greater than or equal to x m. Using this value, we decompose the integration procedure into three phases. (a) We set h,, = U * - x m, and use this increment once in formula (5.2) to obtain Z ( U * ) and Z ' ( U * ) . (b) With Z(U*) and Z ' ( U * ) as initial conditions, and the original value for h, the values of Z corresponding to the values U* < x < U are found. (c) With Z(U*) a n d Z ' ( U * ) as initial conditions and - h as the integration increment, the values of Z corresponding to L < x < U * are determined. An orthogonal polynomial routine can be used to find the best polynomial fit to the function Z - - Z ( x ) . Here the term "best" refers to the X2 criterion. We refer to this as T - 6 . Fig. 5 gives a s u m m a r y of various calculations to be m a d e for transformation to normality.

6.

Statistical analysis of transistor data

In this section, we illustrate the method of transformation to normality by using the data on Beta parameter of a batch of transistors. The histogram depicting the distribution of beta at zero hours is shown in Fig. 6. The data is transformed to normality by using the following steps.

338
0.20

G. Kaskey et al.

>- 0.16 z u.I 0 0.12 tal kl_ 0.08 I..J 0.04

_J
, ~fS_&,
4
6 8 B ETA

,
I0

,I,
12

14

17

16

F i g , 6.

Frequency distribution for parameter Beta.

Step L W e c o m p u t e the raw m o m e n t s #;, by using the formula


/x~- 1

-~

Y xi%
i=1

where Yi denotes the frequency in the class interval with m i d p o i n t x i and N denotes the sample size. In the case of our data, we f o u n d that /~'~= 10.988333, /~2' - 125.170917, /~3 = 1,466.936433 and /x4-' - 17,588.236029. The constants C o, C~, C 2 and a are then found by using the equation

Cl =_

2~'i

3~;

f,

3~; "::II 4.; ' 5.~ '

~;j

~;/ .~j

(6.1)

The values of the constants for our data are C o = 7 . 6 3 9 8 2 5 , C 1 = 4 . 0 7 8 8 1 3 , C 2 = 0.244372 and a = 12.279991. Step II. The exact f o r m of the distribution is found f r o m the input, C o, C 1, C2,a, as follows: (a) Since Cz~O, the class n u m b e r , F, equals 1, 4, 5, or 6. (b) The constants/~, o, a and e are calculated as follows:

2c~ co c~
C1

Transformations to normality A=b2-c

339

/~= - b - x/A = 2.149995 o = - b + C A = 14.541041 a = - ( p t + a) -3.345421 e = 2C2V-----~ = 0.746709 (c) Since a > - 1 and e > - 1 for the beta distribution, F = 1, and the density function is f (x) = X(x - i~)~(o - x)q (d) The program finds that )t = (4A)-(~+"+ 1)/2 = 0.441412 X 10 -4, 3 ( a + 1,e+ 1) where 3(m, n) is the beta function. Step IIL On the basis of the inputs F, )t,/~, o, a, e, and L=2.5 U = 14.5 H=0.5 the function f ( x ) is evaluated at intervals of 0.5 from x = 2 . 5 to x = 14.5. This approximation to the histogram of Fig. 6 is seen superimposed upon it. Step IV. Using the inputs, F, ~, I~, o, and e, the program ascertained that
x,,, =

2c2vzx (o+a)

median o f f ( x ) = 1 1 . 2 9 0 9 9 2

Z'(Xm) = V ~ f ( X m ) = 0 . 4 3 7 6 1 8 where Z ( x ) is the transformation to the standard normal. This implies that Z(Xm) equals zero. Step V. By using the inputs x m, Z'(xm), a, C o, C 1, C 2, L, U, and H, this program determines the mapping function Z ( x ) in the interval [L, UI at points spaced H units apart. This is done by numerically integrating the second order equation

d2Z=zCdZ~2
dx 2 tax/

(x+a) (dZ) + ( Co + Clx + C2x 2) -~x

(6.2)

by the second order R u n g a - K u t t a method.

340

G. K a s k e y et aL

Ld I-n."

> _1 75 n.o z -2
-

-3

-4

/
4 6 8 I0 12 X (ORIGINAL VARIATE ) 14. 16

-5

Fig. 7. Transformation curve for parameter Beta.

The value Z(xm) equals zero and the value of Z'(Xm) calculated in Step IV are used as initial conditions. The resulting transformation curve for the beta parameter is shown in Fig. 7. Step VL An analytic fit to the transformation curve of Fig. 7 is obtained by the method of orthogonal polynomials. The 25 points generated b y Step V are used as input and lead to the polynomial approximation Z = -3.90162224+0.215246621v +0.00125276413v 2 where v = (x - 2.5)/0.50. References
Andrews, D. F. (1971). A note on the selection of data transformations. Biometrika 27 825-840. Andrews, D. F., Gnanadesik;an, R. and Warner, J. L. (1971). Transformations of multivariate data. Biometrics 27, 825-840. Atkinson, A. C. (1973). Testing transformations to normality. J. Roy. Statist. Soc. Ser. B 35, 473 -479. Box, G. E. P. and Cox, D. R. (1964). An analysis of transformations. J. Roy. Statist. Soc. Ser. B 26, 211-252.

(6.3)

Transformations to normality

341

Draper, N. R. and Cox, D. R. (1969). On distributions and their transformations to normality. J. Royal Statist. Soc. Ser. B 31, 472-476. Elderton, W. P. and Johnson, N. L. (1969). Systems of Frequency Curves. Cambridge University Press. Fraser, D. A. S. (1967). Data transformations and the linear model. Ann. Math. Statist. 38, 1456-1465. Kaskey, G., Kolman, B., Krishnaiah, P. R. and Steinberg, L. (1961). Statistical techniques in transistor evaluation: transformation to normality. Technical Report, Applied Mathematics Department, Remington Rand Univac, Philadelphia. Krishnaiah, P. R., Lee, J. C. and Chang, T. C. (1976). The distributions of the likelihood ratio statistics for tests of certain covarianee structures of complex multivariate normal populations. Biometrika 63, 543-549. Lee, J. C., Chang, T. C. and Krishnaiah, P. R. (1977). Approximations to hhe likelihood ratio statistics for testing certain structures on the covariance matrices of real multivariate normal populations. In: P. R. Krishnaiah, ed., Multivariate Analysis-IV. North-Holland, Amsterdam. Rao, M. M. (1960). Some asymptotic results on transformations in the analysis of variance. ARL 60-126, Wright-Patterson AFB, Ohio. Solomon, H. and Stephens, M. A. (1978). Approximations to density functions using Pearson curves. J. Amer. Statist. Assoc. 73, 153-160. Tukey, J. W. (1957). On the comparative anatomy of transformations. Ann. Math. Statist. 28, 602-632.

P. R. Krishnaiah, ed., Handbookof Statistics, VoL 1 North-Holland Publishing Company (1980) 343-387

| |
1 1

A N O V A and M A N O V A :
Models for Categorical Data

Vasant P. Bhapkar

1.

Introduction and notation

Consider a random sample of size nj from t h e j - t h population, j = 1..... s. Suppose that each observation (or unit) in the j - t h sample is assigned to one of i) categories on the basis of some characteristic of the observed unit. Let n~,j be the number of units in the j-th sample that are assigned to the i-th category, i = 1..... 1),j= 1. . . . . s. Then
9

E ni,j=nj, j = l , . . . , s .
i=l

Data consisting of such counts {n~j} are referred to as categorical data. We assume that ~ri, j is the probability that a random observation from the j-th population would be assigned to the i-th category. Thus,
rj

7rij>/O,

E qri,j=l,
i=1

j=l

. . . . . s.

Let N/j denote the random count in the i-th category for the j-th population. Then the random vector Nj. = [NIj ..... Nsj]' has multinomial distribution
----nj
h i , j , . . . , n,),!

"qTn9)

If the samples are obtained independently, the vectors Nj, j = 1,...,s are independently distributed. The random vector N = IN'1..... N;]' has, then, the product-multinomial distribution

PIN=n] = fi
j=l

nj
nl,j,...,ngj "=

'Fi,~J

(1.1)

343

344

Vasant P. Bhapkar

The index i will be said to refer to the 'response' category (or level), w h i l e j is said to refer to the 'factor' category (or level). If we are sampling from only one population, s = 1 and, then, the subscript j would be deleted. When the number of response categories is the same for all populations, i.e., i)= r , j = 1. . . . . s, then the counts (ni,j} in the cells ( i , j ) formed by the cross-classification are usually referred to as a c o n t i n g e n c y table. In the general case, the subscripts i a n d / o r j could be multiple subscripts. Suppose i = (i I. . . . . ik), ia = 1. . . . . t),a, while j = (Jl . . . . ,.it), Jb = 1 . . . . . Sb. That is, the multiple response i comprises k (sub) responses, which are measurements (or observations) on k different characteristics of the same sample unit (or, possibly, observations of the same characteristic on k different occasions on the same unit). Then i)= Ilarj, a. Similarly, the index j describing the population m a y correspond to the combination of l levels Jb of the b-th (sub) factor; however, it is to be understood that all possible combinations might not be selected for the sampling investigation and, in that sense, the lay-out could be incomplete. Thus, for a complete lay-out, s = Ilbs b. In the particular case when the cells arise from cross-classification, we have rj, a = r a, and = r = I I j ~ for all j ; the array of counts {ni ...... i~,j..... ,j,} is then referred to as a ( k + l ) - d i m e n s i o n a l contingency table arising from k responses and 1 factors. W e adopt o in place of a subscript as a summation symbol over that subscript. Thus,
no, i=. . . . .
i,,,j~, .. ",Jl = ~-" i l n i l . . . . . ik,j l,...,Jr'

no ..... o,jl,....Jt

= ~a~ianil

..... ik,jl,...,Jt

= nJl,...,Jd

nil ..... ik,o,J2, ... Ji = ~ j l n i l

..... ikdl,...,Jl'

and so on. A similar notation will be used for the corresponding probabilities 7ri ...... ,.,j ..... j . It should be noted that To, ...,odl,-..,Jr = l, and that all probability sums with zeroes occurring only at some or all of the response subscripts indeed represent probabilities of some events; on the other hand, the number represented by such a ,r-term with zeroes occurring at some factor-subscript (regardless of whether they also occur at response-subscripts) has no probability interpretation. F o r the development of the basic methodology, it is convenient to revert to the earlier condensed notation ( i , j ) for the overall response-factor categories for the general case with 1) response-categories for the j - t h population.

A N O V A and M A N O V A

345

Let then ~rj-[ ~,,j


..... %j,

~ ' = [~,' .... ,~,1 '

R= ~rj,
j=l

n= ~'~nj
j=l

(1.2)

ej = ~ Nj,

P' = [Pi ..... P;].

Note that Pj is the ~)-dimensional random vector of proportions of counts within thej-th sample, while P is the R-dimensional random vector of such proportions for the s separate samples. For the product-multinomial probability law (1.1), we have E(P) ='a', l~(t)-- Cov(P) = Diagonal( ~ [ Aj(~rj) -- ~rjcrj], j = 1..... s) where (1.3)

Aj(~rj) = Diagonal(%,j,

i = 1..... rj).

If all ~rij>0, i = 1 ..... % then the j-th diagonal block is of rank I ) - 1 . Thus, the covariance matrix of P is of rank R - s , provided all ~5,j are non-zero. This rank is reduced by one for each of the zero probabilities %d" For the large-sample theory discussed in Sections 3-5, the following results are fundamental: (i) (ii) Pj,5 --->~rj, P,~'ct,
a.s.

asnj--->oo
(1.4)

as nj---~oe, j = 1..... s.

(iii) Under the assumption:

A:nj/n---~Xj,
for j = 1..... s,

where 0<Xj < 1

- ~) --+ N R ( 0 ,*(~r)),

(1.5)

where *(t) = Diagonal(Xj- 1[ Aj (rj)- ~rflrj],j=

1..... s).

Here we have introduced nj, n as additional subscripts to denote the sample sizes on which the proportions are based in order to study the limiting behaviour. (i) is the consequence of the strong law of large numbers and (ii) follows from (i); moreover, (ii) implies
Pn --~ ,rr.
P

346

Vasant P. Bhapkar

In (iii) N R denotes the R-variate normal distribution, and (1.5) is the consequence of the central limit theorem. As described above, in this chapter we confine ourselves to the relatively simpler single-stage sampling context. The problems of statistical inference with more general sequential or multi-stage samples or of the optimal strategies for such multi-stage sampling will not be discussed. Similarly, for the most part, the discussion will be within the c l a s s i c a l (frequency) framework of statistical inference, although a brief discussion of work within the Bayesian framework will be presented in Section 12. For discussing the various techniques that are available to tackle the problems of statistical inference concerning ~r, or related parameters, we begin with the log-linear representation that is found quite useful in the analysis of contingency tables. 2. Log-linear representation Consider first a one-dimensional array (%) of probabilities. Let
Oi=log%, i = l . . . . . r.

Writing I * = X i O i / r and cti= 0 i - I*, we have log~i =/~+ o~ i, (2.1)

where X i a i =0. Thus, the r ~-parameters, only r - 1 of which are independent, are expressed in terms of r + 1 new parameters in the log-linear representation (2.1). However, the number of independent new parameters remains r - 1 since ~ i % = 0 and /t is determined from a's in view of the constraint
1 = Ziggi
=

eUXie~.

Similarly, for a two-dimensional table arising from one population (i.e., k = 2, s = 1 in the notation of Section 1), let
Oi,,i2=log~ri.i2 , 1
~L = - - -

ia = 1 . . . . . ra,

a=

1,2,

rlr 2

~-~ilXi20i,,i2 ,

0t(2) !2 = Z

EilOi,,i 2 --

/t,
tl '2 "

, a!ll'i~ ) = O i l , i 2 -

~'~-

R ('1) __ 0L(.2)

A N O V A and M A N O V A

347

Then we get the log-linear representation


logqri~.i 2 = ~ + or(l) + O~(2) + ~ ~(1,2) it i2 il,i 2

(2.2)

where the parameters on the right hand side satisfy the constraints

iz

i2

ol}l,i 2) = ~
ii i2

OLil,i 2(1,2= ) 0,

(2.3)
e I1 12 il,i 2 " ~,,,+~(2,+,(lm

1= E E ~ril,i = e ~ E E
il is il i2

The number of independent new parameters is ( q - 1) from oq(l)'s, (r 2 - 1) from a}~)'s and ( q - 1)(r 2 - 1) from ag"Z)'s,,,,2 in view of constraints (2.3); these numbers add up to q r 2 - 1, which is precisely the number of independent
1~i1, i2 'S"

Proceeding in this manner, for a k-dimensional contingency table from one population we get the log-linear representation
k

log%, ,...~ ik-----/z+ E

a= 1

Og! a)'{- E E OL~al'a2)'~" " " " -]- Ol~''"'i k)" a lal,la 2 ,''', al=/=a2

(2.4)

Moreover, the new parameters satisfy the constraints


a l~ ) = 0 a io

E oL{al"a2): E 0L('al"a2) : 0 lal~ ta2 . lal ~la 2 iaI la2

(2.5)

~(1 ..... k) ~ i ...... ik


ia

=0,

a = 1..... k,

and

oi ......
a i~ i

exp(

al=/=a2

In analogy with the classical linear models in ANOVA, the a a) are called ,a the main-effects of response a, a (a''a9 terms are called the effects of interaction o f the f i r s t order between two responses al, a2, and so on, and finally a (1..... k) terms are called the effects of the ( k - 1)-st order interaction

348

Vasanl P. Bhapkar

among k responses. The number of independent new parameters is

k Z (r,-1)+~Z(G--l)(ra2-1)+-..
a=1 a l q"a2

+(r,-l)---(r/,-1

= { ( r l - 1 ) + l } { ( r 2 - 1 ) + 1 } . . . { ( r ~ - 1)+ 1}-- 1
= qr 2. r~-

1;

this is precisely the number of independent ,r-terms. Thus, for atl r: i . . . . . . ik > 0, we can get the log-linear representation involving the same number of independent parameters; there is a 1 - 1 correspondence in the sense that not only ~r's are expressible in terms of a's but, conversely, the a's are expressible in terms of ~r's. Along similar lines we can obtain the log-linear representation for a (k + /)-dimensional contingency table arising from k responses and l factors. In addition to the a-terms in (2.4) coming from responses (1 ..... k ) , we have similar fl-terms arising from factors (1, .... l) and mixed 2/-terms arising from response-factor combinations. Thus,

log , . . . . . .

.....

a=l
_[_(EEa( al ~a2

b=l

E BJf
bl ~b2 ~,v(.a',,b'~]

. . . . 2) ..[- E E ~)blb,~bb22) .31,o:,o=

+...

+v(.l ..... .k;.1..... () IlD...~lkJD,,.Jl

(2.6)

The parameters a,fl,3,'s satisfy relations of the type (2.5) except that, in (2.5), instead of just one relation coming from the basic d e s i g n - c o n s t r a i n t here we have such relations coming from all the basic design-constraints, viz. / \
...... ,.j ..... "'" }

i~

bl ~b2
7~t,

~ffa /~ exp(~a a ~X ' ~+" al'c'a d..~'~ ~5~a2 a(al"az)t~l, ta2 "[- ~a L~

y~aj~,+... ), (2.7)

for all combinations (Jl .... ,Jl) of levels of l factors. Thus, # and/3-parameters are completely determined by the a and -/parameters. The number of independent parameters on the right hand side of the log-linear representation (2.6) is, thus,
(r,r2. . . rks:2. . . St --

1) -

(sis2.

. . st -

1) =

( qr2.

. . rk -

1)S1S2" " " Sl;

(2.8)

ANOVA and MANOVA

349

this is precisely the number of independent probabilities ~r i ...... ;k,J......J," Representations of type (2.4) or (2.6), without side-constraints of type (2.5), are sometimes termed super-saturated in the sense that there are more new parameters on the right hand side than the number of independent parameters on the left hand side. Such a representation is always possible and it is not unique. However, with the side constraints of type (2.5), the number of independent new parameters is exactly equal to that for the earlier set. Such representations are called saturated; such a representation also is always possible and, moreover, it is now unique. It is only when the number of new independent parameters is less than that for the original set (i.e. the unsaturated case) that such a representation is actually a model in tile sense that it may or might not apply for the given ~r.

3.

Methods of estimation

Consider first the problem of estimation of unknown modelparameters 0 in the case where the model specifies the design probabilities ~r as given functions of 0, say ~ri4= 7ri,j(O), i = 1..... rj, j = 1..... s, (3.1)

where 0 ' = [01..... On]. Let O be the parametric set o f possible values 0 for which the description (3.1) applies for some ~" satisfying the basic constraints ~rid > 0, Z,.~ri,j= 1,j = 1..... s. Some of the well-known methods of estimation are the following: (i) Maximum Likelihood Estimation. Let L(0; n) be the likdihood function of O, given the data n, when the expression (3.1) is substituted for ,rr in the probability law (1.1). A maximum likelihood estimate (m.l.e.) 0 is defined by the property L(O; n) = sup L(O; n).
OEO

(3.2) Let the function X2(O; n) of O,

(ii) Minimum Chi-Square Estimation. given n, be defined as

x:(0;n)= j= 1 i=l E

[ni,j_njqri,j(O)] 2

nj~r;,j(O)

'

(3.3)

where it is assumed that ~r~,j.(0)>0. Then a minimum chi-square estimate (m.c.e.) O* is defined by the property X2(O*; n) = inf X2(O; n).
0~o

(3.4)

350

Vasant P. Bhapkar

(iii) Minimum Modified Chi-Square Estimation. y2(0; n) of 0, given n, be defined by

Let

the function

Y2(0;n)=
j=l

E
i=l

r)

[ ni,j -" 12jWi,j(O) ]2 ni,j

(3.5)

where it is assumed that n<i >0. Then a minimum modified chi-square estimate (m.m.c.e.) 0 is defined by

y2(O; n) = inf Y2(O;n).

(3.6)

(iv) Minimum Discrimination Information Estimation. Let the discrimination information function I(0; n) of O, given n, be defined as l(0;n)= ~ rj ~] n i o t g -a,ni'~"
I~J'~i,jtv]

(3.7)

j = 1 i= 1

Then a minimum discrimination information estimate (m.d.i.e.) {J is defined by

l(O;n)= inf l(O;n);


O~O

we note that this turns out to be the same as m.l.e. An alternate m.d.i.e, is obtained by considering

r,
I*(0;n)= ~
j=l

nFi,j(O)
ni,j

~] nFi,j(O)log
i~l

(3.8)

and defining m.d.i.e. 0 by I*(d; n) = inf I*(0; n).


OE

(3.9)

In order that such estimates exist, one of the regularity conditions that is usually needed is that %o(0)> 0 for all i,j, unless ri, j is a-priori required to be zero under the model. Even if %,j(0)> 0, there is a positive probability that ni,j-- 0. In order to define m.m.c.e. (iii) in such a case, a modification is usually suggested in (3.5) that zero hi,j be replaced by some small number like I/2, 1/nj or 1~2hi etc. (see Section 9 for further comments). Under the regularity conditions stated later, the estimates are obtained by the solution of appropriate equations (e.g. likelihood equations, mini-

A N O V A and M A N O V A

351

mum chi-square equations etc.). These estimators, then, have some optimal properties only in the asymptotic sense. Moreover, such estimators are later used in constructing the corresponding test criteria for judging the goodness-of-fit of the model specified by (3.1). Such goodness-of-fit statistics, discussed in the next section, in turn, have optimal properties only in the asymptotic sense. Suppose that the model (3.1) is true, i.e. there exists 0, say 00, in such that ~r= ~r(00). Assume that the functions ~7ij(0) are differentiable and O is an open set. Then the maximum of L (or minimum of X 2, y2 etc.) is attained at 0 (or 0", 0 etc.) satisfying equations OL(O)/OO=O, (or OX2(O)/O0 etc.), or equivalently,

01ogL(0) 00 =0,

(3.10)

usually referred to as likelihood equations, unless the maximum (or minimum of X 2 etc.) occurs on the boundary of O. If 0 o is in O and, thus, is an interior point of O, then with probability approaching one, as n---~c~, under regularity conditions which will be indicated later in this section as C2, can be obtained as a solution of equation (3.10) and similarly for 0", 0 etc. Barring relatively simple situations, the equations like (3.10) usually do not provide direct (i.e. explicit) solutions and have to be solved by iterative techniques (see, e.g. [10], [19]). In the special case where the functions ~rij(0) are linear in 0, however, the technique (iii) leads to solution of linear equations. Hence, if the parameters 0 are independent (see condition (iii) in C2 later) and m <~R - s, a m.m.c.e, can be obtained by matrix-inversion and a direct solution is possible. In the case where 0 = ~r it is seen that the observed value p of random vector P is indeed the m.l.e, as well as the m.c.e., the m.m.c.e, etc. of ~r. However, in the non-trivial case, the model (3.1) is correct only if ~rEII M, where n~= {~l~=~(O),O ~ 0 } (3.11)

= {~rlf(~r ) =0),

(3.12)

where ft(~)= 0, t = 1..... u, are the constraints that ~r has to satisfy in order that (3.1) holds for some 0 in O; these constraints can be obtained, in theory, by eliminating 0 from equations of type (3.1). (3.11) is termed the freedom-equations specification of the model M, while (3.12) is termed the constraint-equations specification (see, e.g., Aitchison and Silvey [1]). With the constraint-equations specification, estimates of ~r, subject to ~r~ HM, can be obtained by techniques (i)-(iv) with the obvious modification in the definitions. For example, if all ni,j > 0 (or replacing zero ni,j's by

352

Vasant P. Bhapkar

small positive number like 1 / 2 or 1/2nj), a m.m.c.e. ,~, of ~, is defined by

Y2(~';n)=

inf k E ~EHMj~I i~l

(nio-nFiJ)2",
ni,j

(3.13)

here we have used the same letter Y for the function in the generic sense. If the freedom equations (3.1) are linear in 0, the constraint equations

are linear in ~r and, hence, a m.m.c.e. ~i is available by solving only linear equations. However, with non-linear f(~r), the following linearization technique, due to N e y m a n [23], enables us to obtain a direct solution by solving only linear equations. Consider the first-order Taylor expansion ft(~r) about p, ignoring the remainder term, say ft*(~r)=ft(p)+f}l)'(p)(~r--p),
i,e

t = 1..... u (3.14) (3.15)

f*(~r) = f(p) + r(p)(~- p), F(p) = Vf~l)'(p)

where

L
Let II~t--{~r]f*(~)=0}. Then a m.m.c.e, using the linearization technique minimizes YZ(~r) subject to 7r ~ II~, regarding p and n fixed; therefore, the minimizing equations turn out to be linear. (v) Weighted Least Squares Estimation. Consider now a model which is more general than (3.1), viz. ft(~r)=xt(0), t = 1..... u, (3.16)

where f and x t are specified functions of ~r and O, respectively. Assume the following regularity conditions on f: (C.1) (i) The functions ft have continuous partial derivatives with respect to %j. Let then

A N O V A and M A N O V A

353

(ii) The functions ft are independent in the sense that, if all ~r~,.i> O, Rank F(~r) = u. (iii) The functions f are independent of the basic constraint functions

Ei~ri,j(~ 1),j--1 ..... s, in the sense that

if all ~r~,j> O; here E is a diagonal block matrix of blocks [1']~ x ~ , j = 1..... s. We then have u ~<R - s. Let
= t3.17)

and

$2(0) = [ f(p*) - x(0) ] ' H - l ( p , ) [ f ( p , ) _ x(0) ],


where
l~i,j + aid

(3.18)

Pi*d= nj + EiagJ

(3.19)

Here aid are small adjustments (see Section 9) that are also useful for ensuring that H(p*) is nonsingular when some p~,j =0. If all p~,j >0, we could take all aid = 0, so that p * = p; some suggestions for choice of a's are given in Section 9. $2(0) is termed the weighted (or generalized) sum of squares (and products) of residuals. is called a weighted least squares estimate (w.l.s.e.) of 0 in the model (3.16) if SZ(0)= inf $2(0).
OEO

If all nij > 0 and we take p* =p, then it can be shown that $2(0)= Y2(0) in the special case u = R - s where the model (3.16) becomes indeed the previous model (3.1). In this sense, the WLS technique is an extension of the modified chi-square estimation technique. If the function x(0) happens to be linear in 0 so that the model becomes f(~r) =XO, (3.20)

then the WLS technique reduces to a solution of linear equations (for 0) X ' H - l ( p * ) X 0 = X'H-~(p*)f(p*). (3,21)

354

Vasant P. Bhapkar

In order that the estimates defined by techniques (i)-(iv) exist, be unique, and also possess some asymptotically desirable properties (see Theorem 3.1) some regularity conditions, C2, are needed (see e.g., Cramer [13], Neyman [23], Birch [9] and Rao [25]). We have assumed that ~r= ~r(0), 0o being the true point in the open set .

(c.2)
(i) %d(O)> 0 for all i,j. (ii) ~(0) is totally differentiable on 0 so that ~r(0) =~r(q~)+ [ ~ as 0--~0 in O. (iii)

]~(O~q~)+o(l,O-q~[,)

Rank[~]=m<<.R-s.
(iv) 0 is identifiable in the sense that, given e > 0, there exists 6 > 0 such that II0-c/,ll >e implies

liar(0) -,,(q~)II >8.


Although the conditions are strictly needed only at 00, since 00 is unknown and could be any point q~ in O, these conditions are needed in effect for the whole . THEOREM 3.1. Assume A: nj/n-+~, 0 < ~ < 1 as n--~oo, and the model (3.1) with the conditions C2. Then, as n---~oo, (i) with probability approaching one, there exists the m.l.e. 0 which satisfies the equation (3.10) (and m.c.e. 0", m.m.c.e. O, etc. which satisfy the corresponding equations), L 1 (ii) n l / 2 ( 0 - 00) --)' N m ( 0 , F - (0o)), where

Iff~r)=Diagonal[Xj-lAj(~rj), j = 1..... s].

The property (ii) holds also for 0", 0 etc.


Estimators like m.l.e., m.c.e., m.m.c.e., etc. satisfying property (ii) have been referred to in the literature as regular best asymptotically normal

A N O V A and M A N O V A

355

(RBAN) estimators.
More generally, consider the model (3.16) and regularity conditions C1 and C3 which are obtained f r o m C2 with appropriate modification as follows: (C3) (i) ~ri, j > 0, all i,j. (ii) x(0) is totally differentiable on . Let then

(iii) R a n k X ( O ) = m < u < R - s. (iv) Given e > O, there exists 8 > 0 such that IIO - ~H > e implies II(O) - x(~)I1 > & THeOReM 3.2. Assume A and the model (3.16) with conditions C1 and C3. Then as n--->oe, (i) with probability approaching one, there exists a w.l.s.e. which satisfies
-o.
L

O0

(ii) n 1/2(~_ 00) _~ N m ( 0 ' r - 1(0o, '~0), where

F(0, ~r) = X'(0) [ F(~r)@(~r)F'(-a) ] - 'X(0),

(3.23)

where ~P is defined by (1.5). Here % is the true ~r such that f(~r0)=X(0o).

It can be verified that for the spedial case (3.1) of (3.16), when ~r=~r(0), F(0, ~r) reduces to F(0) given in (3.22).

4.

Tests of goodness of fit of models

In the notation of Section 3, suppose now it is desired to test goodness of fit of the model (3.1) on the basis of the observed data n. Thus, we n o w want to test the hypothesis Ho: where 0 E O. = (4.1)

356

Vasant P. Bhapkar

Some of the well-known test procedures are the following: (i) Likelihood Ratio Test. The likelihood-ratio criterion, h, is defined by sup L(O; n) )t-----X(n)= o~o = L(0;n) (4.2)

sup
qr

L*(p;.)"

Here L* is function of proportions. X. For large

the likelihood function of ~r, i.e., the expression (1.1) as a qr, given n, 0 is the m.l.e, and p is the observed vector of The hypothesis H o is rejected for sufficiently small values of n, assuming regularity conditions C2, the LR test rejects H 0 if --21ogX>x21_,(R--s--m), which is the quantile of order 1 - a of the x2-distribution with R - s - m d.f., where a is the desired level of significance. Note that
9

- 2 1 o g X = 2 '~ E nij{lgPij-lg%,j(O~)},
j=l i=1

(4.3)

where/~ is the m.l.e, of 0 under (4.1). (ii) (Pearson) Minimum Chi-Square Test. tistic is

The minimum chi-square sta-

X2=-XZ(n) = X2(0*; n),

(4.4)

with 0* defined by (3.4). In practice, however, it is usually computed by substituting the m.l.e. O, rather than 0". The large-sample properties are unaffected by this substitution (see the remark following Theorem 4.1). H 0 is rejected at level of significance a if XZ(n)>x21_~(R-s - m) for large n under regularity conditions C2. (iii) (Neyman) Minimum Modified Chi-Square Test. This test is based on the modified form y2, viz. g 2 = y2(n ) = y2(~; n), (4.5)

with 0 defined by (3.6). It is assumed that all ni,i > 0. If not, some minor modification is needed (see Section 9 for comments in this context). H 0 is rejected at level a if YZ(n)> X ~ _ ~ ( R - s - m ) for large n, assuming conditions C2. (iv) (Kullback) Minimum Discrimination Information Test. If we use the m.d.i.e, on the basis of definition (3.7) of 1(0), it turns out to be the same as m.l.e, as noted in 3(iv). Hence the test based on the minimum value of I(0), defined.by (3.7), happens to be equivalent to the LR test. However, if the m.d.i.e. 0 is obtained as in (3.9), then an alternate test is based on the

A N O VA and M A N O VA

357

minimum discrimination information statistic

I*~I*(n)=I*(~,n).

(4.6)

H 0 is rejected at level a if 21" > X~- ~(R - s - m) for large n. Although the discussion so far is in terms of the freedom-equations specification (4.1) of the hypothesis H o, it applies with appropriate modification (as in Section 3) also to the constraint-equations specification of H o, viz. H 0 : f(~r) = 0, (4.7)

as in (3.12). If f(~") is linear in ~, say F~r+f 0, the m.m.c.e. ~, subject to (4.7), is directly available solving only linear equations. It can, then, be shown (Bhapkar [3]) that the minimum modified chi-square statistic is y 2 _ y2(n ) = f,(p) [ FX(p)F' ] - ~f(p), (4.8)

provided all n~,j> 0. See Section 9 for modifications suggested especially if some ni,j happen to be zero. Neyman [23] suggested the use of the estimate ~ obtained by the linearization technique in Section 3 if f(~r) happens to be non-linear. Bhapkar [4] has shown that the y2 statistic, using such linearization for ~, gives now r 2 - r2(n) = f'(p) [ V(p)~(p)V'(p) ] - 'f(p)

= f'(p)H-'(p)f(p),

(4.9)

with H defined by (3.17). Moreover, he then shows, that this is precisely the form the Wald statistic takes when adapted to the present categorical data problem. Thus, when all n~,j>O and conditions C1 hold, the Wald statistic and the Neyman minimum modified chi-square statistic, using linearization in the non-linear case, are algebraically identical. We would, therefore, call the statistic given by (4.9) also W, the Wald-statistic. Note that (4.8) is, of course, a special form of (4.9) in the linear case. In order to avoid the problem of possible singularity of H in case some n~,j happen to be zero, we define more generally the Wald-statistic by

\[
W = f'(p^*)\, H^{-1}(p^*)\, f(p^*),
\tag{4.10}
\]
with $p^*$ defined by (3.19). See Section 9 for further comments. The hypothesis $H_0$ (4.7) is rejected for large $n$ if $W > \chi^2_{1-\alpha}(u)$.
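As a purely illustrative sketch (in Python, not part of the original treatment; the data and the fitted model below are hypothetical), the statistics (4.3) and the Pearson form may be computed as follows, both evaluated at the same fitted probabilities, as the remark following Theorem 4.1 below permits.

```python
# Illustrative sketch: likelihood-ratio statistic (4.3) and Pearson chi-square,
# for an observed r x s table and fitted cell probabilities obtained under
# some hypothesized model (here, homogeneity, where the m.l.e. pools columns).
import numpy as np
from scipy.stats import chi2

def lr_and_pearson(counts, pi_hat):
    """counts[i, j]: observed frequencies; pi_hat[i, j]: fitted probabilities,
    summing to one within each column (population) j."""
    counts = np.asarray(counts, dtype=float)
    n_j = counts.sum(axis=0)                 # per-population sample sizes
    p = counts / n_j                         # observed proportions p_{i,j}
    expected = pi_hat * n_j                  # fitted expected frequencies
    mask = counts > 0                        # 0 * log 0 is taken as 0
    # -2 log lambda = 2 sum n_{ij} {log p_{ij} - log pi_{ij}(theta_hat)}, cf. (4.3)
    g2 = 2.0 * np.sum(counts[mask] * (np.log(p[mask]) - np.log(pi_hat[mask])))
    x2 = np.sum((counts - expected) ** 2 / expected)   # Pearson form
    return g2, x2

# Hypothetical example: homogeneity of two trinomial populations,
# R - s - m = 6 - 2 - 2 = 2 d.f.
counts = np.array([[20, 30], [15, 10], [15, 10]])
pooled = counts.sum(axis=1) / counts.sum()   # pooled m.l.e. under homogeneity
pi_hat = np.tile(pooled[:, None], (1, 2))
g2, x2 = lr_and_pearson(counts, pi_hat)
print(g2, x2, chi2.sf(x2, df=2))             # statistics and a P-value
```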


The validity of all these large-sample tests for $H_0$, and the fact that it is permissible to use any RBAN estimate (e.g. those produced by (i)-(iv) in Section 3, including linearization if necessary for (iii)) of $\theta$ (or of $\pi$ in the constraints specification) for constructing the statistics in (i)-(iv) here, follow from the following theorem (see Neyman [23], Gokhale and Kullback [15], Birch [9]).

THEOREM 4.1. Assume condition A: $n_j/n \to \lambda_j$, $0 < \lambda_j < 1$, as $n \to \infty$, and the conditions C2. Then each of the statistics $-2\log\lambda$, $X^2$, $Y^2$ and $2I^*$, using any of the estimates satisfying (3.22), has a limiting chi-square distribution with $u = R - s - m$ d.f. if $H_0$ specified by (4.1) is true.

The theorem continues to apply for the specification (4.7) when the estimate $\tilde\pi$, to be substituted in $-2\log\lambda$, $X^2$, $Y^2$, etc., is obtained by any of the techniques (i)-(iv), with linearization if needed, in Section 3.

More generally now, consider the fit of the model (3.16), i.e., suppose the hypothesis $H_0$ is specified by
\[
H_0: f(\pi) = x(\theta).
\tag{4.11}
\]

(v) Weighted Least Squares Test. Let
\[
S^2 \equiv S^2(n) = \inf_{\theta \in \Theta_0} S^2(\theta).
\tag{4.12}
\]
$H_0$ is rejected if $S^2 > \chi^2_{1-\alpha}(u - m)$. In the linear case, when $x(\theta) = X\theta$, note that the w.l.s.e. is given from (3.21) by
\[
\hat\theta = \left( X' H^{-1}(p^*)\, X \right)^{-1} X' H^{-1}(p^*)\, f(p^*),
\tag{4.13}
\]
if $X$ is of full rank $m$ (see conditions C3). Then the WLS statistic becomes
\[
S^2 = f'(p^*)\, H^{-1}(p^*)\, f(p^*) - \hat\theta'\, X' H^{-1}(p^*)\, f(p^*).
\tag{4.14}
\]
If all $n_{i,j} > 0$, $p^*$ could be taken to be $p$; however, see Section 9 for general suggestions in view of (3.19). The validity of the WLS test for large $n$ follows from the following theorem.

THEOREM 4.2. Assume conditions A, C1 and C3. If $H_0$, given by (4.11), is true, then $S^2$ has a limiting chi-square distribution with $u - m$ d.f.

In the special case $u = R - s$, when $f(\pi)$ is essentially the vector of $R - s$ independent elements of $\pi$, $S^2$ reduces to $Y^2$, provided all $n_{i,j} > 0$ and $p^*$ is


taken to be $p$. In this sense, the WLS test is indeed a generalization of the minimum modified chi-square test.

Some other points to be noted concern the relationship between the Wald statistic $W$ and the WLS statistic $S^2$. If we take $x(\theta) \equiv 0$ in the specification (4.11), then indeed $S^2 = W$, with $W$ defined by (4.10). In fact, this relationship can be carried further for the linear case
\[
H_0: f(\pi) = X\theta.
\tag{4.15}
\]
Since $X$ is a $u \times m$ matrix of rank $m < u$, there exists a $(u - m) \times u$ matrix, say $Z$, of rank $u - m$ such that $ZX = 0$. Then the specification (4.15) of $H_0$ is equivalent to the constraint-equations specification
\[
H_0: Z f(\pi) = 0.
\tag{4.16}
\]
The number of independent functions on the left-hand side of the specification (4.16) is $u' = u - m$. If we construct the Wald statistic, say $W'$, of type (4.10) for testing $H_0$ given by (4.16), it turns out that
\[
W' = S^2,
\tag{4.17}
\]
with d.f. $= u' = u - m$. Further remarks regarding the relationship between the Wald technique and the WLS technique, mainly arising out of (4.17), are given in the next section.
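The closed-form computations (4.13) and (4.14) require nothing beyond linear algebra. The following is a minimal sketch (in Python; it assumes that $f(p^*)$, its estimated covariance $H(p^*)$ and the design matrix $X$ have already been formed by the user, and does not construct them).

```python
# Sketch of the WLS computations (4.13)-(4.14); the inputs f, H, X are
# assumed to have been built from the data as in Section 3.
import numpy as np

def wls_fit(f, H, X):
    """f: u-vector f(p*); H: u x u covariance estimate H(p*); X: u x m design
    matrix of full rank m. Returns the w.l.s.e. of (4.13) and the WLS
    criterion S^2 of (4.14), asymptotically chi-square on u - m d.f."""
    Hinv = np.linalg.inv(H)
    theta_hat = np.linalg.solve(X.T @ Hinv @ X, X.T @ Hinv @ f)   # (4.13)
    s2 = f @ Hinv @ f - theta_hat @ (X.T @ Hinv @ f)              # (4.14)
    return theta_hat, s2
```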

5. Tests for nested models

Frequently we are interested in testing the goodness of fit of two (or more) models such that one is nested within the other. Suppose the model $M_1$, given by the freedom-equations specification $M_1: \pi = \pi_1(\theta)$, or the constraint equations
\[
M_1: f_1(\pi) = 0,
\tag{5.1}
\]
is accepted either a priori or, as is usually the case in practice, on the basis of a preliminary test of significance. Here $f_1$ is a vector of $u = R - s - m$ independent functions in the notation of Sections 3 and 4.

Consider now a model $M_2$ which is stronger than $M_1$ in the sense that it places more restrictions on $\pi$. Then $\Pi_{M_2}$ is a subset of $\Pi_{M_1}$ in the notation


(3.11) and (3.12), and we say that $M_2$ is nested within $M_1$ (to be written as $M_2 \subset M_1$). Thus $M_2$ is specified either by
\[
M_2: \pi = \pi_1(\theta), \quad g(\theta) = 0,
\tag{5.2}
\]
or by the constraint equations
\[
M_2: f_1(\pi) = 0, \quad f_2(\pi) = 0.
\tag{5.3}
\]

In (5.2) we assume that the $v$ functions $g$ are continuously differentiable on $\Theta$, and are independent in the sense that the matrix
\[
G(\theta) \equiv \left[ \frac{\partial g(\theta)}{\partial \theta'} \right]_{v \times m}
\tag{5.4}
\]
is of rank $v \le m$. Eliminating $v$ $\theta$'s, we could, in theory, express $M_2$ as
\[
M_2: \pi = \pi_2(\eta),
\tag{5.5}
\]
where $\eta$ is the vector of the $m - v$ remaining $\theta$'s. We assume that the conditions C2 continue to hold for the functions $\pi_2$ if the model $M_2$ is true. For the corresponding constraint-equations specification (5.3) of $M_2$, we then assume that the $v$ additional functions $f_2$ are such that $f_1$ and $f_2$ together satisfy conditions of type C1. Let then
\[
F_2(\pi) \equiv \left[ \frac{\partial f_2(\pi)}{\partial \pi'} \right]_{v \times R}.
\]
We thus assume that
\[
\operatorname{Rank} \begin{bmatrix} F_1(\pi) \\ F_2(\pi) \\ E \end{bmatrix}_{(u+v+s) \times R} = u + v + s,
\tag{5.6}
\]
in the notation of C1.

Let $-2\log\lambda_i$, $X_i^2$, $Y_i^2$ (or $W_i$), $2I_i^*$ be the large-sample chi-square criteria with $d_i$ d.f. for testing $M_i$, $i = 1, 2$, respectively. Then $d_1 = u = R - s - m$ and $d_2 = u + v = R - s - (m - v)$. Let $H_{2/1}$ be the hypothesis that $M_2$ holds, given that $M_1$ is true. Then the appropriate large-sample chi-square criteria for testing $H_{2/1}$ are $-2(\log\lambda_2 - \log\lambda_1)$, $X_2^2 - X_1^2$, $Y_2^2 - Y_1^2$ (i.e., $W_2 - W_1$),


and $2(I_2^* - I_1^*)$; each of these carries $d_2 - d_1 = v$ d.f. For the validity of these criteria, we refer to Neyman [23] and Gokhale and Kullback [15].

A similar extension is available for more general models of the type (4.11). Thus, let now
\[
M_1: f(\pi) = x_1(\theta) \quad \text{and} \quad M_2: \begin{cases} f(\pi) = x_1(\theta), \\ g(\theta) = 0; \end{cases}
\tag{5.7}
\]
eliminating $v$ $\theta$'s, we may express $M_2$ also as
\[
M_2: f(\pi) = x_2(\eta).
\tag{5.8}
\]
Assume that condition C1 for $f$ and conditions of type C3 for $x_2$ are satisfied. Let $S_1^2(\theta)$, $S_2^2(\eta)$ be defined as in (3.18) with respect to the models $M_1$, $M_2$ respectively, and similarly $S_1^2$, $S_2^2$ as the corresponding WLS chi-square criteria on $u - m$ and $u - (m - v)$ d.f. The appropriate large-sample statistic for testing $M_2$, given $M_1$, is
\[
S^2 = S_2^2 - S_1^2
\tag{5.9}
\]
on $v$ d.f.

For the linear case, where $x(\theta) = X\theta$ and $g(\theta) = G\theta + g_0$, the statistic (5.9) can be obtained without any need to use an iterative technique, as pointed out in Section 3. In fact, one can then show that
\[
S^2 = (G\hat\theta + g_0)' \left[ G \left( X' H^{-1}(p^*)\, X \right)^{-1} G' \right]^{-1} (G\hat\theta + g_0),
\tag{5.10}
\]
where $\hat\theta$ is the w.l.s.e. given by (4.13). In view of the equivalence between the Wald statistic and the WLS statistic as symbolized by (4.17), we can then see that indeed
\[
S^2 = S_2^2 - S_1^2 = W_2' - W_1',
\tag{5.11}
\]
where $W_i'$, $i = 1, 2$, are the Wald statistics appropriate to test the hypotheses specified by the constraint equations corresponding to the models $M_1$ and $M_2$, respectively, in (5.7).

We might make here a general comment concerning such nested models. On the one hand, it is possible for a weaker model $M_1$ to hold, while the stronger model $M_2$, nested within $M_1$, is ruled out for lack of fit. On the other hand, it might happen that the stronger model $M_2$ is not ruled out, perhaps because the overall test is not sensitive enough; however, a more


sensitive test directed at some specific sub-features of $M_2$, say $M_1$, might succeed in establishing significance. An illustration of this type, for example, is provided in Section 7 (i and ii).
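As an illustrative sketch of the difference criterion (5.9) (the numerical inputs below are placeholders, not data from this chapter):

```python
# Two nested models are judged by the drop in the WLS (or any of the
# asymptotically equivalent) chi-square criteria; cf. (5.9) and (5.11).
from scipy.stats import chi2

s2_m1, df_m1 = 3.1, 4       # weaker model M1: criterion and d.f. (u - m)
s2_m2, df_m2 = 11.8, 6      # stronger nested model M2, d.f. = u - (m - v)

s2_diff = s2_m2 - s2_m1               # (5.9), on v = df_m2 - df_m1 d.f.
p_value = chi2.sf(s2_diff, df_m2 - df_m1)
print(s2_diff, p_value)
```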

6. Some models for one population

Consider first the case $s = 1$, where the subscript $j$ is then deleted. We are interested in fitting models of various types of association and symmetry among the $k$ responses. Some of these are described below.

k = 2

(i) Independence of Two Responses. The model of independence is
\[
H_0: \pi_{i_1,i_2} = \pi_{i_1,0}\, \pi_{0,i_2}; \qquad i_a = 1, \ldots, r_a, \; a = 1, 2;
\tag{6.1}
\]
in the log-linear representation (2.2), $H_0$ is equivalently stated as
\[
H_0: \alpha^{(1,2)}_{i_1,i_2} = 0, \quad \text{all } i_1, i_2.
\]

The m.l.e. are $\hat\pi_{i_1,i_2} = n_{i_1,0}\, n_{0,i_2} / n^2$, and, using these,
\[
X^2 = \sum_{i_1} \sum_{i_2} \frac{\left( n_{i_1,i_2} - n_{i_1,0}\, n_{0,i_2}/n \right)^2}{n_{i_1,0}\, n_{0,i_2}/n},
\tag{6.2}
\]
with d.f. $= (r_1 - 1)(r_2 - 1)$. If we define

\[
\Delta_{i_1,i_2} \equiv \frac{\pi_{i_1,i_2}\, \pi_{r_1,r_2}}{\pi_{i_1,r_2}\, \pi_{r_1,i_2}}, \qquad i_1 = 1, \ldots, r_1 - 1, \; i_2 = 1, \ldots, r_2 - 1,
\tag{6.3}
\]
then (6.1) is true if, and only if, $\Delta_{i_1,i_2} = 1$ for all $i_1, i_2$. Thus, the $\Delta$'s (or rather the $\log\Delta$'s) may be taken as measures of association between the two responses. Indeed,
\[
\log \Delta_{i_1,i_2} = \alpha^{(1,2)}_{i_1,i_2} - \alpha^{(1,2)}_{i_1,r_2} - \alpha^{(1,2)}_{r_1,i_2} + \alpha^{(1,2)}_{r_1,r_2}.
\]
One thus requires $(r_1 - 1)(r_2 - 1)$ separate measures to describe completely the nature of association between two responses. If we are interested in only one particular feature of association, some overall measures of association can be constructed. See Goodman and Kruskal [17] for such measures. For the simple case $r_1 = r_2 = 2$, we have only one $\Delta$, viz.
\[
\Delta = \frac{\pi_{1,1}\, \pi_{2,2}}{\pi_{2,1}\, \pi_{1,2}};
\tag{6.4}
\]


this is the cross-product ratio for a $2 \times 2$ table with two responses. The statistic (6.2) takes the form
\[
X^2 = \frac{n \left( n_{1,1}\, n_{2,2} - n_{1,2}\, n_{2,1} \right)^2}{n_{1,0}\, n_{2,0}\, n_{0,1}\, n_{0,2}},
\tag{6.5}
\]
with 1 d.f. It is recommended to use the continuity correction, whereby the absolute value of the numerator term, before squaring, is reduced by $n/2$ in order to improve the chi-square approximation. For the exact procedure to test $H_0$, especially in the $2 \times 2$ case, see Section 10.
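A sketch of the computation of (6.2), with the continuity correction of (6.5) as an option in the $2 \times 2$ case (illustrative Python; the counts are hypothetical):

```python
# Pearson statistic (6.2) for independence in an r1 x r2 table; in the 2 x 2
# case the deviations |n_{ij} - e_{ij}| all equal |n11 n22 - n12 n21|/n, so
# reducing each by 1/2 matches the n/2 correction of (6.5).
import numpy as np
from scipy.stats import chi2

def independence_x2(counts, continuity=False):
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    expected = np.outer(counts.sum(axis=1), counts.sum(axis=0)) / n
    dev = np.abs(counts - expected)
    if continuity and counts.shape == (2, 2):
        dev = np.maximum(dev - 0.5, 0.0)
    x2 = np.sum(dev ** 2 / expected)
    df = (counts.shape[0] - 1) * (counts.shape[1] - 1)
    return x2, df, chi2.sf(x2, df)

print(independence_x2([[12, 5], [7, 16]], continuity=True))
```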

Symmetries in square tables


Consider now a square table with $r_2 = r_1$. When two responses are expected to be considerably similar, the focus of interest shifts from the investigation of the mere existence of association to that of the nature of association. In (ii)-(iv) we consider some models of different types of symmetry.

(ii) Complete Symmetry of Two Responses is specified by
\[
H_0: \pi_{i_1,i_2} = \pi_{i_2,i_1}, \quad \text{all } i_1, i_2.
\tag{6.6}
\]

The m.l.e. are $\hat\pi_{i_1,i_2} = (n_{i_1,i_2} + n_{i_2,i_1})/2n$, and, using these,
\[
X^2 = \mathop{\sum\sum}_{i_1 < i_2} \frac{\left( n_{i_1,i_2} - n_{i_2,i_1} \right)^2}{n_{i_1,i_2} + n_{i_2,i_1}},
\tag{6.7}
\]
with d.f. $= r_1(r_1 - 1)/2$. For the case $r_1 = 2$, the statistic (6.7) reduces to the McNemar statistic
\[
X^2 = \frac{\left( n_{1,2} - n_{2,1} \right)^2}{n_{1,2} + n_{2,1}},
\tag{6.8}
\]

with 1 d.f. It is recommended to use the continuity correction, whereby the numerator term is reduced in absolute value by one before squaring. For the exact procedure see Section 10.

(iii) Marginal Symmetry of Two Responses. Instead of the relatively strong model (6.6), we might want to check whether the weaker model of marginal symmetry (or homogeneity), specified by
\[
H_0: \pi_{i,0} = \pi_{0,i}, \qquad i = 1, \ldots, r_1,
\tag{6.9}
\]
is applicable.


Under $H_0$ ($r_1$ arbitrary), a direct m.l.e. solution is not available. For explicit large-sample chi-square criteria with $r_1 - 1$ d.f., including the Wald statistic, refer to [4]. For the special case $r_1 = 2$, the hypotheses specified by (6.6) and (6.9) are equivalent. In this case one could then use the statistic given by (6.8), preferably with the continuity correction.

(iv) Quasi-Symmetry in a Square Table. Another model, as in (iii), which is implied by the stronger model of complete symmetry is the model of quasi-symmetry, which pertains to symmetries between the rows and columns of the square, regardless of the marginal totals in (6.9). A convenient way to specify the hypothesis of quasi-symmetry is in terms of the log-linear representation (2.2), viz.
\[
H_0: \alpha^{(1,2)}_{i_1,i_2} = \alpha^{(1,2)}_{i_2,i_1}, \quad \text{all } i_1, i_2.
\tag{6.10}
\]
The two models in (iii) and (iv), viz. (6.9) and (6.10), together are equivalent to the model (6.6) of complete symmetry. The constraint-equations specification of quasi-symmetry, equivalent to (6.10), is given by
\[
\pi_{i_1,i_2}\, \pi_{i_2,i_3}\, \pi_{i_3,i_1} = \pi_{i_1,i_3}\, \pi_{i_3,i_2}\, \pi_{i_2,i_1},
\tag{6.11}
\]
for all $i_1$, $i_2$ and $i_3$. Large-sample chi-square criteria with d.f. $= (r_1 - 1)(r_1 - 2)/2$ are then available for testing the hypothesis of quasi-symmetry specified by (6.10) (or, equivalently, (6.11)).
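The symmetry criteria (6.7) and (6.8) admit an equally direct computation; the following sketch (hypothetical data, and without the continuity correction for the $r_1 = 2$ case) is purely illustrative:

```python
# Complete-symmetry statistic (6.7) for a square r x r table, which reduces
# to the McNemar statistic (6.8) when r = 2. Pairs with a zero off-diagonal
# total are skipped; see the remarks on zero cells in Section 9.
import numpy as np
from scipy.stats import chi2

def symmetry_x2(counts):
    counts = np.asarray(counts, dtype=float)
    r = counts.shape[0]
    x2 = 0.0
    for i in range(r):
        for j in range(i + 1, r):
            d = counts[i, j] + counts[j, i]
            if d > 0:
                x2 += (counts[i, j] - counts[j, i]) ** 2 / d
    df = r * (r - 1) // 2
    return x2, df, chi2.sf(x2, df)

print(symmetry_x2([[20, 8, 2], [15, 30, 5], [6, 9, 25]]))
```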
k = 3

With three responses, there are different models of association that might be of interest. Some such models are described in (v)-(viii).

(v) Complete Independence of Three Responses. This model is specified by
\[
H_0: \pi_{i_1,i_2,i_3} = \pi_{i_1,0,0}\, \pi_{0,i_2,0}\, \pi_{0,0,i_3}
\tag{6.12}
\]
for all $i_a$, $a = 1, 2, 3$. The equivalent specification in terms of the $\alpha$'s in the log-linear representation is
\[
\alpha^{(1,2)}_{i_1,i_2} = \alpha^{(1,3)}_{i_1,i_3} = \alpha^{(2,3)}_{i_2,i_3} = \alpha^{(1,2,3)}_{i_1,i_2,i_3} = 0.
\tag{6.13}
\]
The m.l.e. solution is directly available, viz.
\[
\hat\pi_{i_1,i_2,i_3} = \frac{n_{i_1,0,0}\, n_{0,i_2,0}\, n_{0,0,i_3}}{n^3},
\tag{6.14}
\]


and the large-sample chi-square criteria using (6.14) are immediately available; these have d.f. $= r_1 r_2 r_3 - (r_1 + r_2 + r_3 - 2)$.

(vi) Multiple Independence of Response 1 and the Set of Responses (2, 3). This is specified by
\[
H_0: \pi_{i_1,i_2,i_3} = \pi_{i_1,0,0}\, \pi_{0,i_2,i_3},
\tag{6.15}
\]
or equivalently
\[
\alpha^{(1,2)}_{i_1,i_2} = \alpha^{(1,3)}_{i_1,i_3} = \alpha^{(1,2,3)}_{i_1,i_2,i_3} = 0.
\tag{6.16}
\]
The direct m.l.e. solution is
\[
\hat\pi_{i_1,i_2,i_3} = \frac{n_{i_1,0,0}\, n_{0,i_2,i_3}}{n^2},
\tag{6.17}
\]
and the large-sample chi-square criteria have d.f. $= (r_1 - 1)(r_2 r_3 - 1)$. In fact, this case (vi) is merely a special case of (i), with the second subscript in (i) further partitioned into a double subscript.

(vii) Conditional Independence of Responses 1 and 2, given the Third Response. If the level $i_3$ of the third response is given, then the first two responses are conditionally independent if
\[
H_0: \pi_{i_1,i_2,i_3}\, \pi_{0,0,i_3} = \pi_{i_1,0,i_3}\, \pi_{0,i_2,i_3}.
\tag{6.18}
\]
This model is equivalently expressed as
\[
\alpha^{(1,2)}_{i_1,i_2} = \alpha^{(1,2,3)}_{i_1,i_2,i_3} = 0.
\tag{6.19}
\]
A direct solution is available for the m.l.e., viz.
\[
\hat\pi_{i_1,i_2,i_3} = \frac{n_{i_1,0,i_3}\, n_{0,i_2,i_3}}{n\, n_{0,0,i_3}}.
\tag{6.20}
\]
Using this, one gets the Pearson chi-square statistic
\[
X^2 = \sum_{i_3} \sum_{i_1} \sum_{i_2} \frac{\left( n_{i_1,i_2,i_3} - n_{i_1,0,i_3}\, n_{0,i_2,i_3}/n_{0,0,i_3} \right)^2}{n_{i_1,0,i_3}\, n_{0,i_2,i_3}/n_{0,0,i_3}},
\tag{6.21}
\]
with d.f. $= (r_1 - 1)(r_2 - 1) r_3$.


(viii) No Second-Order Interaction Among the Three Responses. This model is specified by
\[
H_0: \alpha^{(1,2,3)}_{i_1,i_2,i_3} = 0,
\tag{6.22}
\]
using the classical linear models analogy. The concept of no second-order interaction among three responses allows the existence of first-order interactions (or associations) between any two responses of the three. The second-order interaction is said to be absent if the measures of association (of type (6.3)) between any two responses, given the level of the remaining response, are independent of that level. Starting from such an approach, we get the constraint-equations specification of this requirement, viz.
\[
\frac{\pi_{i_1,i_2,i_3}\, \pi_{r_1,r_2,i_3}\, \pi_{i_1,r_2,r_3}\, \pi_{r_1,i_2,r_3}}{\pi_{r_1,i_2,i_3}\, \pi_{i_1,r_2,i_3}\, \pi_{i_1,i_2,r_3}\, \pi_{r_1,r_2,r_3}} = 1,
\tag{6.23}
\]
for all $i_a = 1, \ldots, r_a - 1$, $a = 1, 2, 3$. It turns out that this formulation (6.23), due to Roy and Kastenbaum [26], is equivalent to the log-linear specification (6.22). For the model specified by (6.22) or (6.23), a direct m.l.e. solution is not available. Hence, for computing the likelihood-ratio or minimum chi-square statistics, some iterative algorithm is needed. However, the Wald statistic can be obtained relatively easily from (6.23), or rather from its logarithmic version (see, e.g., [8], [16]). These criteria have $(r_1 - 1)(r_2 - 1)(r_3 - 1)$ d.f.

Symmetries in cubic tables


Here we have $r_3 = r_2 = r_1$. As in the two-dimensional square table, we are now interested in exploring whether the association pattern among the three responses is symmetric. In (ix)-(xi) we consider different types of symmetric pattern.

(ix) Complete Symmetry of Three Responses is represented by the hypothesis
\[
H_{03}: \pi_{i_1,i_2,i_3} = \pi_{i_{j_1}, i_{j_2}, i_{j_3}}
\tag{6.24}
\]
for every permutation $(j_1, j_2, j_3)$ of $(1, 2, 3)$, with $i_a = 1, \ldots, r_1$, $a = 1, 2, 3$. The m.l.e. are
\[
\hat\pi_{i_1,i_2,i_3} = \frac{1}{6n} \sum_{p} n_{i_{j_1}, i_{j_2}, i_{j_3}},
\]


where $\sum_p$ denotes the sum over the six permutations $(j_1, j_2, j_3)$ of $(1, 2, 3)$; note here that in $\sum_p$ some terms occur more than once if two or more $i_a$ coincide. The Pearson chi-square criterion can be immediately obtained using the m.l.e.; the d.f. for testing $H_{03}$ is $r_1(r_1 - 1)(5r_1 + 2)/6$.

(x) Marginal Symmetries of Order One and Two. The hypotheses postulating symmetry (or homogeneity) of the one- and two-dimensional marginal probabilities are, respectively,
\[
H_{01}: \pi_{i,0,0} = \pi_{0,i,0} = \pi_{0,0,i}
\tag{6.25}
\]
and
\[
H_{02}: \begin{cases} \pi_{i,i',0} = \pi_{i,0,i'} = \pi_{0,i,i'}, \\ \pi_{i,i',0} = \pi_{i',i,0}, \end{cases}
\tag{6.26}
\]

for all $i, i' = 1, \ldots, r_1$. Direct solutions for the m.l.e. are not available, either under $H_{01}$ or under $H_{02}$. However, the Wald statistics can be constructed, with d.f. $= 2(r_1 - 1)$ and $(r_1 - 1)(5r_1 - 2)/2$ respectively, for testing $H_{01}$ and $H_{02}$. In view of the relationship $H_{03} \Rightarrow H_{02} \Rightarrow H_{01}$, difference-statistics can be used as described in Section 5.

(xi) Quasi-Symmetries of Order One and Two. The log-linear representation is more convenient to use in order to specify the hypotheses of quasi-symmetry, $H_0^{(Q)}$ and $H_{02}^{(Q)}$, of order 1 and 2 respectively. We have
\[
\alpha^{(1,2,3)}_{i_{j_1}, i_{j_2}, i_{j_3}} = \alpha^{(1,2,3)}_{i_1, i_2, i_3}
\tag{6.27}
\]
for every permutation $(j_1, j_2, j_3)$ of $(1, 2, 3)$; similarly
\[
\alpha^{(1,2)}_{i,i'} = \alpha^{(1,3)}_{i,i'} = \alpha^{(2,3)}_{i,i'}, \qquad \alpha^{(1,2)}_{i,i'} = \alpha^{(1,2)}_{i',i},
\tag{6.28}
\]
for all $i, i'$. Here $H_{02}^{(Q)}$ is specified by (6.27) alone, while $H_0^{(Q)}$ is specified by (6.27) and (6.28) together: $H_{02}^{(Q)}$ implies a symmetric pattern for the interaction effects of second order, while $H_0^{(Q)}$ requires such a symmetric pattern for the interaction terms of both second and first order. It can be shown that the quasi-symmetry and marginal-symmetry concepts are complementary to each other in the sense that
\[
H_{03} = H_{01} \text{ and } H_0^{(Q)}, \qquad H_{03} = H_{02} \text{ and } H_{02}^{(Q)}.
\tag{6.29}
\]
Direct m.l.e. are not available for testing either $H_0^{(Q)}$ or $H_{02}^{(Q)}$. The d.f. for the large-sample chi-square criteria (in Section 4) are $(r_1 - 1)(5r_1^2 + 2r_1 - 12)/6$ and $(r_1 - 1)(r_1 - 2)(5r_1 - 3)/6$ for testing $H_0^{(Q)}$ and $H_{02}^{(Q)}$ respectively.

General k


The concepts (and hypotheses) of complete independence, multiple independence and conditional independence of sets of responses can be defined as in (v)-(vii). The m.l.e. are directly available and, thus, large-sample chi-square criteria can be easily constructed (see, e.g., [27], [20] and [6]). The hypothesis of no interaction of order $k - 1$ among the $k$ responses is defined, as in (6.22), using the log-linear representation. Alternatively, it is possible to define such a hypothesis in terms of constraints of type (6.23), on the basis of suitable measures within a given level of the remaining response. In either case, no direct m.l.e. are available for the case $k \ge 3$. The concepts (and hypotheses) of marginal and quasi-symmetry can be extended to the general case of a hypercube with the same number $r_1$ of categories along each dimension. The hypothesis of marginal symmetry of order 1, viz.
\[
H_{01}: \pi_{i,0,\ldots,0} = \pi_{0,i,0,\ldots,0} = \cdots = \pi_{0,\ldots,0,i},
\tag{6.30}
\]

has been discussed in the literature quite extensively, especially for the case $r_1 = 2$. Such a problem arises naturally in the context of 'matched samples', as considered by Cochran [11], who devised an exact conditional test for comparing $k$ 'treatments' and showed that a large-sample chi-square test can be used for comparing such $k$ matched binary responses. It should be noted that Cochran's test is designed for testing $H_{0k}$ and that it is not necessarily valid for testing $H_{01}$ by itself, unless a certain extraneous condition is satisfied. The reader is referred to [7] for some of these details and also for a discussion concerning the Wald statistic for testing $H_{01}$.

7. ANOVA models for several populations

In this section we consider ANOVA-type models dealing with the effects of different levels of one or more factors on one response.

k = 1, l = 1

(i) Homogeneity of s Populations. This hypothesis is specified by
\[
H_0: \pi_{i,1} = \pi_{i,2} = \cdots = \pi_{i,s}, \qquad i = 1, \ldots, r.
\tag{7.1}
\]

In terms of the log-linear representation (2.6), $H_0$ may be equivalently specified by
\[
H_0: \gamma_{i,j} = 0, \quad \text{all } i, j.
\tag{7.2}
\]


The m.l.e. are given by $\hat\pi_{i,j} = n_{i,0}/n$ and, using these, we get the well-known Pearson chi-square statistic
\[
X^2 = \sum_j \sum_i \frac{\left( n_{i,j} - n_j\, n_{i,0}/n \right)^2}{n_j\, n_{i,0}/n},
\tag{7.3}
\]
with d.f. $= (r - 1)(s - 1)$. Note here the formal similarity of the statistics (6.2) and (7.3), of the d.f., and also of the log-linear representations of the hypotheses. Comments similar to those in 6(i) apply here for the $2 \times 2$ table. Define now
\[
\Delta_{i,j} \equiv \frac{\pi_{i,j}\, \pi_{r,s}}{\pi_{i,s}\, \pi_{r,j}}, \qquad i = 1, \ldots, r - 1, \; j = 1, \ldots, s - 1.
\]

Then the homogeneity hypothesis (7.1) holds if and only if all $\Delta_{i,j} = 1$. In this sense, the $\Delta$'s or, rather, the $\log\Delta$'s could be considered measures of factor effects on the response. Indeed, $\log \Delta_{i,j} = \gamma_{i,j} - \gamma_{i,s} - \gamma_{r,j} + \gamma_{r,s}$. Refer to [17] for other measures.

Often, in practice, the statistic of type (7.3) merely confirms the existence of the effect of the factor level on the response, in view of a large significant value of the statistic. One would then like to explore further, if possible, the nature of this dependence. Some models that can be explored for goodness of fit, on the basis of some structure for the categories of the response and/or the factor, are given in (ii) and (iii).

(ii) Mean Homogeneity of s Populations. Suppose we are given, on a priori considerations, suitable scores $a_i$ for the $i$-th response level, such that $\sum_i a_i \pi_{i,j}$, the expected score for the $j$-th population, is of interest. Then the hypothesis of mean homogeneity is
\[
H_0: \sum_i a_i \pi_{i,1} = \sum_i a_i \pi_{i,2} = \cdots = \sum_i a_i \pi_{i,s}.
\tag{7.4}
\]

Usually this hypothesis is much weaker than the previous hypothesis (7.1) of homogeneity, in the sense that (7.4) might hold even if (7.1) does not apply; for the case of a binary response (i.e., $r = 2$), however, the two are equivalent. For the general case $r > 2$, the m.l.e. under the constraints (7.4) are no longer directly available. However, the freedom-equations specification of $H_0$ (7.4), viz.
\[
H_0: \sum_i a_i \pi_{i,j} = \theta, \qquad j = 1, \ldots, s,
\tag{7.5}
\]
is suitable for the application of the weighted least squares technique.

The WLS statistic is
\[
S^2 = \sum_{j=1}^{s} \frac{f_j^2}{h_j^2} - \frac{\left( \sum_{j=1}^{s} f_j / h_j^2 \right)^2}{\sum_{j=1}^{s} \left( 1 / h_j^2 \right)},
\tag{7.6}
\]
with d.f. $= s - 1$, where
\[
f_j \equiv \sum_i a_i p_{i,j}, \qquad h_j^2 = \Big[ \sum_i a_i^2 p_{i,j} - \big( \sum_i a_i p_{i,j} \big)^2 \Big] / n_j .
\]
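The computation behind (7.6) may be sketched as follows (illustrative Python; the scores and counts are hypothetical):

```python
# Mean-homogeneity WLS criterion (7.6): mean response scores f_j with
# estimated variances h_j^2, combined by weighted least squares into a
# chi-square criterion on s - 1 d.f.
import numpy as np
from scipy.stats import chi2

def mean_homogeneity_s2(counts, scores):
    """counts[i, j]: frequency of response level i in population j;
    scores[i]: a priori score a_i for level i."""
    counts = np.asarray(counts, dtype=float)
    a = np.asarray(scores, dtype=float)
    n_j = counts.sum(axis=0)
    p = counts / n_j
    f = a @ p                                    # f_j = sum_i a_i p_{ij}
    h2 = (a**2 @ p - f**2) / n_j                 # h_j^2 as in (7.6)
    w = 1.0 / h2
    s2 = np.sum(w * f**2) - (np.sum(w * f))**2 / np.sum(w)   # (7.6)
    df = counts.shape[1] - 1
    return s2, df, chi2.sf(s2, df)

print(mean_homogeneity_s2([[12, 20, 25], [18, 15, 10], [10, 5, 5]],
                          scores=[0, 1, 2]))
```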

We are likely to encounter two different types of situations. In one, the weaker model (7.4) provides a reasonable fit while the stronger model (7.1) does not. But in the other situation, while the overall test (of type (7.3)) with $(r-1)(s-1)$ d.f. for the stronger model (7.1) happens to be too insensitive to reveal the differences, the test (of type (7.6)) with $(s-1)$ d.f., specifically directed towards the weaker (or specialized) model (7.4), succeeds in bringing out these differences.

(iii) Linearity of Regression of Mean Scores. This type of model is worth consideration if the model (7.4) happens to be rejected on the basis of the statistic (7.6), especially in the case where a suitable system of scores, say $b_j$, is available for the (ordered) categories (or levels) of the factor. In order to test the linearity of regression of the mean response score on the factor score, we formulate the hypothesis
\[
H_0: \sum_i a_i \pi_{i,j} = \theta_1 + \theta_2 b_j, \qquad j = 1, \ldots, s.
\tag{7.7}
\]
Here $\theta_2$ may be interpreted as the regression coefficient of the response score on the factor score. The reader is referred to [5] for the details concerning the WLS statistic (with $s - 2$ d.f.) for testing the goodness of fit of the model (7.7). Notice that (7.5) is a model nested within (7.7), with $\theta_2 = 0$. Hence, if (7.7) is acceptable, the test of the model (7.5) of mean homogeneity may be interpreted as the test of the hypothesis $\theta_2 = 0$, given the model (7.7). The appropriate statistic with 1 d.f. is $S_2^2 - S_1^2$, where $S_i^2$, $i = 1, 2$, are the WLS statistics for testing the hypotheses (7.5) and (7.7) respectively.

For the special binary case (i.e., $r = 2$), the formulation (7.7) is essentially equivalent to
\[
H_0: \pi_{1,j} = \theta_1 + \theta_2 b_j, \qquad j = 1, \ldots, s.
\tag{7.8}
\]
This specification is somewhat restrictive in view of the constraints $0 < \pi_{1,j} < 1$, although in some cases it might provide a good fit.


An alternative approach is in terms of logits. The logit $\lambda_j$ for the $j$-th binomial population is defined by
\[
\lambda_j = \log \frac{\pi_{1,j}}{\pi_{2,j}}.
\]
The populations are identical only if $\lambda_1 = \lambda_2 = \cdots = \lambda_s$. If not, one could explore the fit of the model of linearity of regression of logits, viz.
\[
H_0: \lambda_j = \theta_1 + \theta_2 b_j, \qquad j = 1, \ldots, s.
\tag{7.9}
\]
Such linear models in terms of logits are referred to as linear logistic models (see [2], [12]). The weighted least squares technique permits easy computation of the appropriate WLS statistics by taking $f_j(\pi) \equiv \lambda_j$. For example, to test homogeneity we get a statistic of type (7.6) with $s - 1$ d.f., taking $f_j = \log(p_{1,j}/p_{2,j})$ and $h_j^2 = (1/n_{1,j}) + (1/n_{2,j})$.
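A corresponding sketch for the logit version just described (hypothetical counts, all assumed positive; see Section 9 for zero-cell adjustments):

```python
# Homogeneity of s binomial populations tested on the logit scale via a
# statistic of type (7.6), using f_j = log(p_{1j}/p_{2j}) and
# h_j^2 = 1/n_{1j} + 1/n_{2j}.
import numpy as np
from scipy.stats import chi2

def logit_homogeneity(n1, n2):
    """n1[j], n2[j]: counts for the two response categories in population j."""
    n1, n2 = np.asarray(n1, float), np.asarray(n2, float)
    f = np.log(n1 / n2)            # f_j = log(p_{1j}/p_{2j})
    h2 = 1.0 / n1 + 1.0 / n2       # h_j^2 = 1/n_{1j} + 1/n_{2j}
    w = 1.0 / h2
    s2 = np.sum(w * f**2) - (np.sum(w * f))**2 / np.sum(w)
    df = len(n1) - 1
    return s2, df, chi2.sf(s2, df)

print(logit_homogeneity([15, 22, 30], [35, 28, 20]))
```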

k = 1, l = 2

Here we have a three-dimensional table $\{n_{i,j_1,j_2}\}$, assuming that all combinations $(j_1, j_2)$ of the levels of the two factors are sampled. The probabilities $\pi_{i,j_1,j_2}$ satisfy the constraints $\sum_i \pi_{i,j_1,j_2} = 1$ for all $j_l = 1, \ldots, s_l$, $l = 1, 2$. The corresponding log-linear representation is, say,
\[
\log \pi_{i,j_1,j_2} = \mu + \alpha_i + \beta^{(1)}_{j_1} + \beta^{(2)}_{j_2} + \beta^{(1,2)}_{j_1,j_2} + \gamma^{(1)}_{i,j_1} + \gamma^{(2)}_{i,j_2} + \gamma^{(1,2)}_{i,j_1,j_2}.
\tag{7.10}
\]

(iv) No Effect of Two Factors. The hypothesis is specified by
\[
H_0: \pi_{i,j_1,j_2} = \theta_{i,*,*},
\tag{7.11}
\]
where an asterisk in place of a subscript denotes that the parameter does not depend on that subscript. The hypothesis is equivalently specified by
\[
\gamma^{(1)}_{i,j_1} = \gamma^{(2)}_{i,j_2} = \gamma^{(1,2)}_{i,j_1,j_2} = 0.
\tag{7.12}
\]
The relations (7.12) imply that the $\beta$-terms vanish, in view of the design constraints mentioned in Section 2. The m.l.e. are given by $\hat\pi_{i,j_1,j_2} = n_{i,0,0}/n$, and, using these, we have
\[
X^2 = \sum_{j_2} \sum_{j_1} \sum_i \frac{\left( n_{i,j_1,j_2} - n_{j_1,j_2}\, n_{i,0,0}/n \right)^2}{n_{j_1,j_2}\, n_{i,0,0}/n},
\tag{7.13}
\]


with d.f. $= (r - 1)(s_1 s_2 - 1)$. Actually, this case is a special case of (i), with $j$ in (i) written as a double subscript in (iv).

(v) No Effect of Factor 1. The more interesting question here is whether one factor, say the first, has any effect on the response in the presence of the second factor. The corresponding hypothesis is
\[
H_0: \pi_{i,j_1,j_2} = \theta_{i,*,j_2};
\tag{7.14}
\]
this is equivalent to the specification
\[
\gamma^{(1)}_{i,j_1} = \gamma^{(1,2)}_{i,j_1,j_2} = 0.
\tag{7.15}
\]
Note that (7.15) implies that the $\beta^{(1)}$ and $\beta^{(1,2)}$ terms vanish. The m.l.e. are seen to be $\hat\pi_{i,j_1,j_2} = n_{i,0,j_2}/n_{0,0,j_2}$ and, using these, we get the Pearson chi-square statistic
\[
X^2 = \sum_{j_2} \sum_{j_1} \sum_i \frac{\left( n_{i,j_1,j_2} - n_{j_1,j_2}\, n_{i,0,j_2}/n_{0,0,j_2} \right)^2}{n_{j_1,j_2}\, n_{i,0,j_2}/n_{0,0,j_2}},
\tag{7.16}
\]

with d.f. $= (r - 1)(s_1 - 1)s_2$. We can write $X^2 = \sum_{j_2} X^2_{j_2}$, where $X^2_{j_2}$ is indeed the chi-square statistic (with $(r-1)(s_1-1)$ d.f.) of type (7.3) for testing the homogeneity hypothesis of type (7.1) within each level $j_2$ of the second factor.

(vi) No Interaction Between Two Factors. Suppose both factors have some effect on the response, in the sense that the probabilities $\pi_{i,j_1,j_2}$ do depend on the levels $j_1, j_2$ of the two factors. Then we are interested in exploring whether the factors act independently of each other. We say that there is no interaction between the two factors with respect to a particular feature of the response if the differential effects due to the levels $j_1$ and $s_1$ of factor 1 on that feature, within the level $j_2$ of factor 2, are uniform across the various $j_2$ levels. The differential-effect measure should be such that the formulation turns out to be symmetric with respect to the two factors. Two such measures are
\[
\delta^*_{i,j_1,j_2} = \pi_{i,j_1,j_2} - \pi_{i,s_1,j_2} - \pi_{r,j_1,j_2} + \pi_{r,s_1,j_2}
\quad \text{and} \quad
\delta_{i,j_1,j_2} = \frac{\pi_{i,j_1,j_2}\, \pi_{r,s_1,j_2}}{\pi_{r,j_1,j_2}\, \pi_{i,s_1,j_2}}.
\]
The corresponding formulations of the hypothesis of no interaction are
\[
H_0: \pi_{i,j_1,j_2} = \theta_{i,j_1,*} + \theta_{i,*,j_2} + \theta_{*,j_1,j_2}
\tag{7.17}
\]
and
\[
H_0: \pi_{i,j_1,j_2} = \theta_{i,j_1,*}\, \theta_{i,*,j_2}\, \theta_{*,j_1,j_2},
\tag{7.18}
\]
respectively. These are the hypotheses of no interaction of second order


among the response and the two factors in the additive and multiplicative senses, respectively. The multiplicative formulation (7.18) is equivalent to the specification
\[
H_0: \gamma^{(1,2)}_{i,j_1,j_2} = 0.
\tag{7.19}
\]
Note here that one could have considered a linear representation of the $\pi$'s, as in (7.10), with the same number of parameters satisfying the side constraints. With this linear representation, the additive formulation (7.17) would be specified by precisely the same statement, viz. (7.19). Although such a model might provide a good fit for some data, in general we prefer the log-linear specification or, equivalently, the multiplicative formulation.

For testing such a hypothesis of no interaction, viz. (7.19), with either the linear or the log-linear representation, direct m.l.e. are not available. However, Wald statistics can be computed, with $(r-1)(s_1-1)(s_2-1)$ d.f., for testing either (7.17) or (7.18) in its log-linear version, without requiring an iterative routine. In 7(x) an alternative model of no interaction between two factors is presented for the case of a binary response; in the model (7.27) the interaction is explored with respect to the effects of the factor levels on the logit.

Observe that the parameter of the highest order (viz. 2 in this case), $\gamma^{(1,2)}_{i,j_1,j_2}$, has an operative interpretation as an interaction effect, viz. $H_0$ (7.18) (or (7.19)) is equivalent to
\[
H_0: \delta_{i,j_1,j_2} = \delta_{i,j_1,s_2}, \qquad j_1 = 1, \ldots, s_1 - 1, \; j_2 = 1, \ldots, s_2 - 1.
\]
Thus, $\Delta_{i,j_1,j_2} \equiv \delta_{i,j_1,j_2}/\delta_{i,j_1,s_2}$ (or rather the $\log\Delta$'s) may be considered measures of the interaction effect of second order; $H_0$ holds if and only if all $\Delta$'s $= 1$. In fact, the $\Delta$'s are related to the $\gamma^{(1,2)}$'s and vice versa. However, such an operative interpretation is no longer available for the interaction effects of lower order, $\gamma^{(1)}_{i,j_1}$, $\gamma^{(2)}_{i,j_2}$ in (7.10), unless $H_0$ (7.19) holds. Hence a test of interaction of the first order between the response and factor 1, say, would have no satisfactory interpretation unless the no-interaction model of the second order provides a good fit. The difference statistics (in Section 5) provide the appropriate criteria, with d.f. $= (r-1)(s_1-1)s_2 - (r-1)(s_1-1)(s_2-1) = (r-1)(s_1-1)$, for testing $\gamma^{(1)}_{i,j_1} = 0$, assuming
\[
\gamma^{(1,2)}_{i,j_1,j_2} = 0.
\]

If the response categories are ordered and $a_i$ is the score assigned to the $i$-th response category, then models similar to those in (iv)-(vi) with respect to the mean score $\sum_i a_i \pi_{i,j_1,j_2}$ would be of interest. In case the effects do exist, one might want to explore further goodness-of-fit tests of regression models, especially when one or both of the factors have ordered categories with assigned scores, say $\{b_{j_1}\}$ and $\{c_{j_2}\}$ respectively. Some of these


models are presented below.

(vii) No Effect of Two Factors on Mean Score.
\[
H_0: \sum_i a_i \pi_{i,j_1,j_2} = \theta.
\tag{7.20}
\]

(viii) No Effect of Factor One on Mean Score.
\[
H_0: \sum_i a_i \pi_{i,j_1,j_2} = \theta_{*,j_2}.
\tag{7.21}
\]

(ix) No Interaction between Two Factor Effects on Mean Score.
\[
H_0: \sum_i a_i \pi_{i,j_1,j_2} = \theta_{j_1,*} + \theta_{*,j_2}.
\tag{7.22}
\]

WLS statistics with d.f. $= s_1 s_2 - 1$, $(s_1 - 1)s_2$ and $(s_1 - 1)(s_2 - 1)$, respectively, can be obtained in a straightforward manner. The reader is referred to [5] for details concerning these and also some regression models of the types
\[
H_0: \sum_i a_i \pi_{i,j_1,j_2} = \theta_0 + \theta_{j_2} b_{j_1}
\quad \text{or} \quad
H_0: \sum_i a_i \pi_{i,j_1,j_2} = \theta_0 + \theta_1 b_{j_1} + \theta_2 c_{j_2},
\tag{7.23}
\]
etc. The models (7.20)-(7.22) are seen to be weaker versions of those in (7.12), (7.15) and (7.17) respectively, and comments similar to those in the case ($k = 1$, $l = 1$) apply here as well. The two formulations are essentially equivalent for the binary case (i.e., $r = 2$).

(x) In the binary case, the alternative approach in terms of logits (using linear logistic models as in the case $k = 1$, $l = 1$) seems attractive. Let, then,
\[
\lambda_{j_1,j_2} = \log \frac{\pi_{1,j_1,j_2}}{\pi_{2,j_1,j_2}}.
\tag{7.24}
\]
We can obtain the WLS statistics in a straightforward manner for the linear logistic models (see [2], [12])
\[
H_0: \lambda_{j_1,j_2} = \theta,
\tag{7.25}
\]
\[
H_0: \lambda_{j_1,j_2} = \theta_{*,j_2},
\tag{7.26}
\]
\[
H_0: \lambda_{j_1,j_2} = \theta_{j_1,*} + \theta_{*,j_2},
\tag{7.27}
\]
with d.f. $= (s_1 s_2 - 1)$, $(s_1 - 1)s_2$ and $(s_1 - 1)(s_2 - 1)$ respectively. If (7.26) does not fit, one could explore the goodness of fit of regression models, as in (7.23), viz.
\[
\lambda_{j_1,j_2} = \theta_0 + \theta_{j_2} b_{j_1},
\tag{7.28}
\]
or
\[
\lambda_{j_1,j_2} = \theta_0 + \theta_1 b_{j_1} + \theta_2 c_{j_2}.
\tag{7.29}
\]


General case: k = 1, l arbitrary

The discussion for the previous cases $l = 1$ and $2$ can be extended to the case $l \ge 3$. Interactions of the third order (or higher) among the response and three (or more) factors can be defined, as in the previous cases, by considering either the linear or, preferably, the log-linear representation. Models of various types, postulating that some of these interactions of higher orders vanish, can be tested for goodness of fit by the statistics discussed in Sections 4 and 5. For the binary case, the linear logistic models appear quite attractive to consider.

8. MANOVA models for several populations

Here we consider models concerning the effects of different levels of one or more factors on two or more responses.

k = 2, l = 1

We begin with a three-dimensional table $\{n_{i_1,i_2,j}\}$ with probabilities $\pi_{i_1,i_2,j}$ satisfying the design constraints $\sum_{i_1} \sum_{i_2} \pi_{i_1,i_2,j} = 1$ for $j = 1, \ldots, s$. The corresponding log-linear representation is
\[
\log \pi_{i_1,i_2,j} = \mu + \alpha^{(1)}_{i_1} + \alpha^{(2)}_{i_2} + \alpha^{(1,2)}_{i_1,i_2} + \beta_j + \gamma^{(1)}_{i_1,j} + \gamma^{(2)}_{i_2,j} + \gamma^{(1,2)}_{i_1,i_2,j}.
\tag{8.1}
\]

(i) Independence of Two Responses for Each Factor Category is specified by the hypothesis
\[
H_0: \pi_{i_1,i_2,j} = \pi_{i_1,0,j}\, \pi_{0,i_2,j}.
\tag{8.2}
\]
The corresponding log-linear representation is
\[
H_0: \alpha^{(1,2)}_{i_1,i_2} = \gamma^{(1,2)}_{i_1,i_2,j} = 0.
\tag{8.3}
\]
The m.l.e. are directly available as $\hat\pi_{i_1,i_2,j} = n_{i_1,0,j}\, n_{0,i_2,j} / n_j^2$. Using these, one gets the Pearson chi-square statistic
\[
X^2 = \sum_j \sum_{i_1} \sum_{i_2} \frac{\left( n_{i_1,i_2,j} - n_{i_1,0,j}\, n_{0,i_2,j}/n_j \right)^2}{n_{i_1,0,j}\, n_{0,i_2,j}/n_j},
\tag{8.4}
\]
with d.f. $= (r_1 - 1)(r_2 - 1)s$. The overall chi-square statistic can be partitioned


into independent components $X_j^2$, of type (6.2), each with $(r_1 - 1)(r_2 - 1)$ d.f., as $X^2 = \sum_{j=1}^{s} X_j^2$.

(ii) Hypothesis of No Interaction of Second Order among Two Responses and One Factor. The more interesting model is whether the pattern of association between the two responses (within each category $j$ of the factor) is uniform across the levels $j$. With a measure of association, say $\Delta_{i_1,i_2,j}$ (or rather $\log \Delta_{i_1,i_2,j}$), of the type (6.3) within factor category $j$, the hypothesis of no interaction of second order is formulated as
\[
H_0: \Delta_{i_1,i_2,j} = \Delta_{i_1,i_2,s}, \qquad j = 1, \ldots, s - 1.
\tag{8.5}
\]
This specification (with $\Delta$'s defined by (6.3)) is indeed equivalent to
\[
H_0: \gamma^{(1,2)}_{i_1,i_2,j} = 0.
\tag{8.6}
\]

The m.l.e. are not directly available now. However, the Wald statistic (d.f. $= (r_1-1)(r_2-1)(s-1)$) and the WLS technique are convenient to use (see, e.g., Goodman [16]).

One might wonder whether a specification of the type (8.6), but where the $\gamma$'s are defined by a linear representation of type (8.1) for $\pi_{i_1,i_2,j}$ rather than for $\log\pi$, is a reasonable alternative formulation of the hypothesis of no interaction of second order. Such a formulation would correspond to the specification $\Delta^*_{i_1,i_2,j} = \Delta^*_{i_1,i_2,s}$, $j = 1, \ldots, s-1$, where now $\Delta^*_{i_1,i_2,j}$ is defined as an association measure in the additive sense (rather than the multiplicative sense as for $\Delta_{i_1,i_2,j}$) by
\[
\Delta^*_{i_1,i_2} = \pi_{i_1,i_2} - \pi_{r_1,i_2} - \pi_{i_1,r_2} + \pi_{r_1,r_2},
\]
in analogy with the definition (6.3) for $\Delta_{i_1,i_2}$. However, the $\Delta^*$'s do not appear to be satisfactory measures of association, since their vanishing does not necessarily imply independence of the two responses. For this reason, the linear representation does not appear to lead to a satisfactory definition of interaction effects of second order (in the additive sense), except for the formal resemblance to the classical linear model format.

A model specifying the absence of lower-order interactions, say $\gamma^{(1)}_{i_1,j} = 0$, in (8.1) has a meaningful interpretation only if the model (8.6) provides a good fit. Hence the goodness of fit of such a model (with $\gamma^{(1)}_{i_1,j} = 0$) can be tested by using the difference statistic (see Section 5) with d.f. $= (r_1-1)r_2(s-1) - (r_1-1)(r_2-1)(s-1) = (r_1-1)(s-1)$, assuming the model (8.6). Alternatively, one could ignore response two altogether and work with a 2-dimensional contingency table. But then we have to bear in mind that the new $\gamma^{(1)}_{i_1,j}$ parameters here are not the same as the $\gamma^{(1)}_{i_1,j}$ in (8.1).

(iii) Linearity of Regression of Association Measures on Factor Score. If the previous models (8.2) and (8.6) (i.e., (8.5)) do not fit, one might want to


explore the nature of the dependence of the association pattern on the factor category. If $b_j$ is the score assigned to the $j$-th level, regression models of $\Delta_{i_1,i_2,j}$ (or better, of the $\log\Delta$'s) on $b_j$ are worth exploring. In the case of two binary responses (i.e., $r_1 = r_2 = 2$), the hypothesis of linearity of regression is
\[
H_0: \log \Delta_j = \theta_1 + \theta_2 b_j,
\tag{8.7}
\]
with $\Delta_j$ defined as in (6.4). The WLS statistic with $s - 2$ d.f. is produced very easily for testing $H_0$ given by (8.7).

The models (i)-(iii) above mainly concern the effect of the factor level on the association pattern between the two responses. We could also consider models along the traditional MANOVA approach, dealing with the effect of the factor on each response jointly or separately.
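Since (8.4) is just the independence criterion (6.2) summed over the factor categories, its computation may be sketched as follows (illustrative Python with hypothetical strata):

```python
# Statistic (8.4): the independence criterion computed within each factor
# level j and summed, on (r1 - 1)(r2 - 1)s d.f. in all.
import numpy as np
from scipy.stats import chi2

def stratified_independence_x2(tables):
    """tables: list of r1 x r2 count arrays, one per factor category j."""
    x2 = 0.0
    for t in tables:
        t = np.asarray(t, dtype=float)
        e = np.outer(t.sum(axis=1), t.sum(axis=0)) / t.sum()
        x2 += np.sum((t - e) ** 2 / e)              # component X_j^2
    r1, r2 = np.asarray(tables[0]).shape
    df = (r1 - 1) * (r2 - 1) * len(tables)
    return x2, df, chi2.sf(x2, df)

print(stratified_independence_x2([[[20, 10], [10, 20]],
                                  [[15, 15], [12, 18]]]))
```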
O o ". qTil,i2,j ~-- Oil, i2,

(8.8)

or equivalently by
H o : lilj
,(1) __ ,v~2).__
--

i t 2 d -- ri~,i2,j -~- O.

,(1,2)

(8.9)

The m.l.e. are
\[
\hat\pi_{i_1,i_2,j} = \frac{n_{i_1,i_2,0}}{n};
\tag{8.10}
\]
using these we get the Pearson chi-square statistic
\[
X^2 = \sum_j \sum_{i_1} \sum_{i_2} \frac{\left( n_{i_1,i_2,j} - n_j\, n_{i_1,i_2,0}/n \right)^2}{n_j\, n_{i_1,i_2,0}/n},
\tag{8.11}
\]
with $(r_1 r_2 - 1)(s - 1)$ d.f. Observe that this case is merely a special case of (7.1), with $i$ there written as a double subscript $(i_1, i_2)$.

with (rlr 2 -- l ) ( s - 1) d.f. Observe that this case is merely a special case of (7.1) with i there written as a double subscript (il,i2). (v) No Factor Effect on Two Responses Separately. A model which is weaker than (8.8) concerns the effect of the factor category only on the marginal probabilities, viz.

Ho:

~ri" = Oil*'
qTO, i2,j ~ O.i 2,

(8.12)


Now, this has no convenient log-linear representation. If one or both of the responses have ordered categories with assigned scores $\{a_{i_1}\}$, $\{b_{i_2}\}$, then one could test weaker forms of (8.12), relating to the effect of the factor level only on the mean scores of the responses, viz.
\[
H_0: \sum_{i_1} a_{i_1} \pi_{i_1,0,j} = \theta_1, \qquad \pi_{0,i_2,j} = \theta_{*,i_2},
\tag{8.13}
\]
\[
H_0: \sum_{i_1} a_{i_1} \pi_{i_1,0,j} = \theta_1, \qquad \sum_{i_2} b_{i_2} \pi_{0,i_2,j} = \theta_2,
\tag{8.14}
\]
for goodness of fit. The WLS technique is especially convenient for testing the hypothesis of type (8.14), and even (8.13), while for testing (8.12) the Wald statistic would be more convenient. The reader is referred to [6] for details.

(vi) Regression Models with Ordered Responses and Factor. If the models in (v) do not provide a good fit, regression models could be explored if the factor also happens to be ordered, with the system of scores $\{c_j\}$. A weaker analog of (8.14), e.g., would be
\[
H_0: \sum_{i_1} a_{i_1} \pi_{i_1,0,j} = \theta_1 + \theta_3 c_j, \qquad \sum_{i_2} b_{i_2} \pi_{0,i_2,j} = \theta_2 + \theta_4 c_j.
\tag{8.15}
\]
The WLS technique would produce a large-sample chi-square criterion with $2(s - 2)$ d.f.
General k, l = 1

The methods in (i)-(vi) can now be extended in a straightforward manner to the case of an arbitrary number $k$ of responses with one factor. For example, corresponding to (ii), we consider a suitable measure of interaction of order $k - 1$ among the $k$ responses, say $\Delta_{i_1,\ldots,i_k,j}$, within the $j$-th level of the factor. The formulation, similar to (8.5), of the hypothesis of no interaction of order $k$ would be
\[
H_0: \Delta_{i_1,\ldots,i_k,j} = \Delta_{i_1,\ldots,i_k,s}, \qquad j = 1, \ldots, s - 1.
\tag{8.16}
\]
Such multiplicative measures can be based on similar measures of lower order (see [8] for details), and then the formulation (8.16) is seen to be equivalent to the formulation of the type (8.6) in terms of the log-linear representation.


k = 2, l = 2

Here we have a four-dimensional table $\{n_{i_1,i_2,j_1,j_2}\}$ with the design constraints $\sum_{i_1} \sum_{i_2} \pi_{i_1,i_2,j_1,j_2} = 1$ for all $(j_1, j_2)$. The corresponding log-linear representation is
\[
\log \pi_{i_1,i_2,j_1,j_2} = \mu + \alpha^{(1)}_{i_1} + \alpha^{(2)}_{i_2} + \alpha^{(1,2)}_{i_1,i_2} + \beta^{(1)}_{j_1} + \beta^{(2)}_{j_2} + \beta^{(1,2)}_{j_1,j_2} + \sum_{a=1}^{2} \sum_{b=1}^{2} \gamma^{(a;b)}_{i_a,j_b} + \sum_{b=1}^{2} \gamma^{(1,2;b)}_{i_1,i_2,j_b} + \sum_{a=1}^{2} \gamma^{(a;1,2)}_{i_a,j_1,j_2} + \gamma^{(1,2;1,2)}_{i_1,i_2,j_1,j_2}.
\tag{8.17}
\]

With more than one factor, there is now a larger variety of models of the MANOVA type that can be explored than in (iv)-(vi). Some of these have convenient representations in terms of the parameters in (8.17). For example, consider the following models.

(vii) No Effect of Factor One on Two Responses is formulated as
\[
H_0: \pi_{i_1,i_2,j_1,j_2} = \theta_{i_1,i_2,*,j_2}.
\tag{8.18}
\]
This is equivalent to the formulation that requires all the $\gamma$-terms involving $j_1$ in (8.17) to vanish. Actually, $H_0$ given by (8.18) is precisely 7(v), regarding $i$ as a double subscript $(i_1, i_2)$. Hence the Pearson chi-square statistic (7.16), with d.f. $= (r_1 r_2 - 1)(s_1 - 1)s_2$, follows.

(viii) No Effect of Factor One on Two Responses Separately. Here we consider a weaker version of (8.18) in the spirit of MANOVA, viz.
\[
H_0: \pi_{i_1,0,j_1,j_2} = \theta_{i_1,*,*,j_2}, \qquad \pi_{0,i_2,j_1,j_2} = \theta_{*,i_2,*,j_2}.
\tag{8.19}
\]

Such a hypothesis, dealing with the marginal probabilities for the two responses separately, as in (v), has no convenient formulation in terms of the parameters in (8.17). The m.l.e. are not easy to derive for testing (8.19); however, the Wald statistic can be obtained by the method in Section 4.

(ix) No Interaction of Third Order among Two Responses and Two Factors. Let $\Delta_{i_1,i_2,j_1,j_2}$ be a suitable measure of association between the two responses, of the type defined by (6.3), within the factor combination $(j_1, j_2)$. We say that there is no interaction of third order if the differential effect of level $j_1$, relative to level $s_1$, of factor 1, for a given level $j_2$ of factor 2, is uniform across the levels $j_2$. If this differential effect is measured in the


multiplicative sense, as in 7(vi), then we get the hypothesis of no interaction of third order as
\[
H_0: \Delta_{i_1,i_2,j_1,j_2} = \theta_{i_1,i_2,j_1,*}\, \theta_{i_1,i_2,*,j_2}.
\tag{8.20}
\]
With the $\Delta$'s defined as in (6.3), this is indeed equivalent to the specification
\[
H_0: \gamma^{(1,2;1,2)}_{i_1,i_2,j_1,j_2} = 0.
\tag{8.21}
\]
A formulation of the type (8.21) with the parameters in the linear representation of the $\pi$'s, rather than that of the $\log\pi$'s as in (8.17), does not seem to be satisfactory, for reasons similar to those pointed out in 8(ii).

For the parameters (in (8.17)) representing second-order interaction among 2 responses and 1 factor, there is no suitable interpretation unless $H_0$ given by (8.21) is acceptable. Hence, the model $\gamma^{(1,2;1)}_{i_1,i_2,j_1} = 0$ can be meaningful only if (8.21) holds; a difference statistic (see Section 5) with d.f. $= (r_1-1)(r_2-1)(s_1-1)s_2 - (r_1-1)(r_2-1)(s_1-1)(s_2-1) = (r_1-1)(r_2-1)(s_1-1)$ can be obtained. Similarly, as in 8(ii), a test for $\gamma^{(1;1,2)}_{i_1,j_1,j_2} = 0$ can be meaningful only if (8.21) holds. Then a difference statistic with d.f. $= (r_1-1)(s_1-1)(s_2-1)$ can be obtained, provided the statistic for testing (8.21) is insignificant. Alternatively, one could ignore response 2 altogether and work with the 3-dimensional table $\{n_{i_1,0,j_1,j_2}\}$, but now the parameters that we are testing for are not the same as the corresponding parameters in the 4-dimensional representation (8.17).

General k, general l

The discussion in the previous case $k = 2$, $l = 2$ can now be extended to the general case of $(k+l)$-dimensional tables with $k$ responses and $l$ factors. The details become increasingly complicated. The log-linear representation can be used for the definition of the interaction of order $k + l - 1$ among the $k$ responses and $l$ factors, and an operative interpretation can be provided for the highest-order interaction term. However, as pointed out in the earlier, simpler cases, such an operational interpretation is no longer available for interaction effects of lower orders unless the higher-order interaction terms vanish. Hence, the simpler models can be explored for goodness of fit in a meaningful way only in a hierarchical manner, each simpler (and stronger) model nested within the previous weaker model. The difference statistics, as discussed in Section 5, can be used for judging the goodness of fit at each stage.


9. Computation
As pointed out in several applications in Sections 6-8, the m.l.e. can be directly calculated only in relatively simple problems. For models dealing with the log-linear representation in multi-dimensional (cross-classified) contingency tables, conditions under which the direct m.l.e. are available are given in [10]. For the case where the direct m.l.e. (or m.d.i.e.) are not available, an iterative proportional fitting algorithm, producing the m.l.e. (or m.d.i.e.) after a sufficient number of iterative cycles, has been suggested. The reader is referred to [10] and [15] for further details.

These two techniques first produce the estimates under the model being fitted and then use these to compute the statistic that judges the goodness of fit. This approach is also followed by the WLS technique discussed in Sections 3-5. However, the latter technique is more convenient to use from the computational point of view when the models being fitted are linear in $\theta$ (i.e., of the type (4.15)); one needs to solve only linear equations, and no iterative routine is required. In contrast with these, the Wald statistic tests the goodness of fit of a model specified by constraint equations directly, without computing the estimates of $\pi$ under the model being fitted. This approach is convenient when a number of models are being considered in a search procedure for the 'best' model. Only once such a model is located, on the basis of the observed values of the goodness-of-fit statistics and their d.f., does the need for the fitted parameters arise. These estimates can be obtained either by the WLS approach for the equivalent freedom-equations specification (see (4.15)-(4.17)) or by finding the m.m.c.e. as in 3(iii).

In Sections 4 and 5 we have already referred to the relationship between the Wald and the WLS statistics. The WLS technique is more convenient to use when the number $m$ of independent parameters $\theta$ in the specification (4.15) is relatively small compared to $u' = u - m$, which is the number of independent constraints in the equivalent specification (4.16). On the other hand, if $u'$ is relatively small compared to $m$, the Wald-statistic computation is more convenient. For further details concerning the computation of the $W$ and $S^2$ statistics in the linear and log-linear cases, and combinations thereof, see [18].

In Section 3 we have also referred to the possible singularity of the matrix to be inverted in computing the Wald and WLS statistics with raw data if some $n_{i,j}$ happen to be zero. There it was suggested that one could use the adjusted proportion vector $p^*$, defined by (3.19), rather than the raw vector $p$, especially in the case where some frequencies are zero. Some suggestions for choosing the constants $a$'s in $p^*$ are the following:


(i) Choose $a_{i,j} = \tfrac12$ if $n_{i,j} = 0$, and $a_{i,j} = 0$ if $n_{i,j} > 0$ (see [18]).
(ii) Choose $a_{i,j} = 1/n_j$ if $n_{i,j} = 0$, and $a_{i,j} = 0$ if $n_{i,j} > 0$.
(iii) Choose $a_{i,j} = 1/2$ for all $(i, j)$.

Suggestion (ii) is made in the spirit of (i), but with the intention that the bias introduced is only minimal compared to (i). The choice (iii) is especially appropriate in the case of a logit, i.e., where
\[
f_j(\pi) = \log \frac{\pi_{1,j}}{\pi_{2,j}},
\]
with $r = 2$ (i.e., a binary response). See Cox [12] for a discussion of this case. There are reasons to believe that (iii) is appropriate also in the more general case where the $f$'s are linear functions of the $\log\pi$'s.
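The iterative proportional fitting algorithm mentioned at the beginning of this section may be sketched, for the no-second-order-interaction model (6.22), as follows (an illustrative Python sketch, not the algorithm of [10] verbatim; the table is hypothetical):

```python
# Iterative proportional fitting for an r1 x r2 x r3 table under the model
# (6.22): the fitted table is made to match each of the three observed
# two-way margins in turn, cycling until convergence.
import numpy as np

def ipf_no_three_way(counts, iters=50):
    counts = np.asarray(counts, dtype=float)
    fit = np.ones_like(counts) * counts.mean()      # any positive start
    for _ in range(iters):
        for axis in range(3):                       # match each 2-way margin
            target = counts.sum(axis=axis)
            current = fit.sum(axis=axis)
            ratio = np.where(current > 0, target / current, 1.0)
            fit *= np.expand_dims(ratio, axis=axis)
    return fit                                      # fitted expected counts

table = np.arange(1, 9, dtype=float).reshape(2, 2, 2)
print(ipf_no_three_way(table))
```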

10. Exact tests

Although the goodness-of-fit tests discussed in Sections 4 and 5 are asymptotic in nature and, as such, require a large sample size $n$ for their validity, in practice they are used more often than the alternative exact procedures which are theoretically available in some cases. The exact procedures become intractable, at least from the computational point of view, very quickly, even with moderately large $n$. Also, the asymptotic procedures give a reasonably close approximation for even moderately large sample sizes. Thus, in most practical problems, the large-sample procedures appear to serve the need for at least moderately large sample sizes. However, exact procedures need to be explored, whenever feasible, especially with small samples.

As an illustration, consider the case of a two-dimensional contingency table $\{n_{i,j}\}$ and the hypothesis of homogeneity of the $s$ populations from which the data are obtained. It can be shown that if this hypothesis, $H_0$ given by (7.1), is true, then the conditional probability of the observed table, given the marginals $\{n_{i,0}\}$, is

\[
\prod_{j=1}^{s} \binom{n_j}{n_{1,j}, \ldots, n_{r,j}} \bigg/ \binom{n}{n_{1,0}, \ldots, n_{r,0}} = \frac{\prod_i n_{i,0}! \, \prod_j n_j!}{n! \, \prod_j \prod_i n_{i,j}!}.
\tag{10.1}
\]
Such a computation is needed for all tables with the given sample sizes {nj} and marginals {ni,0}. Then the critical level (or the P-value) attained by the given data is the sum of the probability (10.1) of the given table and


such probabilities of other equally or less likely tables (i.e., with probabilities of type (10.1) $\le$ the probability (10.1) for the given table). $H_0$ is rejected if this P-value is sufficiently small, say $\le \alpha$. This computation gets out of hand even for small $r$ and $s$ with moderately large $n_j$. For the special case $r = s = 2$, the exact probability (10.1) reduces to
\[
\binom{n_1}{n_{1,1}} \binom{n_2}{n_{1,2}} \bigg/ \binom{n}{n_{1,0}},
\tag{10.2}
\]

which is the probability for the hypergeometric distribution. The exact procedure is then known as the Fisher-Irwin test.

One comes up with essentially the same exact procedure, based on (10.1), for testing the hypothesis of independence, $H_0$ given by (6.1), for the contingency table $\{n_{i_1,i_2}\}$ with two responses. It can be shown that if $H_0$ is true, then the conditional probability of the observed table, given the marginals $\{n_{i_1,0}\}$ and $\{n_{0,i_2}\}$, is given by (10.1) with a minor modification in the notation (i.e., $(i_1, i_2)$ for $(i, j)$, $n_{0,i_2}$ for $n_j$, etc.).

Another illustration is for testing the hypothesis of symmetry in the $2 \times 2$ table $\{n_{i_1,i_2}\}$ with two responses. Under $H_0: \pi_{1,2} = \pi_{2,1}$ (in 6(ii)) (or, equivalently, $\pi_{1,0} = \pi_{0,1}$ in 6(iii)), the conditional distribution of $N_{1,2}$, given $n_{1,1} + n_{2,2}$, is binomial $(n_{1,2} + n_{2,1}, \tfrac12)$. Hence the appropriate one-sided or two-sided binomial test can be used. Indeed, the McNemar statistic (6.8) is seen to be the large-sample version of the two-sided binomial test.

The essential element in this technique of exact-test construction is to consider the conditional distribution of the data given the ancillary statistics which, loosely speaking, are the sufficient statistics corresponding to the nuisance parameters in the model. Since the probability model (1.1) belongs to the multi-parametric exponential family, such conditioning on the ancillaries produces a completely specified distribution under the null hypothesis. Thus the exact test can be carried out as illustrated above.
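For the $2 \times 2$ case, the exact computation based on (10.2) may be sketched as follows (illustrative Python using the hypergeometric distribution; the table is hypothetical, and ties among (10.1)-type probabilities are handled with a small tolerance):

```python
# Fisher-Irwin test: hypergeometric probabilities of all 2 x 2 tables with
# the observed margins, and the P-value as the total probability of tables
# no more likely than the one observed.
from scipy.stats import hypergeom

def fisher_irwin(n11, n12, n21, n22):
    row1, col1 = n11 + n12, n11 + n21
    n = n11 + n12 + n21 + n22
    dist = hypergeom(n, row1, col1)        # distribution of N_{11} given margins
    p_obs = dist.pmf(n11)
    support = range(max(0, row1 + col1 - n), min(row1, col1) + 1)
    return sum(dist.pmf(k) for k in support if dist.pmf(k) <= p_obs + 1e-12)

print(fisher_irwin(8, 2, 1, 5))
```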

11. Conditional tests

In the previous section we have noted the use of conditioning techniques to eliminate all the nuisance parameters in constructing exact tests. Here we consider a similar use to eliminate some nuisance parameters in constructing large-sample tests for hypotheses of interest. We introduce the basic idea with a simple illustration.


Suppose that in a random sample of size $n$, two characteristics are measured for each individual, giving thus a contingency table $\{n_{i_1,i_2}\}$ corresponding to probabilities $\pi_{i_1,i_2}$ satisfying the design constraint $\sum_{i_1} \sum_{i_2} \pi_{i_1,i_2} = 1$. Assume that the first response is of primary interest, and the effect on it of the second, concomitant characteristic is to be studied. As illustrations, we consider the goodness of fit of the following models:
\[
\text{(i)} \quad M_1: \sum_{i_1} a_{i_1} \pi^*_{i_1,i_2} = \theta_1 + \theta_2 b_{i_2};
\qquad
\text{(ii)} \quad M_2: \sum_{i_1} a_{i_1} \pi^*_{i_1,i_2} = \theta_1;
\qquad
\text{(iii)} \quad M_3: \pi^*_{i_1,i_2} = \theta_{i_1,*}.
\]

Here $\pi^*_{i_1,i_2} \equiv \pi_{i_1,i_2}/\pi_{0,i_2}$ is the conditional probability of category $i_1$ for the first response, given the category $i_2$ of the second response, and $\{a_{i_1}\}$, $\{b_{i_2}\}$ are the scores assigned to the levels of the two responses. Notice that $M_1$ is in the spirit of the regression model (7.7), while $M_2$, $M_3$ are as (7.4) and (7.1) respectively. The probability law of the observed data, $n$, is

\[
\frac{n!}{\prod_{i_1} \prod_{i_2} n_{i_1,i_2}!} \prod_{i_1} \prod_{i_2} \pi_{i_1,i_2}^{\, n_{i_1,i_2}}.
\tag{11.1}
\]

Now the $r_1 r_2 - 1$ independent parameters $\pi_{i_1,i_2}$ determine (and are determined by) the new set of parameters $\{\pi^*_{i_1,i_2}\}$ and $\{\pi_{0,i_2}\}$; this new set contains precisely the same number of independent parameters, $(r_1 - 1)r_2 + (r_2 - 1) = r_1 r_2 - 1$, in view of the relations $\sum_{i_1} \pi^*_{i_1,i_2} = 1$, $\sum_{i_2} \pi_{0,i_2} = 1$. Moreover, $\{n_{0,i_2}\}$ is the set of ancillary statistics corresponding to the set of nuisance parameters $\{\pi_{0,i_2}\}$. Conditioning on this set of ancillaries gives the probability model

\[
\prod_{i_2} \left[ \frac{n_{0,i_2}!}{\prod_{i_1} n_{i_1,i_2}!} \prod_{i_1} \left( \pi^*_{i_1,i_2} \right)^{n_{i_1,i_2}} \right],
\tag{11.2}
\]
which is the model (except for a different notation) of a 1-response, 1-factor contingency table, as in 7(i), (iv), (vii), etc. Under the conditioning principle we use the methods in Section 7 for testing the goodness of fit of the models $M_1$-$M_3$. In other words, although the


second classification characteristic is a response under our initial design constraints, it is now being viewed as a factor under the conditioning principle when the methods in Section 7 are applied to the models $M_1$-$M_3$. These conditional goodness-of-fit tests continue to be valid under the initial probability model (11.1). For, if a certain statistic has a limiting $\chi^2(r_2 - 1)$ distribution when $M_2$ holds, given $\{n_{0,i_2}\}$, i.e. under (11.2), it would have the same limiting distribution also unconditionally under (11.1). However, the power properties would be affected.

In this context one can interpret naturally the correspondence between two or more models, under different sampling schemes, which lead to the same statistic for testing goodness of fit. We have already noted above such a correspondence between (6.1) and (7.1), leading to the Pearson chi-square statistics (6.2) and (7.3) of identical form (and, similarly, for the exact tests in Section 10). A similar phenomenon may be noted with respect to the models (6.18), (7.15) and (8.2) and the Pearson chi-square statistics (6.21), (7.16) and (8.4) of identical form and d.f.

12. Remarks

Although all the discussion so far is in terms of the multinomial or product-multinomial distribution (1.1) for the categorical data, much of it continues to be valid under somewhat different models. One such important case deals with independent Poisson random variables $N_{i,j}$ with means $\lambda_{i,j}$. If the problem is such that the marginals, say $\{N_{0,j}\}$, form one set of ancillary statistics, then conditioning on the observed values $n_{0,j}$ reduces the probability model to the product-multinomial model (1.1), with $n_j = n_{0,j}$ and $\pi_{i,j} = \lambda_{i,j}/\lambda_{0,j}$ in the notation of Section 1. Then the methods under the model (1.1) would be applicable here under the conditioning technique, as in Section 11. Suppose, as an illustration, we want to explore whether the column classification ($j$) has any interaction with the row classification ($i$) in the multiplicative sense (see 7(vi)). The log-linear representation
\[
\log \lambda_{i,j} = \mu + \alpha_i + \beta_j + \gamma_{i,j},
\tag{12.1}
\]
with the $\alpha$'s, $\beta$'s and $\gamma$'s satisfying side constraints as in Section 2, is a saturated 'model', expressing the $rs$ $\lambda$'s (all independent) in terms of $1 + (r-1) + (s-1) + (r-1)(s-1) = rs$ independent new parameters. The hypothesis that we need to test is $H_0: \gamma_{i,j} = 0$. $\{N_{0,j}\}$ is one set of ancillaries (corresponding to $\mu$ and the $\beta$'s), and conditioning on the $n_{0,j}$'s reduces the hypothesis $H_0$ to the equivalent form $\pi_{i,j} = \theta_{i,*}$; thus the problem is reduced to the test of homogeneity 7(i) under the product-multinomial model.


Alternatively, conditioning on the ancillary $N_{0,0}$ (corresponding to $\mu$) would reduce the problem to the test of independence 6(i) under the multinomial model, while such a conditioning on all the ancillaries $\{N_{i,0}\}$, $\{N_{0,j}\}$ would lead directly to the exact test based on (10.1). More generally, the methods continue to be applicable under log-linear models for independent Poisson variables in a multiway classification; see [10] for further details.

Another point that may be made concerning the methods in Sections 3-5 is their asymptotic equivalence so far as efficiency is concerned (up to the first order). This is already implicit in the term RBAN that has been used in Section 3 for the estimates. A similar result holds also for the test criteria in Sections 4 and 5 (see, e.g., Neyman [23]). Indeed, considering a Pitman sequence of alternatives converging to the null hypothesis at a suitable rate, it can be shown (see, e.g., [22]) that these criteria have the same limiting noncentral chi-square distribution, with the noncentrality parameter depending only on the sequence of alternatives.

There is some work within the Bayesian framework, where the unknown parameters are allowed to have some a priori distributions. With uniform prior distributions, the Bayesian argument based on posterior distributions leads to the familiar Pearson chi-square and modified chi-square statistics, at least in the simple one- and two-dimensional contingency tables (see, e.g., [21]). For some work concerning Bayesian estimates and pseudo-estimates, where the parameters in the prior distributions are replaced by data estimates, see [10].

Although some monographs (e.g., [12], [14], [24]) have come out recently on the present subject matter, a more comprehensive treatment of this subject had been lacking until the recent books by Haberman [19], Bishop, Fienberg and Holland [10], and Gokhale and Kullback [15].

References

[1] Aitchison, J. and Silvey, S. D. (1960). Maximum-likelihood estimation procedures and associated tests of significance. J. Roy. Statist. Soc. B 22, 154-171.
[2] Berkson, J. (1953). A statistically precise and relatively simple method of estimating the bio-assay with quantal response, based on the logistic function. J. Amer. Statist. Assoc. 48, 565-599.
[3] Bhapkar, V. P. (1961). Some tests for categorical data. Ann. Math. Statist. 32, 72-83.
[4] Bhapkar, V. P. (1966). A note on the equivalence of two criteria for hypotheses in categorical data. J. Amer. Statist. Assoc. 61, 228-235.
[5] Bhapkar, V. P. (1968). On the analysis of contingency tables with a quantitative response. Biometrics 24, 329-338.
[6] Bhapkar, V. P. (1970). Categorical data analogs of some multivariate tests. Essays in Probability and Statistics. University of North Carolina Press, Chapel Hill, 85-110.


[7] Bhapkar, V. P. (1973). On the comparison of proportions in matched samples. Sankhyā A 35, 341-356.
[8] Bhapkar, V. P. and Koch, G. G. (1968). Hypotheses of "no interaction" in multi-dimensional contingency tables. Technometrics 10, 107-123.
[9] Birch, M. W. (1964). A new proof of the Pearson-Fisher theorem. Ann. Math. Statist. 35, 817-824.
[10] Bishop, Y. M. M., Fienberg, S. E., and Holland, P. W. (1975). Discrete Multivariate Analysis. MIT Press, Cambridge.
[11] Cochran, W. G. (1950). The comparison of percentages in matched samples. Biometrika 37, 256-266.
[12] Cox, D. R. (1970). The Analysis of Binary Data. Methuen, London.
[13] Cramér, H. (1946). Mathematical Methods of Statistics. Princeton University Press, Princeton.
[14] Fleiss, J. L. (1973). Statistical Methods for Rates and Proportions. Wiley, New York.
[15] Gokhale, D. V. and Kullback, S. (1978). The Information in Contingency Tables. Marcel Dekker, New York.
[16] Goodman, L. A. (1964). Simple methods of analyzing three-factor interaction in contingency tables. J. Amer. Statist. Assoc. 59, 319-352.
[17] Goodman, L. A. and Kruskal, W. H. (1954). Measures of association for cross-classifications. J. Amer. Statist. Assoc. 49, 732-764.
[18] Grizzle, J. E., Starmer, C. F., and Koch, G. G. (1969). Analysis of categorical data by linear models. Biometrics 25, 489-504.
[19] Haberman, S. J. (1974). The Analysis of Frequency Data. University of Chicago Press, Chicago.
[20] Hoyt, C. J., Krishnaiah, P. R. and Torrance, E. P. (1959). Analysis of complex contingency data. J. Experimental Education 27, 187-194.
[21] Lindley, D. V. (1964). The Bayesian analysis of contingency tables. Ann. Math. Statist. 35, 1622-1643.
[22] Mitra, S. K. (1958). On the limiting power function of the frequency chi-square test. Ann. Math. Statist. 29, 1221-1233.
[23] Neyman, J. (1949). Contributions to the theory of the χ² test. Proc. Berkeley Symp. Math. Statist. Prob., University of California Press, Berkeley, 230-273.
[24] Plackett, R. L. (1974). The Analysis of Categorical Data. Hafner, New York.
[25] Rao, C. R. (1965). Linear Statistical Inference and its Applications. Wiley, New York.
[26] Roy, S. N. and Kastenbaum, M. A. (1957). On the hypothesis of no "interaction" in a multiway contingency table. Ann. Math. Statist. 27, 749-757.
[27] Roy, S. N. and Mitra, S. K. (1956). An introduction to some nonparametric generalizations of analysis of variance and multivariate analysis. Biometrika 43, 361-376.

P. R. Krishnaiah, ed., Handbook of Statistics, Vol. 1, North-Holland Publishing Company (1980) 389-406

12

Inference and the Structural Model for ANOVA and MANOVA


D.A.S. Fraser

The traditional statistical model is a class of distributions, the class of possible distributions for the response variable. For example, in a simple context, the model could be the class of normal distributions for a sample on the real line,

{(2πσ²)^{−n/2} exp{−Σ(yi − μ)²/2σ²}: μ ∈ R, σ ∈ R⁺};

this is a doubly infinite class with μ taking values on R and σ on R⁺. In any particular application, however, just one of the distributions in the class is the true distribution, the distribution that actually describes the response variable (to some reasonable approximation). In what sense is it appropriate or necessary then to model a single distribution by a class of distributions, doubly infinite say or even more complex?

The formation of a statistical model has been examined recently in the monograph Inference and Linear Models (Fraser, 1979). The model for an investigation describes the particular variables, performances, randomization, and other conditions determined by the investigation. It is not an idle or arbitrary construct but satisfies very specific requirements: descriptive, exhaustive, probabilistic.

(a) Descriptive. The components of the model correspond to objective components for the performances determined by the investigation. Thus the components of the model are real, not arbitrary.

(b) Exhaustive. There is a component in the model for each objective component for the performances determined by the investigation. Thus the descriptions in the model are full, not partial.

(c) Probabilistic. The use of probability in the model conforms to the requirements of probability theory. In a sense this is covered by (a), but the following two requirements concerning conditional probability need emphasis: Requirement (i). If there is an observed value of an objective variable with known objective probabilities, then the requirement is that all probability descriptions be conditional probabilities given the observed value (observed value on an objective probability space); Requirement (ii). If


there is information concerning an objective variable that takes the form of an observed value of a function of the variable, then marginal probability describes the observable function, and conditional probability describes the variable itself given the observed value (observed value of an objective function).

The detailed definition of a statistical model has implications for many statistical applications; in particular it has very specific implications for the basic analysis of a statistical model with observed data. The typical application where the definition becomes incisive involves background information that identifies the distribution form, identifies the form in an objective manner. For example, consider the normal(μ, σ) application mentioned earlier; the distribution form is normal and can be identified objectively due to closure properties of the location-scale presentations. For a detailed discussion see Section 1.2 in Fraser (1979).

The traditional normal(μ, σ) model is one that allows a sufficiency reduction to a 2-dimensional location-scale statistic. Standard analysis then gives reasonable tests and confidence regions. The same location-scale model but using, say, a Student(7) distribution for variation does not, however, lead unequivocally to any satisfactory inference reductions or any satisfactory tests and confidence regions. However, if the criteria (a), (b) and (c) are imposed on the statistical model, then the distribution form in the location-scale application is included as an objective component of the model. The analysis then leads unequivocally to a location-scale statistic and gives the appropriate tests and confidence regions. For the normal case these results are in agreement with the usual normal analysis. The general results are, however, available widely, and extend the familiar kind of results traditionally available for just the normal type cases.

As examples of applications amenable to the incisive models and the consequent inference results, we mention the following: location model, scale model, location-scale model, regression model, multivariate model, multivariate-regression model. In each case a specific distribution form for variation is possible or, more generally, a parametric class for the distribution form is possible. With more complex problems more detailed computer analyses are needed. The computer programs for nonnormal regression analyses are now available for the simpler cases, and are in the process of implementation more generally. For multivariate and multivariate regression analyses a variety of techniques give indications that the computer analyses can progressively be brought in hand.

In this chapter we examine models satisfying the three requirements (a), (b), (c) and covering the regression analysis context and the simple multivariate


regression context. The models are called structural models and the basic analysis is necessary analysis, analysis that follows from the model and data alone. For more detailed discussions and other applications see Fraser (1979). In conclusion we note that the methods here provide the definitive analysis for location-scale type models with any specific distribution form for variation, say Weibull or extreme value, or a numerically recorded distribution form. The only restriction is that more complicated integrations arise with more complicated location parameters. For examples, see Fraser (1979, Section 2.4).

1. ANOVA: the regression model

Consider a stable system with a response variable y and input variables x1, …, xr; some of the input variables may be combinations of other variables, thus allowing polynomial and interactive regression dependence. We suppose that background information has identified the distribution form common to independent performances of the system, or has identified the distribution form up to a shape parameter λ in a space Λ. Let fλ designate the density for the distribution form; we suppose that it has been standardized in some understandable and useful fashion; for example, the central 68.26% of the probability is in the interval (−1, +1). For convenience let z designate a variable for this standardized variation.

Now consider the response presentation in terms of the variable describing the variation. Let σ designate the scaling of the variation and let β1x1 + ⋯ + βrxr designate the general response level; we are assuming that the response location is linear in the input variables. We then obtain
y = β1x1 + ⋯ + βrxr + σz,    (1.1)

where z has the standardized objective distribution fλ(z) with λ in Λ. This is close to the familiar way of writing the regression model, although usually the combination σz is written as e and called error. In the present context, where we have acknowledged the objective nature of the variation, we are taking the formula to mean explicitly what it says. For repeated performances of the system under varied conditions we let

X = [ x11 ⋯ x1r
      ⋮       ⋮
      xn1 ⋯ xnr ]

designate the design matrix from the input variables x1, …, xr; we assume


that X has rank r < n. We also let β = (β1, …, βr)′ designate the linear location parameters, and y = (y1, …, yn)′ and z = (z1, …, zn)′ record the n values for the response and corresponding variation. The model then is y = Xβ + σz, where z has the distribution fλ(z) = ∏ fλ(zi) with λ in Λ. Let ℳ designate this model.
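As a concrete illustration of ℳ, the following sketch (with a hypothetical design, parameter values, and a rescaled Student(7) form standing in for fλ) generates one performance of the system according to (1.1).

```python
# A minimal sketch of the structural model y = X beta + sigma z; the design,
# beta, sigma and the Student(7) form for f_lambda are illustrative choices.
import numpy as np
from scipy.stats import t as student

rng = np.random.default_rng(0)

n = 20
X = np.column_stack([np.ones(n), np.linspace(-1, 1, n)])  # design matrix, r = 2
beta = np.array([5.0, 2.0])      # hypothetical location parameters
sigma = 1.5                      # hypothetical scaling of the variation

# Rescale Student(7) so the central 68.26% of probability lies in (-1, +1),
# in the spirit of the standardization described in the text.
c = student.ppf(0.8413, df=7)
z = student.rvs(df=7, size=n, random_state=rng) / c

y = X @ beta + sigma * z         # one performance of the system, eq. (1.1)
```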
1.1. Necessary reduction

Now consider the combination

ℐ = (ℳ, y⁰)    (1.2)

consisting of the model ℳ and the observed response vector y⁰. The combination is called the inference base for the particular investigation. We assume that y⁰ ∉ ℒ(X). Let z designate the variation vector corresponding to the observed response vector y⁰. We have then that
z = −σ⁻¹Xβ + σ⁻¹y⁰ ∈ {Xb + cy⁰: b ∈ Rʳ, c ∈ R⁺} = ℒ⁺(X; y⁰),

which is half of the linear space ℒ(X; y⁰), the half subtended by ℒ(X) and passing through the observed y⁰. We have thus determined from the observed y⁰ that the variation z is not arbitrary in Rⁿ but restricted to half of an (r+1)-dimensional subspace. Formally we can say that we have the observed value of the function ℒ⁺(X; z),

ℒ⁺(X; z) = ℒ⁺(X; y⁰),    (1.3)

but have no information otherwise concerning the location of z on the identified half space: no information other than that coming from the density functions describing the variation.
1.2. Notation

We have seen that a vector z should be examined in terms of which (r+1)-dimensional half space contains the vector and in terms of where the vector lies on that half space. Any choice of coordinates


would work as well as any other, but there is convenience in choosing familiar coordinates, for we are just choosing a way to present points that are already there, in a specified space Rⁿ. The vectors forming X are a natural choice of r of the r+1 vectors needed for ℒ(X; z). For the remaining vector to define ℒ⁺(X; z) let d(z) be the unit residual:

d(z) = s⁻¹(z)(z − Xb(z)),    (1.4)

where

b(z) = (X′X)⁻¹X′z,  s²(z) = z′(I − X(X′X)⁻¹X′)z.

Note that d(z) indexes the possible half spaces ℒ⁺(X; z) and that (b(z), s(z)) gives the coordinates of z on the half space ℒ⁺(X; z). The equation y = Xβ + σz can now be expressed in terms of the new coordinates:

d(y) = d(z),  b(y) = β + σb(z),  s(y) = σs(z).    (1.5)

For our model ℳ with data y⁰ we see that the value of the unit residual for variation is directly calculable or observable:

d(z) = d(y).    (1.6)

Also we see that there is no information concerning the location b(z), s(z) of the variation:

b(z) = σ⁻¹(b(y) − β),  s(z) = σ⁻¹s(y).    (1.7)
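A short computational check of this reduction (a sketch continuing the simulation above): d(y) reproduces d(z) exactly whatever β and σ are, while b(y) and s(y) carry the parameter information.

```python
# Compute the coordinates b(y), s(y) of (1.4)-(1.5) and the unit residual
# d(y), and verify numerically that d is unchanged by (beta, sigma).
import numpy as np

def coords(y, X):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)   # b(y) = (X'X)^{-1} X'y
    resid = y - X @ b
    s = np.linalg.norm(resid)                   # s(y), with s^2 as in (1.4)
    return b, s, resid / s                      # d(y), the unit residual

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(20), np.linspace(-1, 1, 20)])
z = rng.standard_t(7, size=20)                  # one realized variation vector

b1, s1, d1 = coords(X @ [5.0, 2.0] + 1.5 * z, X)
b2, s2, d2 = coords(X @ [0.0, -3.0] + 9.0 * z, X)
print(np.allclose(d1, d2))   # True: d carries no (beta, sigma) information
```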

1.3. The marginal and conditional distributions

In line with the requirement (ii) in (c) we now determine the marginal distribution for the observed half-space as given by d(z) = d(y) and the conditional distribution for points on the half-space as given by (b(z), s(z)) for the variation z or as given by (b(y), s(y)) for the response presentation y. The initial distribution for z and the induced distribution for the presentation y can be written

fλ(z) dz = σ^{−n} fλ(σ⁻¹(y − Xβ)) dy.    (1.8)


The change of variables

z → (b(z), s(z), d(z)),
y → (b(y), s(y), d(y))

is straightforward in terms of the local orthogonal Euclidean coordinates, with da for surface area on the unit sphere in ℒ⊥(X). For the variation z we obtain

fλ(Xb + sd) s^{n−r−1} |X′X|^{1/2} db ds da

with b(z) = b, s(z) = s, d(z) = d, and for the response y we obtain

σ^{−n} fλ(σ⁻¹(X(b − β) + sd)) s^{n−r−1} |X′X|^{1/2} db ds da

with b(y) = b, s(y) = s, d(y) = d. The marginal distribution for d = d(z) = d(y) is then obtained by (r+1)-dimensional integration:

hλ(d) da = ∫_{R⁺} ∫_{Rʳ} fλ(Xb + sd) s^{n−r−1} |X′X|^{1/2} db ds · da.    (1.9)

This may not be available in closed form but can be accessible by mixtures of quadrature and simulation methods. The conditional distribution for b(z) = b, s(z) = s given d is then

hλ⁻¹(d) fλ(Xb + sd) s^{n−r−1} |X′X|^{1/2} db ds,    (1.10)

and for b(y) = b, s(y) = s given d is

hλ⁻¹(d) σ^{−n} fλ(σ⁻¹(X(b − β) + sd)) s^{n−r−1} |X′X|^{1/2} db ds.    (1.11)

These conditional distributions for z and y on the half space ℒ⁺(X; z) = ℒ⁺(X; y) are expressed in terms of the familiar coordinates b, s but could equally be in terms of any other choice of coordinates. Any other choice of coordinates would be one-one equivalent and amount just to a relabelling of given points with a given distribution. The equations (1.5) or (1.7) giving the response presentation can be separated uniquely into a σ component and a β component:

σ⁻¹s(y) = s(z),
s⁻¹(y)(b(y) − β) = s⁻¹(z)b(z) = T(z).    (1.12)


The variable T(z) could be replaced by the more usual t-statistic, which would be more useful for numerical integration, but for recording formulas here that would introduce a variety of simple constants that unnecessarily complicate the formulas. The conditional distribution for s(z) is obtained by integrating (1.10):

gλ(s : d) ds = ∫_{Rʳ} hλ⁻¹(d) fλ(Xb + sd) |X′X|^{1/2} db · s^{n−r−1} ds.    (1.13)

The conditional distribution for T(z) is obtained by change of variable and integration:

gλ(T : d) dT = ∫₀^∞ hλ⁻¹(d) fλ(s(XT + d)) s^{n−1} ds · |X′X|^{1/2} dT.    (1.14)

This can be further integrated to obtain the t-statistics for single or for several location components of β.

1.4. Inference for λ
The distribution (1.9) for d describes the only observable variable from the objective distribution for the variation. Accordingly it is the only source of information concerning the parameter λ. Usually, however, the distribution of d on the unit sphere in ℒ⊥(X) is not in a sufficiently tractable form to permit the calculation of tests and confidence intervals; of course, an occasional exception exists, such as normal serial correlation (Fraser, 1979, Section 6.3). Thus available inference methods are largely restricted to the observed likelihood function for λ, which is available immediately by substituting d⁰ in (1.9); this can be assessed directly or in comparison with experience with possible likelihood functions as obtained from simulations. The observed likelihood function for λ is

L(λ : d⁰) = c hλ(d⁰),  λ ∈ Λ.    (1.15)

A plot of this function of λ can often produce quite sharp discrimination among possible λ values. For numerical examples see Fraser (1976a, b). In certain cases there may not be this preliminary inference step for λ. For example, the model may fully prescribe the distribution for the variation, say Student(6), or standard Weibull or standard extreme value; see Fraser (1979, Section 2.4). Or, for example, the model may prescribe the standard normal; this then gives the usual analysis and is the only case readily available by traditional models and methods.
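For a single input variable the likelihood (1.15) can be computed directly. The sketch below (r = 1, hypothetical data, a rescaled Student family indexed by its degrees of freedom standing in for λ) evaluates hλ(d⁰) in (1.9) by two-dimensional quadrature; for larger r the simulation methods mentioned after (1.9) would take over.

```python
# A sketch of the observed likelihood (1.15): L(lambda) is proportional to
# h_lambda(d0) of (1.9), computed here by quadrature over (b, s) for r = 1.
import numpy as np
from scipy.stats import t as student
from scipy.integrate import dblquad

def h_lambda(d0, x, lam):
    n = len(x)
    c = student.ppf(0.8413, df=lam)               # standardizing constant
    f = lambda u: c * student.pdf(c * u, df=lam)  # density of the rescaled form
    integrand = lambda s, b: np.prod(f(b * x + s * d0)) * s ** (n - 2)
    val, _ = dblquad(integrand, -np.inf, np.inf, 0.0, np.inf)
    return val * np.sqrt(x @ x)                   # |X'X|^{1/2} for r = 1

# Hypothetical inference base: x the single input vector, d0 the observed
# unit residual (computed as in the earlier sketch).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 10)
y0 = 2.0 * x + 0.5 * rng.standard_t(7, size=10)
res = y0 - (x @ y0) / (x @ x) * x
d0 = res / np.linalg.norm(res)

for lam in (3.0, 7.0, 30.0):
    print(lam, h_lambda(d0, x, lam))              # compare as a likelihood in lambda
```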


1.5. Inference for σ and β

We now consider inference concerning the parameters σ and β, assuming as given a value for the shape parameter λ. For the scale parameter σ we have the equation

σ⁻¹s(y) = s(z)    (1.16)

and the distribution (1.13) describing the unobservable s(z) = s for the variation. Consider a hypothesis σ = σ0. On the assumption that σ = σ0 we can calculate s(z):

s(z) = s(y)/σ0.    (1.17)

This observed value can be compared with the distribution (1.13) with d = d⁰ to see whether it is a reasonable high-density value, or a marginal value, or an almost impossible value far out on the tails of the distribution where the density is essentially zero. The hypothesis can then be assessed accordingly in the usual test-of-significance manner. A confidence interval for σ can be formed by first determining a 1 − α central interval (s1, s2) for s(z),
∫_{s1}^{s2} gλ(s : d) ds = 1 − α,    (1.18)

and then inverting the equation (1.16) to obtain the 1 − α confidence interval

(s(y)/s2, s(y)/s1).    (1.19)

The space for s is isomorphic to that for σ, and the choice for s1, s2 could reasonably be, say, equal-tailed or highest probability density. The usual and bothersome arbitrariness is eliminated by the necessary reduction to s. This is of course a conditional confidence interval, with (s1, s2) = (s1(d), s2(d)), but it is also a 1 − α marginal confidence interval. Shrinking these 1 − α central confidence intervals down gives a very natural median-type estimate for σ.

Now consider inference for the parameters β. The usual framework for this is the analysis of variance table; this can be adapted to the present more general (nonnormal variation) method of analysis.
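The interval (1.19) can likewise be sketched numerically, continuing the r = 1 illustration above (f is the standardized density, and d0 and s(y) are assumed computed from the data as in the earlier sketches): the conditional density (1.13) is tabulated by quadrature, the central interval of (1.18) is read off the normalized distribution function, and (1.16) is inverted.

```python
# A sketch of (1.13) and (1.18)-(1.19) for r = 1: tabulate the conditional
# density of s(z) given d, take a central 1 - alpha interval, invert for sigma.
import numpy as np
from scipy.integrate import quad

def g_unnormalized(s, d0, x, f, n):
    inner = lambda b: np.prod(f(b * x + s * d0))      # integrate out b
    val, _ = quad(inner, -np.inf, np.inf)
    return val * s ** (n - 2)

def sigma_interval(s_y, d0, x, f, alpha=0.05):
    grid = np.linspace(1e-3, 10.0, 400)               # grid of s(z) values
    dens = np.array([g_unnormalized(s, d0, x, f, len(x)) for s in grid])
    cdf = np.cumsum(dens)
    cdf /= cdf[-1]                                    # crude normalization
    s1 = np.interp(alpha / 2, cdf, grid)              # central interval, (1.18)
    s2 = np.interp(1 - alpha / 2, cdf, grid)
    return s_y / s2, s_y / s1                         # the interval (1.19)
```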


An analysis of variance table implicitly assumes a succession of orthogonal subspaces. Notationally, this can be handled most easily by replacing X by an equivalent orthonormal n × r matrix V so that
Xβ = Vα,  X = VE,    (1.20)

where E is a positive upper triangular matrix giving

Eβ = α,  β = E⁻¹α;  Eb = a,  b = E⁻¹a.

This pattern assumes that the parameter components β1, β2, …, βr have been ordered from the most obviously present parameter β1 to the least obviously present parameter βr. In accordance with this the parameters are usually tested sequentially: (1) Test βr = β_{r,0}, say 0; (2) If βr = β_{r,0}, then test β_{r−1} = β_{r−1,0}, say 0; and so on. The analysis of variance table is based on the null test values β_{r,0} = 0, β_{r−1,0} = 0, …. The orthonormal matrix V transforms the preceding into the equivalent succession: (1) Test αr = 0; (2) If αr = 0, then test α_{r−1} = 0; and so on. Now consider the modified analysis of variance table appropriate to inference given a specific λ value (say given some particular form of nonnormal variation).

Source     Dimension    Projection coefficient    Distribution
v1         1            a1(y)                     α1 + σa1(z)
v2         1            a2(y)                     α2 + σa2(z)
…          …            …                         …
vr         1            ar(y)                     αr + σar(z)
ℒ⊥(X)      n − r        s(y)                      σs(z)

An ordinary analysis of variance table is organized in terms of squared lengths in component subspaces. With nonnormal variation the relevant distributions are no longer necessarily symmetric, and accordingly the signed projection coefficients are needed for the analysis. First we consider the test for the hypothesis βr = 0 or equivalently αr = 0. This, with a minor qualification, will provide the pattern for the subsequent steps with the analysis of variance table. Under the hypothesis αr = 0 we can calculate the value of
Tr(z) = ar(z)/s(z) = ar(y)/s(y)


and compare it with the distribution for Tr(z) obtained from (1.14) (but using V in place of X) to see whether it is a reasonable, high-density value, or a questionable value, or an impossible value out on the tail of the distribution where the density is essentially zero. The hypothesis would then be assessed accordingly.

A confidence interval for αr can be formed by finding a central 1 − α interval (T1, T2) for the distribution of Tr obtained from (1.14) (but using V in place of X). In the usual inversion pattern we would then obtain the 1 − α conditional confidence interval for αr:

(ar(y) − T2 s(y), ar(y) − T1 s(y)),

where typically T1 and T2 depend on d(y).

Now consider subsequent tests. Suppose we test α_{r−1} = 0 given αr = 0. If the information αr = 0 is fully used, then there would be what is ordinarily called pooling-of-the-error-variance, pooling that could inflate the error variance if αr is in fact different from zero. The familiar procedure is to be safe and not pool the error variance; in effect this amounts to testing α_{r−1} = 0 without formally assuming αr = 0. The nonnormal analysis case, however, is more complicated than one of just pooling-of-error-variance, and it seems appropriate in the nonnormal analysis to test in the safe manner we have just been describing: the lack of significance for αr gives grounds for testing α_{r−1}, but the test is performed in a safe manner that does not assume αr = 0 for the analysis. The preceding gives the pattern of testing αr, α_{r−1}, α_{r−2}, … as long as the testing is appropriate.

The computer integrations for handling the preceding can be large, but a variety of procedures give indications that the integrations will progressively become more manageable. For a numerical example see Fraser (1979, Section 6.4).

The non-pooling sequence of tests in this section is presented as having certain practical preferences. From certain theoretical points of view, however, an exact sequence of conditional tests would seem to be the proper and appropriate procedure. For details on such a "pooling" type procedure see Fraser and MacKay (1975). Arguments can be given for each side; the present development adheres to the more familiar non-pooling patterns of analysis.
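The orthonormal reduction and the statistic Tr are straightforward to compute; a sketch (illustrative, using a thin QR factorization for the Gram-Schmidt step in (1.20)):

```python
# A sketch of (1.20): X = V E with V orthonormal and E positive upper
# triangular; the signed projection coefficients a_i(y) = v_i'y fill the
# table above, and T_r = a_r(y)/s(y) is referred to its conditional
# distribution (1.14) with V in place of X.
import numpy as np

def projection_table(y, X):
    Q, R = np.linalg.qr(X)              # thin QR: X = Q R
    signs = np.sign(np.diag(R))
    V = Q * signs                       # flip columns so E = diag(signs) R is positive
    a = V.T @ y                         # projection coefficients a_1(y), ..., a_r(y)
    s = np.linalg.norm(y - V @ a)       # residual length s(y)
    return a, s

# Usage with the y and X of the earlier sketches:
#   a, s = projection_table(y, X);  T_r = a[-1] / s
```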

2. MANOVA: multivariate regression model

Consider a stable system with a p-variate response y = (y1, …, yp)′ and with an r-variate input x = (x1, …, xr)′. For n independent performances of


the system let

Y = (y1, …, yn) = [ y11 ⋯ y1n
                    ⋮        ⋮
                    yp1 ⋯ ypn ],

X = (x1, …, xn) = [ x11 ⋯ x1n
                    ⋮        ⋮
                    xr1 ⋯ xrn ];


note, now, that our sample sequence forms a row, not a column, vector. We suppose that the general response level depends linearly on the input variables:

Location(Y) = ℬX,    (2.1)

where the regression coefficients are

ℬ = [ β11 ⋯ β1r
      ⋮        ⋮
      βp1 ⋯ βpr ] = (β1, …, βr).    (2.2)

For this we assume the rank of X is r < n. For notation we suppose, as in the preceding section, that the variables have been ordered from the most obviously influential x1 to the least obviously influential xr. In this form, when null tests are being considered, it is equivalent to replace X by an orthonormal matrix V and to let 𝒜 designate the corresponding regression coefficients:

Location(Y) = 𝒜V,    (2.3)

where

𝒜 = (α1, …, αr).    (2.4)

Now suppose that the background information identifies the underlying distribution form for the response. This can occur in a variety of patterns and depths. For example, the response variables could follow a natural sequence and the identified error form be related to the order in the sequence; this is examined in Fraser (1979, Sections 8.2 and 12.1). Or, in a more general manner, the distribution form for the variation could be identified up to a linear distortion or positive linear transformation. In this section we examine this type of identification for the distribution form. Let fλ(z) be the density function for the objective variation, with a possible shape parameter λ taking values in a space Λ. As before we suppose that the density fλ(z) has been suitably standardized both within


and between coordinates. Then for the compound response we have

fλ(Z) = ∏ᵢ₌₁ⁿ fλ(zi).

The error in the observable response is a linear distortion of the variation:

y = ℬx + Γz,  Y = ℬX + ΓZ,

where the response presentation involves the positive linear transformation matrix

Γ = [ γ11 ⋯ γ1p
      ⋮        ⋮
      γp1 ⋯ γpp ],  |Γ| > 0;

such transformations form a closed class under composition and inversion. The model we now examine is given by

Y = ℬX + ΓZ,    (2.5)

where the objective variation variable Z has the distribution

fλ(Z) = ∏ᵢ₌₁ⁿ fλ(zi),    (2.6)

with fλ known or with λ in the index set Λ. Let ℳ designate this model.

2.1. Necessary reduction


Now consider the inference base

ℐ = (ℳ, Y⁰)    (2.7)

consisting of the model ℳ and the observed response matrix Y⁰. For this let Z⁰ designate the corresponding realized matrix for the variation. Now consider the information available concerning the realized matrix Z⁰. For notation we let Y1, …, Yp designate the row vectors in Y, and similarly for X and Z. We can then write

Z⁰ = −Γ⁻¹ℬX + Γ⁻¹Y⁰ ∈ ℒ⁺(X1, …, Xr; Y1⁰, …, Yp⁰),    (2.8)

where the final expression designates the (r+p)-dimensional subspace


ℒ(X1, …, Xr; Y1⁰, …, Yp⁰) together with an orientation called positive, the orientation of the r+p vectors as recorded in sequence. The expression (2.8) is taken as meaning that the vectors of Z complete X in such a way that they fall in the subspace represented by the right side and have the positive orientation mentioned above. As a basis for the subspace ℒ⁺(X1, …, Xr; Z1, …, Zp) we take of course the row vectors of X as the first r basis vectors and then take p further vectors completing the span with the positive orientation and, for notational convenience, satisfying the property of being orthonormal and orthogonal to X; the choice could be made by projecting successive axes into the subspace and successively orthonormalizing. It is important to emphasize that the choice must depend only on the subspace and be the same for, say, Z and Y having ℒ⁺(X; Z) = ℒ⁺(X; Y). As in Section 1.2 we can note that any choice of coordinates on the subspaces would work as well as any other, but the present regression-type coordinates have advantages for computational familiarity. For further details see Fraser (1979, Sections 8.4.2 and 13.3.2). We can then write

Z = B(Z)X + C(Z)D(Z),    (2.9)

where B(Z) and C(Z) (|C(Z)| > 0) are the regression coefficients on the basis vectors and D(Z) has as rows the orthonormal vectors just described that complete the basis initiated by the rows of X. In the pattern of Sections 1.1 and 1.2 we then have the following results. The equation Y = ℬX + ΓZ can be reexpressed in terms of the new coordinates:

D(Y) = D(Z)    (2.10)

and

B(Y) = ℬ + ΓB(Z),  C(Y) = ΓC(Z).    (2.11)

For our model ℳ with data Y⁰ we see that the value of the orthonormal residual for variation is directly calculable or observable:

D(Z) = D(Y⁰).    (2.12)

Also we can see fairly easily that there is no information concerning the location B(Z), C(Z) for the variation:

B(Z) = Γ⁻¹(B(Y) − ℬ),  C(Z) = Γ⁻¹C(Y).    (2.13)
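The coordinates in (2.9) and the observable D of (2.12) can be computed along the following lines (a sketch; it takes Γ to range over the positive lower triangular matrices, so that the Cholesky-based choice of C is the equivariant one):

```python
# A sketch of the decomposition Z = B(Z)X + C(Z)D(Z) of (2.9), applied to an
# observed Y (p x n) with design X (r x n): B by regression of the rows of Y
# on the rows of X, C positive lower triangular from a Cholesky factor, and
# D with orthonormal rows orthogonal to X; (2.12) then gives D(Z) = D(Y0).
import numpy as np

def manova_coords(Y, X):
    B = Y @ X.T @ np.linalg.inv(X @ X.T)   # B(Y): p x r regression coefficients
    R = Y - B @ X                          # residual rows, orthogonal to X
    C = np.linalg.cholesky(R @ R.T)        # C(Y): positive lower triangular
    D = np.linalg.solve(C, R)              # D(Y): rows orthonormal (D D' = I)
    return B, C, D
```

One can check that D computed from ℬX + ΓZ coincides with D computed from Z whenever Γ is positive lower triangular, which is the invariance recorded in (2.10) and (2.12).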


2.2. The marginal and conditional distributions


Following the pattern presented in Section 1 we now determine the marginal distribution for the observable as given by D(Z) = D(Y) and the conditional distribution for the unobservable as given by the coordinates B(Z), C(Z).
The initial distribution for Z and the induced distribution for the presentation Y can be written

fλ(Z) dZ = |Γ|^{−n} fλ(Γ⁻¹(Y − ℬX)) dY.    (2.14)

The change of variables

Z → (B(Z), C(Z), D(Z))

can be managed routinely; see for example Fraser (1979, Section 12.3.3):

fλ(Z) dZ = fλ(BX + CD) |C|^{n−p−r} |XX′|^{p/2} dB dC dD,    (2.15)

where dB and dC are the obvious Euclidean volumes and dD is Euclidean volume calculated orthogonal to the manifold in R^{pn} generated by the location-scale transformations; note this special definition for the volume measure dD. The marginal distribution for D is available by p(r+p)-dimensional integration:

hλ(D) dD = ∫_{|C|>0} ∫_{R^{pr}} fλ(BX + CD) |C|^{n−p−r} |XX′|^{p/2} dB dC · dD.    (2.16)

The conditional distribution for B(Z) = B and C(Z) = C given D is then

hλ⁻¹(D) fλ(BX + CD) |C|^{n−p−r} |XX′|^{p/2} dB dC.    (2.17)

The corresponding response distribution for B(Y) = B and C(Y) = C given D is

hλ⁻¹(D) fλ(Γ⁻¹((B − ℬ)X + CD)) |C|^{n−p−r} |Γ|^{−n} |XX′|^{p/2} dB dC.    (2.18)

The equations (2.11) or (2.13) giving the response presentation can be separated uniquely into a Γ component and a ℬ component:

Γ⁻¹C(Y) = C(Z)    (2.19)

and

C⁻¹(Y)(B(Y) − ℬ) = C⁻¹(Z)B(Z) = H(Z).    (2.20)

The variable H(Z) is of multivariate t-statistic form but omits some constants that would leave the formulas less than tidy. We will find it convenient to write

H(Z) = (h1(Z), …, hr(Z)).    (2.21)

The conditional distribution for C(Z) = C is obtained by integration from (2.17):

∫_{R^{pr}} hλ⁻¹(D) fλ(BX + CD) |XX′|^{p/2} dB · |C|^{n−p−r} dC.    (2.22)

The conditional distribution for H(Z) = H is obtained by change of variable and then integration:

∫_{|C|>0} hλ⁻¹(D) fλ(C(HX + D)) |C|^{n−p} dC · |XX′|^{p/2} dH.    (2.23)

This can be further integrated to obtain the distribution for particular columns in H or for groups of columns in H.

2.3. Inference for λ

The distribution (2.16) for D describes the only observable variable from the objective distribution for variation; it is thus the only source of information concerning the parameter λ. As in the simple regression case it seems unlikely that tests and confidence regions can be formed easily. Thus available inference methods seem restricted to the likelihood function for λ; it is available immediately by substituting D⁰ in (2.16), as in Section 1.4.

2.4. Inference for Γ and ℬ

Now consider inference concerning the parameters Γ and ℬ, assuming of course a specified value for the shape parameter λ. For the scale parameter Γ we have the equation

Γ⁻¹C(Y) = C(Z),    (2.24)

and the distribution (2.22) describing the unobservable C(Z) = C for the variation.


Consider a hypothesis Γ = Γ0. On the assumption that Γ = Γ0 we can calculate C(Z):

C(Z) = Γ0⁻¹C(Y⁰).    (2.25)

This observed value can be compared with the distribution (2.22) to see if it conforms or otherwise to the hypothesis. A confidence region can be formed by first determining a 1 − α central region K for C(Z) from the distribution (2.22) for C(Z), and then inverting to obtain the 1 − α confidence region

C(Y⁰)K⁻¹    (2.26)

from the observed Y⁰; we use K⁻¹ to designate the set of inverses of matrices in the set K. The space for C is isomorphic to that for Γ, and the choice for K could reasonably be based on, say, highest probability density or some rectilinearity with respect to component coordinates. The usual and bothersome arbitrariness is eliminated by the necessary reduction to C. A central-type estimate can be obtained as

Γ̂ = C(Y⁰)Ĉ⁻¹,    (2.27)

where Ĉ is, say, the maximum density point for the distribution (2.22). Now consider inference for the parameters in ℬ. In the usual analysis of variance style we assume that the columns are ordered from β1, the most obviously influential, to βr, the least obviously influential,

ℬ = (β1, …, βr),  B = (b1, …, br).    (2.28)

Then with respect to an appropriate orthonormal basis V we have the corresponding sequence from α1, the most obviously influential, to αr, the least obviously influential,

𝒜 = (α1, …, αr).    (2.29)

The α's in the sequence 𝒜 refer specifically to corresponding subspaces generated by the rows in V. For the correspondence with the ℬ we have

ℬX = 𝒜V,  X = EV,


where E is a positive lower triangular matrix giving

𝒜 = ℬE,  ℬ = 𝒜E⁻¹;  A = BE,  B = AE⁻¹.    (2.30)

Now consider the typical analysis of variance test sequence: (1) Test αr = 0; (2) If αr = 0, then test α_{r−1} = 0; and so on. The analysis of variance table appropriate to inference given a specific λ value (say in the nonnormal case) takes the following form, involving actual projection coefficients rather than squared lengths and inner products.

Source     Dimension    Projection coefficients    Distribution
v1         1            a1(Y)                      α1 + Γa1(Z)
v2         1            a2(Y)                      α2 + Γa2(Z)
…          …            …                          …
vr         1            ar(Y)                      αr + Γar(Z)
ℒ⊥(X)      n − r        C(Y)                       ΓC(Z)

First we consider the test for the hypothesis βr = 0 or equivalently αr = 0. This provides the basic pattern for subsequent tests. Under the hypothesis αr = 0 we can calculate the value of
hr(Z) = C⁻¹(Z)ar(Z) = C⁻¹(Y)ar(Y)

and compare it with the distribution for hr(Z) obtained by integrating out r − 1 columns in the distribution of H(Z) in (2.23) (of course using V in place of X). The hypothesis would then be assessed appropriately based on whether the observed value was reasonable, marginal, or essentially impossible. A confidence region for αr can be formed by finding a central 1 − α region K from the distribution of hr derived from (2.23) (but using V in place of X). We would then obtain the following 1 − α confidence region
for αr:

ar(Y) − C(Y)K.

Now consider subsequent tests. Suppose we test α_{r−1} = 0 given αr = 0. If the information αr = 0 is fully used, then we would be working with a full model having p fewer parameters and be in correspondence with the method of pooling-of-error-variance; this gives no protection if αr is


somewhat different from zero. The familiar safe procedure is to avoid the pooling pattern; accordingly we consider the test α_{r−1} = 0 without formally assuming αr = 0. The assumption αr = 0 is needed, however, to relate a hypothesis α_{r−1} = 0 to the "corresponding" hypothesis β_{r−1} = 0. The test for, say, α_{r−1} = 0 has then exactly the same pattern as for αr = 0 but with just the obvious change of subscripts. The same holds for the confidence regions.

References

Fraser, D. A. S. (1979). Inference and Linear Models. McGraw-Hill International, Düsseldorf.
Fraser, D. A. S. (1976a). Probability and Statistics: Theory and Applications. Duxbury, North Scituate, MA.
Fraser, D. A. S. (1976b). Necessary analysis and adaptive inference. J. Amer. Statist. Assoc. 71.
Fraser, D. A. S. and MacKay, J. (1975). Parameter factorization and inference based on significance, likelihood and objective posterior. Ann. Statist. 3.

P. R. Krishnaiah, ed., Handbook of Statistics, Vol. 1, North-Holland Publishing Company (1980) 407-441

13

Inference Based on Conditionally Specified ANOVA Models Incorporating Preliminary Testing


T. A. Bancroft and Chien-Pai Han

1. Introduction and definitions

1.1. Introduction

The application of statistical theory to the analysis of specific data obtained from an investigation usually begins with the specification of a mathematical model containing unknown population parameters. Given such a model specification, statistical theory provides methods for obtaining estimates of, and/or tests of meaningful hypotheses concerning, the unknown population parameters by using the specific data as a sample of observations from a population containing such unknown parameters. The specification of a particular mathematical model may be either: (1) unconditionally specified or (2) conditionally specified. In assuming an (1) unconditionally specified model the investigator depends upon either theoretical knowledge from his particular substantive field of investigation or his or his colleagues' previous experience with similar data to provide a 'reasonable' model specification, using the currently collected sample data to make statistical inferences (estimation and tests of hypotheses) regarding the unknown population parameters. On the other hand, in assuming a (2) conditionally specified model the investigator uses the currently collected sample data in making preliminary tests to assist in validating a 'reasonable' model, in addition to making inferences regarding all remaining unknown population parameters in the determined model specification. For both model specifications, the inferences made regarding the unknown parameters actually select a particular subclass of the particular 'reasonable' model specification. In the case of (1) unconditionally specified models, such a selection is made in one step; while for (2) conditionally specified models, the final selection is made after one or more steps in accordance with the number of preliminary tests.


The use of a (2) conditionally specified model would seem appropriate in situations in which the investigator is uncertain as to: (1) the inclusion or not of one or more parameters, or the modification or not of one or more side assumptions, in a particular 'reasonable' model given a single sample; and (2) given two or more samples, whether all of them may be from the same 'reasonable' model. The term 'reasonable' in referring to a model is used to point out the unlikelihood of any mathematical model exactly representing real data from any investigation.

The authors believe that a clearer understanding of the concepts and procedures involving inferences based on conditionally specified ANOVA models would result from the discussion of a special case of the use of preliminary tests as a means of resolving uncertainty (1) above. It is well known that ANOVA models can be interpreted as special cases of the general linear model

Y = Xβ + e,  e ∼ N(0, σ²I).    (1)

For example a 'reasonable' model for an investigation involving a hierarchical classification with a classes and b subclasses and c observations in each subclass might be initially specified as

yijk = μ + αi + βij + eijk,  eijk ∼ NID(0, σ²).    (2)

If the usual ANOVA procedures are used to obtain the tests of H01: βij = 0 and H02: αi = 0, then the inferences made are based on an unconditionally specified model. On the other hand, if the investigator, from a priori information, should suspect but is not certain that βij = 0, he may decide to base his inferences concerning the αi on a conditionally specified model, incorporating a preliminary test of the hypothesis that βij = 0 by using the n = abc observations from the investigation. Of course, subsequent inferences concerning the αi, assumed to be the inferences of primary importance, would need to take into account the stepwise nature of this latter inference procedure.

Multiple regression and polynomial regression are two other types of statistical methods which also may make use of the general linear model (1) and of inferences based on conditionally specified ANOVA models incorporating preliminary testing. In particular, such a model and inference procedures would be appropriate when the investigator is uncertain whether or not to include a suspected subset of regressors in the multiple regression; or, in the case of polynomial regression, is uncertain as to the degree of polynomial to fit to available data.


For a more general discussion and examples involving other uses of inference based on conditionally specified models incorporating preliminary testing, other than those involving ANOVA, see Bancroft (1972) and Bancroft and Han (1977).

In the authors' opinion, the model specification and particular inference theory used by the investigator should be determined by circumstances peculiar to each research investigation. In particular, account should be taken of information available on such matters as: the nature of the particular problem under consideration; the nature and extent of pertinent data that is, or can be, made available, e.g. data from a designed experiment involving randomization and replication, or observational data; the amount of previous experience the investigator and/or colleagues have had with similar data; pertinent substantive field theory available for specification of a 'reasonable' model; the necessity or not of arriving at an immediate inference to be used in an action program; and cost considerations, including both cost per observation and cost of making incorrect inferences. Since such matters would be expected to change from one investigation to another, it follows that the investigator should be aware of alternative choices for model specification and inference procedures. Conditionally specified models incorporating preliminary tests, with their accompanying inference procedures, are one such alternative.

1.2. Definition of a conditionally specified model

We will confine our definitions to the special case given in equation (2) and assume that the experimenter has decided that inference based on conditional specification incorporating a preliminary test is appropriate. The definitions given below for this case may be easily extended to the more complicated ANOVA models to be discussed in later sections. It is well known that ANOVA models may be characterized as: (i) random, (ii) fixed, and (iii) mixed. For our purposes here, let us consider equation (2) as a random model and suppose that the main objective of the investigation is to make inferences concerning the αi. We re-write equation (2) as

yijk = μ + αi + βij + eijk,  eijk ∼ NID(0, σ²),    (3)

where i = 1, 2, …, a, j = 1, 2, …, b, k = 1, 2, …, c, αi ∼ NID(0, σα²) and βij ∼ NID(0, σβ²), and all the random variables are mutually independent. The structure of the ANOVA table for this model is given in Table 1.


Table 1
ANOVA table for model in equation (3)

Source of variation    Degrees of freedom    Mean square    Expected mean square
A                      n3 = a − 1            V3             σ² + cσβ² + bcσα²
B within A             n2 = a(b − 1)         V2             σ² + cσβ²
C within B             n1 = ab(c − 1)        V1             σ²
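For later reference, a short sketch (hypothetical balanced data held in an array of shape (a, b, c)) of how the mean squares of Table 1 would be computed:

```python
# A minimal sketch of the mean squares V3, V2, V1 of Table 1 for a balanced
# nested layout y[i, j, k], i = 1..a classes, j = 1..b subclasses, k = 1..c.
import numpy as np

def nested_anova(y):
    a, b, c = y.shape
    m = y.mean()                              # grand mean
    mi = y.mean(axis=(1, 2))                  # class means
    mij = y.mean(axis=2)                      # subclass means
    V3 = b * c * ((mi - m) ** 2).sum() / (a - 1)
    V2 = c * ((mij - mi[:, None]) ** 2).sum() / (a * (b - 1))
    V1 = ((y - mij[:, :, None]) ** 2).sum() / (a * b * (c - 1))
    return V3, V2, V1
```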

Taking into account the 'Expected mean square' column in Table 1 and assuming an unconditionally specified ANOVA model, the usual test of H02: αi = 0, or equivalently σα² = 0, would be accomplished by comparing the calculated F0 = V3/V2 with a tabular F(α3; n3, n2), where the notation F(α3; n3, n2) denotes the upper 100α3 percentage point of the central F distribution with (n3, n2) degrees of freedom and α3 is the pre-chosen significance level. However, the investigator, because of a priori information as noted earlier, may decide to assume the conditionally specified ANOVA model:

yijk = μ + αi + βij + eijk    if σβ² ≠ 0,
yijk = μ + αi + eijk          if σβ² = 0.    (4)

In such a case, research workers and applied statisticians have often used the preliminary test F_PT = V2/V1, compared with F(α1; n2, n1), of H_PT: σβ² = 0 to determine whether to use V3/V2 or V3/V, where V = (n1V1 + n2V2)/(n1 + n2), to test the main hypothesis of interest, i.e. H02: αi = 0 or σα² = 0. In the past, such stepwise testing procedures, including a choice of an appropriate value of α1, have been justified on intuitive grounds without benefit of any theoretical investigations as regards the effect of the preliminary test on the probability level of the subsequent test of main interest. For the conditional model (4), a preliminary test has also often been used for similar reasons if the inference of main interest was to estimate σα² rather than to test H02: σα² = 0.

Since it was obvious that inferences (tests of hypotheses and estimation), using the same data, made subsequent to preliminary tests are conditional on the outcome of the preliminary testing, it became necessary to develop a sound theory for inferences based on conditionally specified models incorporating preliminary testing in general, including ANOVA models. To be of use in practice, such a theory should lead to recommendations for the significance levels of the preliminary tests based on an acceptable criterion as regards the final inference. Also, as an aid to the investigator in deciding whether to use a conditionally specified model rather than an unconditionally specified model, a means of comparing the two resulting inference procedures should be made available. For the nested model, a comparison of the test H02: σα² = 0 assuming the conditionally specified model (4) with that of the same test using the unconditional model (2)


could be made by comparing the power of the two tests for the same fixed sizes of the two tests.
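Such a power comparison is straightforward to sketch by Monte Carlo. In the illustration below (all settings hypothetical; the preliminary level α1 = 0.25 merely stands in for a value recommended by a special study) the rejection rate of the conditionally specified, sometimes-pool test defined formally in (5) below is estimated; replacing the branching by the unconditional V3/V2 comparison gives the never-pool rate for the same data.

```python
# A Monte Carlo sketch of the rejection rate of the sometimes-pool test for
# the nested random model (3); nested_anova is the sketch following Table 1.
import numpy as np
from scipy.stats import f as fdist

def sometimes_pool_reject(rng, a, b, c, var_alpha, var_beta,
                          a1=0.25, alpha=0.05):
    y = (rng.normal(0.0, np.sqrt(var_alpha), (a, 1, 1))     # alpha_i
         + rng.normal(0.0, np.sqrt(var_beta), (a, b, 1))    # beta_ij
         + rng.normal(0.0, 1.0, (a, b, c)))                 # e_ijk
    V3, V2, V1 = nested_anova(y)
    n3, n2, n1 = a - 1, a * (b - 1), a * b * (c - 1)
    if V2 / V1 >= fdist.ppf(1 - a1, n2, n1):     # preliminary test rejects
        return V3 / V2 >= fdist.ppf(1 - alpha, n3, n2)
    V = (n1 * V1 + n2 * V2) / (n1 + n2)          # pool the two error terms
    return V3 / V >= fdist.ppf(1 - alpha, n3, n1 + n2)

rng = np.random.default_rng(2)
rate = np.mean([sometimes_pool_reject(rng, 4, 2, 3, 1.0, 0.0)
                for _ in range(2000)])
print(f"estimated power: {rate:.3f}")
```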

1.3. Definition of inference for conditionally specified model


Confining ourselves to the special case, given in equation (4), of a conditionally specified model, let us consider first a precise formulation of a relevant inference problem. Mathematically, testing the main hypothesis H02: σα² = 0 at pre-chosen α3 and α2 significance levels may be stated as: Reject H02: σα² = 0 if either
{V2/V1 ≥ F(α1; n2, n1) and V3/V2 ≥ F(α2; n3, n2)}

or

{V2/V1 < F(α1; n2, n1) and V3/V ≥ F(α3; n3, n1 + n2)},    (5)

where

V = (n1V1 + n2V2)/(n1 + n2).


The value F(α1; n2, n1) is the recommended upper 100α1 percentage point of the F distribution with (n2, n1) degrees of freedom (df), obtained from a special study made of this particular ANOVA model as described in the first paragraph of Section 1.2 above. Since the final test of main interest, H02: σα² = 0, would logically be made at the same pre-chosen significance level, we set α2 = α3. Conventional levels of α2 = α3 = α, say, are, of course, either 0.05 or 0.01. Also F(α; n3, n) is the upper 100α% point of the F distribution with either (n3, n2) or (n3, n1 + n2) degrees of freedom as determined by the outcome of the preliminary test. For the conditionally specified ANOVA model given by equation (4), should the problem be to estimate σα² rather than to test H02: σα² = 0, then a point estimation procedure would be mathematically defined as follows:

if V2/V1 ≥ F(α1; n2, n1), use (V3 − V2)/bc to estimate σα²;
if V2/V1 < F(α1; n2, n1), use (V3 − V)/bc to estimate σα²,    (6)

where similar definitions hold for all symbols involved here as for the test of H02: σα² = 0 given above.
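The estimation rule (6) is immediate to apply once the Table 1 mean squares are in hand; a sketch (the function name and the preliminary level α1 are illustrative, and the EMS coefficient bc is taken from Table 1):

```python
# A sketch of the testimation rule (6) for sigma_alpha^2 in the nested model,
# using the Table 1 degrees of freedom and the EMS coefficient bc.
from scipy.stats import f as fdist

def estimate_var_alpha(V3, V2, V1, a, b, c, a1):
    n2, n1 = a * (b - 1), a * b * (c - 1)
    if V2 / V1 >= fdist.ppf(1 - a1, n2, n1):    # preliminary test rejects: keep V2
        return (V3 - V2) / (b * c)
    V = (n1 * V1 + n2 * V2) / (n1 + n2)         # pool V1 and V2
    return (V3 - V) / (b * c)
```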

1.4. Extensions and a proposed symbolic designation


In Sections 1.2 and 1.3 above, expository and mathematical definitions were given for inferences based on conditionally specified ANOVA models incorporating preliminary testing, using a simple random nested or hierarchical classification model. As implied earlier these definitions may be


extended easily to the more complicated ANOVA classification models to be discussed in later sections. In particular, such extensions would also include fixed and mixed ANOVA models in addition to the simple random model given, and more than one preliminary test, as implied by the word testing or test(s). Also, analogous definitions may be constructed for inferences based on conditionally specified multiple regression and polynomial regression models incorporating preliminary testing. Such regression problems will be considered in this chapter, since it is convenient to make use of an ANOVA table in displaying the steps involved in the inferences leading to the construction of a final prediction model of main interest.

In view of the above, we have now extended our inference procedures for conditionally specified ANOVA models from two to three kinds, i.e. from testing and estimation to these two plus prediction. Special studies and their uses in these areas can be identified briefly as falling in one of three classes: (1) test after preliminary testing, (2) estimation after preliminary testing, and (3) prediction after preliminary testing. To shorten these designations further, the following respective words have been coined: (1) testitesting, (2) testimating, and (3) testipredicting; see Bancroft (1975).

It should be noted that these three kinds of inference procedures, based on conditionally specified models incorporating preliminary testing, may be appropriate in many general applications of statistical methods, i.e. they are not limited to ANOVA models. The above stepwise inference procedures have been investigated in studies including in their titles such phrases as: "preliminary testing", "incompletely specified models", "pretesting", "adaptive inferences", and "sample dependent estimation in survey sampling". However, as brought out in the Bancroft and Han (1977) paper, the most important common feature of these studies is that they all involve particular kinds of statistical inferences, namely inferences based on conditionally specified models. Preliminary test(s) are used, of course, in these stepwise inference procedures, but such test(s) are merely techniques used as a means to accomplish the main objective, i.e. the main inferences (a test, an estimate, or a prediction).

In view of the observations made above, and hopefully as an alternative to further proliferation of names for this category of related inference procedures, the authors propose the shortened designation 'conditionally specified inference' as a simple way of identifying such inference procedures in general, including those using ANOVA. Should this designation be acceptable, we suggest the abbreviation CSI (Conditionally Specified Inference) as a memory aid for this general category of inference proce-


dures. In such a case, CSI procedures would include the three subclasses identified earlier by the coined words: testitesting, testimating, and testipredicting.

2. Historical remarks
2.1. Early intuitive use of CSI procedures in general applications

As noted earlier, CSI procedures cannot be used in any 'exact' probability sense unless a special study has been made, pertinent to the particular applied investigation undertaken, of the final inference of main interest. In such a case, recommendations could be made available for the significance levels of the preliminary tests based on an acceptable criterion as regards the final inference. In other words, account must be taken of the fact that the final inference is conditioned on the outcome of the preliminary tests. While research workers in substantive fields and applied statisticians have used what amounts to CSI procedures for many years on an intuitive basis, no theoretical studies of their properties or of the effect of the preliminary tests on subsequent inferences were available prior to 1944; see Bancroft and Han (1977). The earlier workers included such leaders in the field of statistics as R. A. Fisher and G. W. Snedecor. In many such intuitive uses, either no recommendation was given as to any particular significance level at which to make what amounted to an implied preliminary test, or a recommendation was given based on a subjective judgment.

It is interesting to note that over 50 years ago R. A. Fisher (1920) proposed using what amounts to a CSI procedure. Towards the end of his paper, Fisher discusses what to do in a case where it is known that the sample observations are from either a normal or a double exponential population distribution. He suggested¹ calculating the sample measure of kurtosis and recommended: if this is near 3, the Mean Square Error will be required (from which σ̂ = √MSE); if, on the other hand, it approaches 6, its value for the double exponential curve, it may be that σ̂₁ (based on the absolute deviations) is a more suitable measure of dispersion.

In the several editions of G. W. Snedecor's excellent text and reference book Statistical Methods, and, in particular, the 6th edition co-authored with W. G. Cochran (1967), attention has been called to the importance of model specification in the application of statistical methodology. The

¹Referred to in Dalenius (1976).


Snedecor and Cochran 6th edition gives statements of the mathematical model specification, including side assumptions, required to validate analyses and subsequent inferences. Attention is then called by these authors to the uncertainties that may arise in deciding on an appropriate model specification to meet these requirements. For example, as regards ANOVA to be used in the case of a specified fixed effects model, we quote Snedecor and Cochran (1967), page 321: "In the standard analyses of variance the model specifies that the effects of the different fixed factors (treatment, rows, columns, etc.) are additive, and that the errors are normally and independently distributed with the same variances. It is unlikely that these ideal conditions are ever exactly realized in practice..." Snedecor and Cochran then refer the reader to Scheffé (1959) for a review of research done to investigate the consequences of various types of failure in the model specification assumptions. It is, of course, the objective of CSI procedure studies to develop an objective stepwise inference methodology, with built-in preliminary test(s) of certain critical uncertain elements, which should assist in minimizing such consequences.

It is interesting to note that most texts on statistical methods, including the one by Snedecor and Cochran (1967), provide tests for nonconformity to model specification, for example tests for: non-normality of errors, outliers, homogeneity of variances, non-additivity, equality of means or correlations or regressions considered for pooling, etc. Actually, when used as a test to verify or not a model specification assumption, each such test is a preliminary test. Also, if all of the observations from an investigation are used both in making the preliminary test(s) and also in providing the subsequent inferences of primary importance, the overall inference procedure is clearly CSI in nature.

In view of the above, and despite the difficulty of deriving appropriate CSI procedures, statistical methodology should make available such procedures to assist in drawing conclusions or making decisions in critical situations involving uncertain model specifications. Of course, should such final conclusions or decisions be unaffected by the non-conformity to a particular model assumption, or only slightly so, then no or little disturbance in subsequent inferences would be expected. Should the investigator be uncertain as to the validity of critical elements in a proposed 'reasonable' model assumption, i.e. elements whose validity does affect the drawing of conclusions or the making of decisions, then CSI procedures would provide a means of using the data from a single investigation to assist in the model selection process as well as in providing a subsequent inference of primary importance.

2.2. Development of general CSI procedures

Over 160 papers have been published in this general area since the first paper by Bancroft (1944); see Bancroft and Han (1977). For a review of early CSI investigations in the case where the final inference is estimation, the reader is referred to the paper by Kitagawa (1963), and for a review of such investigations in general to Bancroft (1964, 1965). For reviews of later general CSI investigations the reader is referred to the papers by Bancroft (1972, 1975). However, the most recent bibliography, and the justification of the choice of the designation CSI procedures over earlier designations for this kind of inference, was given only recently in the paper by Bancroft and Han (1977).

It should be made clear that the authors are well aware that CSI procedures provide only one of several alternative procedures designed to assist the research worker or applied statistician in making final inferences of primary importance in situations where there exists uncertainty as to one or more elements of a model specification. Even after taking into account any previous experience with similar data on the part of the investigator and/or others, often such uncertainties exist as regards some proposed 'reasonable' model. Other alternative inference procedures would include dividing the data from an investigation at random into two parts, then using one part to make a preliminary test of some doubtful element in a proposed 'reasonable' model specification and the other part of the data for final inferences. Again, should the necessary a priori information be available to validate the required assumptions, the investigator may decide to use a Bayesian approach in making final inferences in the face of uncertainties regarding one or more elements in a proposed 'reasonable' model specification. A further alternative in the case of such uncertainties would be to use distribution-free or non-parametric inference procedures in making final inferences.

In the above connection it should be noted that a comparable Bayesian approach would require more a priori information than a CSI procedure for a particular investigation. Again, the designation CSI procedures would appear appropriate for either case in which some or all of the data is used, respectively, for preliminary testing and a final inference. In both cases, final inferences are conditional on the outcome of the preliminary test(s). However, special research investigations will be required for each of these two kinds of CSI procedures for each inference classification. Such investigations should include a respective recommendation regarding the significance level of the preliminary test based on an acceptable criterion as regards the final inference. Also, for any case in which the data is divided into two parts, say n1 and n2, and n1 is used to make the

416

T. A. Bancroft and Chien-Pai Han

preliminary test with n 2 being reserved to make a final inference, the relevant research investigation should also include a recomtnendation for the relative sizes of n~ and n 2 where n 1+ n 2 = n (all the data).

2.3. Development of CSI procedures for ANOVA models


As stated earlier in this chapter we shall be primarily concerned with tile development of CSI procedures for A N O V A models. A s u m m a r y will be provided for the results obtained, using such procedures, in turn for the random, fixed, and mixed A N O V A models. Also, when available, recommendations regarding the significance levels of the preliminary tests based on an acceptable criterion as regards the final inference, will be given. The bibliography given in the Bancroft and H a n (1977) paper also provides a subject classification index. Beginning with the paper by Paull (1950), the A N O V A classification lists 34 investigations, concerned with the development of CSI procedures, u n d e r the subclassifications of fixed and r a n d o m models. Results obtained in these investigations, in certain instances are also found to hold for certain mixed models. In addition the subject classification mentioned above lists 46 references for regression concerned with CSI procedures studies, which in some instances could make use of A N O V A tables.

3.

Random ANOVA models for classified data

Given a model, whether conditionally specified or not, the investigator usually wishes to test hypotheses or estimate the parameters in the model. In this section we consider these two inference procedures in r a n d o m models based on conditional specification.

3.1. Testing hypotheses after preliminary testing


A typical A N O V A table associated with a r a n d o m model such as the model in equation (3) m a y be exhibited in Table 2. Table 2 ANOVA table for a random model Source of variation
Treatments Error Doubtful error

Degreesof freedom
n3 n2 n1

Mean square
V3 V2 V1

Expectedmean square
02 02 02

Inference based on conditionally specified A N O VA models

417

I n the A N O V A table we have o3 2 > o 2/> o2. T h e investigator wishes to test 2 2 H0:o 3 = 02 against H 1 : 0 ~ > 0 2. T h e usual test statistic is 1 / 3 / V 2 a n d H o is rejected if

(7)

This test procedure is referred to as the never-pool test. In certain investigative situations, the n u m b e r of degrees of freedom n 2 is small. Therefore the p o w e r of the never-pool test m a y also be small. If in fact 0 2 = 0 2, the investigator, f r o m s o u n d theoretical considerations, m a y pool the two m e a n squares V2 a n d V 1 and use V = ( n I V l + n 2 V 2 ) / ( n I -b/'/2) as the error term in the test procedure. The test statistic is V 3 / V a n d H 0 is rejected if V~ 2- ~- ;) F(a3; n3,nl + n2) V

(8)

This test procedure is referred to as the always-pool test. It often happens that the investigator is uncertain whether O 2 2~ _ O2 1. In order to resolve the uncertainty, he usually uses a preliminary test to test 2 Hr, r0 : 0 2 = 0 2 against H v n : o 2 >0 2 1. If H v r 0 is accepted, V 1 and V2 are pooled a n d the test in (8) is used; if HpT 0 is rejected, the test in (7) is used. H e n c e the model is a conditionally specified A N O V A m o d e l as described in Section 1. The test procedure u n d e r conditionally specified model is to reject the main hypothesis H o : o2 = g~ if

either or

{ V2/V l ~>F(al; n2,rq) and V3/V2>~F(a2; n3,n2) }

(9)

{V2/VI<F(al; n2,nl) and V3/V>~F(as; n3,nl+n:) }.

This test procedure is referred to as the sometimes-pool test. It should be n o t e d that the investigator should n o t use the always-pool test when he is uncertain whether a 2 = 0 2, because if the always-pool test is e m p l o y e d and in fact o22>o 2, the pool estimator V would under estimate the error. Consequently the final F test will give too m a n y significant results w h e n its null hypothesis is true. So the significance level m a y be too high and unacceptable. Because of this reason, we shall n o t consider the always-pool test a n y further. W h e n the investigator is uncertain whether tr22= 0~, he m a y use either the never-pool test or the some-times pool test. As m e n t i o n e d before, w h e n n 2 is small, the power of the never-pool test is also small. In such a situation it

418

T. A. Bancroft and Chien-PaiHan

is advantageous to use the sometimes-pool test. If the preliminai2 test accepts Hpa-0, V 1 and V2 are pooled and the error degrees of freedom is increased, so the power of the test is also increased. However, it may also happen that the preliminary test accepts a wrong hypothesis and there are disturbances in the level of signficance and power of the final test. We will study the advantages and disturbances of the sometimes-pool test by comparing it with the never-pool test. Let us consider the significance level and power of the sometimes-pool test. The probability of rejecting H 0 is obtained by integrating the joint density of V1, Vz and V3 over the rejection region defined in (9). The three mean squares 17/ are independently distributed as X~o~/n i, where X/2 is the central X2 statistic for ni degrees of freedom, i = 1,2,3. So the joint density of V1, V2 a n d V 3is

kV~l,,,_lv~2,,_,v~,,_,exp~ ~ , (l[n,V,_.2\ +[u1 - - ~ n2V202 -b n3V3 }2 o ]


where

(10)

k-l=N(nl+n2+n3)/2F(-~-)r( n2\--~-)Y'[-~-). n3 I
Since the two regions given in (9) are mutually exclusive, the probability P of rejecting H 0, which in general is the power of the test procedure, is the sum of the following two components

el = P { V2/V1 >/F(al; n2,nl) P2 = P { I/2/V, < F ( a l ; n2,n,)


So

and and

V3/V2>F(a2; n3,n2)},
(11)

V3/ V >>.F(a3;na, nl + n2)},


(12) (13)

P=PI + P2

In order to evaluate P1 and Pz, Bozivich, Bancroft and Hartley (1956) derived recurrence formulas for these two components. Define the following quantities:

"J
/'/3 b - (n I + n2)032 F(o~3; n3,nl + n2),
. (14)

?/2 a= n~-21F(O~l; rt2,nl),


b n3

c= 02--71,
1

d= n203--~2 r(a2, n3,n2),


a(1 + b)

xl= l + a + a d '

x2= l + c + a ( l + b ) '

l+a x3= l + c + a ( l + b ) "

Inference based on conditionally specified A N O IrA models

419

Then the recurrence formula for P~ is given as (d) "3/2


P l ( n 3 ) -=

lIx,(n~/2,(n2+

n 3 ) / 2 = 1)

(n,/2- 1)B(n312- 1, n2/2)(1


4- P l ( n 3 - 2),

+ a) {"2+"9/2-'

(15)

where B(-, ,) and lx(', ") indicate the complete and normalized incomplete beta functions respectively. For the set of initial values at n 3 --2 it is found that P,(2) =

I,,,(n,/2,n2/2)
(1 +

d) '6/z

(16)

The recurrence formula for P2 is

P2(nl,n3) =

(1 / a ) " ' / 2 - 1 I x , ( ( n l

1 + c [(n, + n 2 ) / 2 -

+ n2)12-- 1, n3/2 ) 1]B(nl/2,n2/2)(1 + l / a ) (nl+n2)/2-|


(17)

+ c'Pz(nl,n 3-

2) + P2(nl -- 2, n3).

The formulas for the initial values are

P2(nl,2) =
and

Ix2(n2/2,nl/2) (1 -Jr-b)n2/2(1 4- c) nl/2


}.

(18)

pz(2,n3)= l__~c { Ix,(n212'n3/2) + c.P2(2,n 3-2) (1 + 1/a) n2/2

(19)

Bozivich et al. (1956) also give series formulas for P1 and P2 when the degrees of freedom are even and approximate formulas for large degrees of freedom. The series formulas reduce to those given in Paull (1950) when n3=2. The size of the sometimes-pool test can be obtained from P~ and P2. The probability of a type 1 error is computed by setting 032 = 1. When it is plotted on a graph, it is called the size curve by Bozivich et al. (1956). We shall adopt this name. In general, the degrees of freedom n t, n2, n 3 are fixed in a given experiment. The only parameters under the control of the experimenter are %, a z, and a 3. Usually the experimenter would set a 2 = % = a at some nominal level, say 0.05. Then he would like to select the significance level ~xI of the preliminary test such that the size of the final test is close to a and the power of the test is as large as possible.

420

T. A. Bancroft and Chien-Pai Han

The general behavior of the size curve under the r a n d o m model is that it has its minimum at 021--- 1 which is less than a, it increases to a m a x i m u m as 021 increases, then it decreases to a as 021---~oe. The m a x i m u m value is called the size peak. It is desirable to control the size peak to be as close to a as possible. After studying the size curves for various combinations of the degrees of freedom, Bozivich et al. (1956) found that tile size peak is usually too high when the preliminary test is carried out at the 0.05 level. This is due to the fact that at this level, the preliminary test will frequently admit pooling V 1 and V2 when the doubtful mean square o2 is smaller than the true error mean square o~, and thereby increase the probability of type I error. Therefore the level a S=0.05 is unacceptable in most cases. When a I is selected at 0.25 level, the size control is considerably better. In using a nominal size of 0.25, one can control the size peak within 10 percent except when n 3~>n2 and n 1/>5n 2 (20)

(It should be noted that the occurrence of n3>n 2 is rare.) Generally speaking, when condition (20) is satisfied, the size disturbance with o/1 0.25 m a y be considerable a n d a more conservative level of a 1 should be used. In such cases it appears that a preliminary test at a I = 0.50 would be adequate for size peak control. Any higher a I level will provide a very conservative test. The minimum of the size curve occurs at 021 = 1. Paull (1950) showed that the minimum value equals ( 1 - al)a3. Therefore when a3=0.05, the m i n i m u m values are 0.0475, 0.0375 and 0.025 for aa =0.05, 0.25 a n d 0.50 respectively. The m i n i m u m value is a decreasing function of oq a n d hence the deviation from the level a 3 increases as al increases. We have discussed the control of the significance level of the sometimes-pool test. Let us now consider the comparison of power between the sometimes-pool test and the never-pool test. In Bozivich et al. (1956) the comparison was m a d e when the parameter 021 is fixed. The significance level of the sometimes-pool test is first evaluated for a fixed Ozl, then for this level, the power curve of the never-pool test is obtained; the power for given 032 is then directly comparable with that of the sometimes-pool test corresponding to the fixed value of 021. Table 3 is taken from their paper to illustrate the power comparison. W h e n 021 = 1, the sometimes-pool test is always more powerful than the never-pool test of the same significance level. F o r 021 = 1.5, the powers are very similar; on the other hand, for 021 = 2, the never-pool test is always more powerful. In terms of power gain of the sometimes-pool test over the never-pool test, the gain is large when 021 = 1. T h e power gain decreases as 021 increases and finally there is a power loss when 021 is large.

Inference based on conditionally specified A N O V A models


'Fable 3 Power comparison of the sometimes-pool test and the never-pool test under random model (n 1, n z, n3) = (20, 6, 2), a 1= 0.25, a 2 = ot3 ~ 0.05 021 Test 1 1.0 1.5 2.0 s.p. n.p. s.p. n.p. s.p. n.p. 0.038 0.038 0.060 0.060 0.068 0.068 2 0.161 0.127 0.189 0.178 0.190 0.195 032 4 0.368 0.314 0.385 0.373 0.377 0.396 16 0.757 0.705 0.756 0.757 0.751 0.771 64 0.930 0.913 0.930 0.930 0.935 0.935

421

3.0 5.0

s.p. n.p. s.p. n.p.

0.068 0.068 0.058 0.058

0.178 0.194 0.164 0.175

0.361 0.394 0.348 0.369

0.743 0.770 0.738 0.754

0.926 0.935 0.924 0.930

Note: s.p. ~ sometimes-pool test; n.p. = never-pool test.

Paull (1950) considered a borderline test which would ensure a power gain. The critical value for the borderline test is

nlF(a3; n3,nl +

n2)

(n, + n2)V( 2; n 3 . . 9 -

3,n, + n2) "

(21)

The level a~ corresponding to this critical value is usually in the neighborhood of 0.7 or 0.8. So the frequence of pooling is small when 021 = 1 and the power gain is also small. Based on the investigation of the size curve and power gain for various combinations of the parameters, Bozivich et al. (1956) made the following recommendation for the level of the preliminary test: (i) If the experimenter is reasonably certain that only small values of 021 can be envisaged as a possibility, he is advised to use a~--0.25 except in the case (20) when he should use a I =0.50 in order to ensure size control (ii) If, however, the experimenter can make no such assumption about 021, and wishes to guard against the possibility of power losses, he m a y then use the borderline test. The above recommendations depend on some a priori information regarding 02r Examples are given i n Bozivich, Bancroft, Hartley and Hunstberger (1956) to discuss how this information can be obtained from the general conditions under which the experiments were carried out. We have discussed the pooling of one doubtful error with the error term in A N O V A for a r a n d o m model as given in Table 2. Such a table may arise in a hierarchical classification or a two-way classification. A n extension of the model is to consider a hierarchical classification of higher order

422

7: A. Bancroft and Chien-Pai Han

or a multi-way classification. In such a case there may exist two or more doubtful errors. Again the experimenter may depend on preliminary tests to determine whether or not to pool these doubtful errors. The evaluatio~ of the size and power of the sometimes-pool test involving more than one preliminary test becomes very complicated. Srivastava and Bozivich (1962) considered the pooling of two doubtful errors. In view of the findings in A N O V A with one doubtful error, they studied the case when all levels of the preliminary tests are equal to 0.25. For this level, conditions of the combinations of the degrees of freedoms under which the size is controlled are given. Jain and Gupta (1966) further considered the case with three doubtful errors. It should be noted that another useful model i n applied statistics is the mixed model. For a mixed model, some mean squares in the A N O V A are central X 2 variates and some mean squares are noncentral X2 variates. For example, if Table 2 was considered to be the A N O V A for a mixed model, then we may have that V~ and V2 are distributed as oi2x~/ni, i= 1,2, where X~ z is the central X2 variate with n~ degrees of freedom, and the mean square 2 r2 Xt2 V3 is distributed as 02X / n 3 where is the noncentral Xz variate with n 3 degrees of freedom and noncentrality parameter

2t=

n3o~ - nso ~

The three mean squares are independently distributed. Using the chio squared approximation to the noncentral X2 distribution as given in Patnaik (1949), we can write the joint distribution of V 1, V2 and V3 in a similar form as in equation (10). Therefore the evaluation of the power for the sometimes-pool test in the mixed model is obtained approximately in a similar way as that of the r a n d o m model.

3.2.

Estimation after preliminary testing

Let us consider a one-way classification model ij=#+ai+%,

% ~ N I D ( O , o 2)

(22)

where i = 1,2 ..... t, j = l, 2 ..... n, a i ~ N I D ( 0 , o~) and independent of eij. The A N O V A table for this model is given in Table 4. We are interested in estimating the variance components o~ and 02. Unbiased estimators of o~ and o z are ~ = ( V 2 - VO/n and 62= V 1 respectively. However the estimator ~ can assume a negative value with positive probability which is

Inference based on conditionally specified ANOVA models


Table 4 A N O V A table for one-way random model Source of variation Treatments Error Degrees of freedom t- 1 Mean square V2 VI Expected meart square o 2 + noa z 02

423

t ( n - 1)

dearly embarrassing. This problem of negative estimator has been discussed by many authors, e.g. Herbach (1959), Searle (1971, 1973), Thompson (1962), Thompson and Moore (1963), Wang (1967) and others. Further Klotz, Milton and Zacks (1969) have shown that 62 is inadmissible. Thompson (1962) considered the restricted maximum likelihood estimator of 0"~ which truncates the unbiased estimator at zero, that is, when the estimator is negative, replace it by zero. This is essentially a preliminary test estimator with F = V 2 / V 1 being the test statistic for testing the hypothesis Ho:a ~ = 0. The region of rejection is ( F > 1}. If H 0 is rejected, 0"~ is estimated by ( V 2 - V1)/n; if H 0 is accepted, a~ z is set equal to zero. Han (1978) defined a general form of the preliminary test estimator of aa 2 to be
c V 2 - - V1/FI
if F>f~, (23)

S(c,a)=

if F < f ~ ,

where c is a positive constant and f , = F [ a ; (t-1),f(n--1)] is the 100 (1 - a) percentage point of the F distribution with t - 1 and t ( n - 1) degrees of freedom. It is seen that (i) S(1/n,O) is the unbiased estimator, (ii) when c= 1/n and f~ = 1, it is the restricted maximum likelihood estimator, (iii) when c = ( t - 1 ) / [ ( t + 1)n] and f~ =(cn) -l, it is an estimator suggested by Robson and studied by Wang (1967). Wang reported that the Robson's estimator (T 3 in Wang's paper) gives the smallest mean square error (MSE) among the estimators studied. We note that S(c, a) is nonnegative whenf~ ~ (cn)-1. The constant c and the significance level a are selected by the experimenter. In order to determine the values of c and a, we must study the bias and MSE of S(c,a) (or simply denote it by S). The expected value and MSE of S are given by Han (1978) as

E( S ) = aZ( cgl - n - 1g2),

(24)

M a N ( S ) = 0"4[ ( c2g 3 "- 2cn - l g 4 + F / - 2g5) - - 2 0 ( c g 1 .---n - lg2) + 0 2 ],

(25)

424

T. 14. Bancroft and Chien-Pai Han

where

gl=~[1-Ix(a+ g2= 1 -Ix(a,b+


2

1,b)], 1),

g3=~2(l+~_l )[1-lx(a+2,b)], g4=t~[1--Ix(a+


g5 = I-~ t ( n - 1 ) ~=l+n0, l , b + 1)], (26)

[1-Ix(a'b+2)]'
1)(],

O=o~/o 2, x=(t- 1)f~/[ (t- 1)f~ + t(n a = ( t - 1)/2, b = t(n- 1)/2.

Ideally the experimenter would like to select c and a to minimize MSE(S). For a given a he may differentiate MSE(S) with respect to e and obtain an optimum value of e. However, it is not practical to use such a value because it depends on the unknown parameters a~ and o 2, also it may happen that f~ < (on)-1 which gives negative estimates. In order to ensure that S(c,a) be nonnegative, Hart (1978) suggested using c = (nf~)-1 -- c*, say, and the significance level a is determined by a criterion given in Han and Bancroft (1968). Let e(a,O) denote the relative efficiency of S(c*,a) with respect to T 3. If the experimenter does not know the size of 0 and is willing to accept an estimator which has a relative efficiency of no less than e 0, then among the set of estimators with a E A, where A = {a: e(a,0)>~e 0 for all 0}, the estimator is chosen to maximize e(a,0), the experimenter selects the a E A (say c~*) which maximizes e(a,0) over all a and 0. Since Maxoe(a,O)=e(a,O ), the experimenter selects the a CA (say a*) which maximizes e(a, 0) (say e*). This criterion will guarantee that the relative efficiency of the chosen estimator is at least e 0 and it may become as large as e*. A table is given in Han (1978) for the selection of a in estimating the variance components. The above discussion concentrates on the estimation of aft. When the experimenter is interested in estimating both ~ and o 2, a preliminary test estimator of (a 2, 0 2) is defined as
( c V -- V l / n , V1)

(O~'Sz)= (O,[(t-1)V2+t(n-1)V,]/(tn-1))

if F >f~, ifF<f~.

(27)

The preliminary test estimator of 0 2 is that if the hypothesis 14o:O ~ = 0 is rejected, Vl is used to estimate o2; if H 0 is accepted, V2 is pooled with V1

Inference based on conditionally specified A N O V A models

425

to estimate 02. T h e expected value a n d M S E of 02 are given as

E(02) = ,,2[ 1 + ( t - l ) ( n 0 -

+ g2)/(/1t-- 1)]

(28)

MSE(02) = 04[ { ((tn - 1) 2 - 4b2)g5 - 8abg 4 -- 4a293 + (t 2 - 1)42

+ 8abl~+4b(b + 1 ) } / ( t n - 1) 2 -4(a(g2-g,)+a(+b)/(tn-1)+l
Also

].

(29)

e(0 - 4)(o2-- o2) =


r.= O4E C( g4 -- gl) -}- ( g2-- g 5 ) l n -- 2 a O ( g2 -- g l -'~ n O ) l ( t n -- 1) ].

(30)
Table 4 is the A N O V A for a simple o n e - w a y r a n d o m model. I n estimating 02, it is suggested to pool I/2 with V 1 w h e n the p r e l i m i n a r y test is not significant. In a c o m p l i c a t e d A N O V A m o d e l there m a y be two or m o r e doubtful errors which m a y be pooled to estimate the error. Singh (1971) a n d Srivastava (1972) have considered pooling m e a n squares in r a n d o m models. Let V~ and V2 be two doubtful error m e a n squares and V 3 the true error m e a n square based on /11, n2 and n 3 degrees of f r e e d o m respectively. T h e expected value of V~ is E(V/)=o/2, i = 1,2,3, and it is a s s u m e d that a 3 2 f>02/>01. 2 2 W e are interested in estimating o32. The usual unbiased estimator is V3. If it is k n o w n that o22= 02 3, a better estimator is V23 = (n2V2+ n3V3)/(n2+ n3). F u r t h e r if 021---0~= 032, a still better estimator is V123 = (n I V1 -F/'12 V2 q-/t3 V 3 ) / ( n 1 +/12 q- r/3). In practice the investigator m a y be uncertain whether Ol 2 = 022= 02. In such a case he m a y use preliminary tests to resolve the uncertainty, so he has a conditionally specified model. Since there are two doubtful errors, we have two ways to pool the m e a n squares depending on which two m e a n squares are pooled first. W e m a y pool V2 with V3 first, then try to pool V~; on the other h a n d V 1 a n d V2 m a y be pooled first, then V 1 and V2 are pooled with V 3. These two estimators are, respectively, defined as I13 ~<F(al; n3,n2) and -V23 if -~2 ' ~ l ~<F(o~2; n2+n3,nl) , if ~113 < F ( a l ; n3,n2) and -Vz3 ~ l > F ( a 2 ; n2+n3'nl)' if ~

V l23 II23 V3

(31)

V3

>F(a,;

n3,n2) ,

426

T. A. Bancroft and Chien-Pai Han

and V123
V3 ~ < F ( a 4 ; n3,nl+n2), if -~1 < F ( a 3 ; n2'n3) and "VI~

V2 >F(a3;n2,n3) and V3 < F ( a s ; n3,n2)

(32)
V3 V3 where
Vtj=(niVi~njVj)/(ni-~-nj), V123=(nlVl-~n2V2-~n3V3)/(nl-l-n2-~n3).

V2 <F(a3; nz, n3) and --~12 V3 > F ( a 4 ; n3'n' + n2)' if -~l V2 < F ( a 3 ; n2,n3) and -~2 V3 > F ( a s ; n3'n2)' if ~

Singh (1971) studied the estimator 632 and derived the bias and MSE. The expressions are functions of the degrees of freedom nl,n2,n 3, the significance level a l , a 2, and the expected mean squares Otl 2, 022, 0"3 2. Of these parameters, n l, n2 and n 3 are determined from the data and a~ 2, 02, 2 03 2 are unknown. The only parameters under the investigator's disposal are the significance levels. After a numerical investigation, Singh found that the choice of a~ = a 2 = 0.50 is satisfactory. It should be noted that when a 2 = 1, 632 reduces to a preliminary test estimator studied in Bancroft (1944) where the bias of the estimator is given. The estimator 6~ is studied by Srivastava (1972) who derived the expectation and MSE. The recommendation for the selection of the significance level, in the case a l - - a 2 = a 3 = a , is that, in general, if the experimenter is reasonably certain that only small values of O2 2/ 03 2 can be envisaged as a possibility he is advised to use a = 0 . 2 5 or a--0.50; otherwise, he should take a = 0.05.

4,

Fixed A N O V A models for classified data

In this section we consider the fixed model and the organization will be parallel as nearly as feasible to that of Section 3.

4.1.

Testing hypotheses after preliminary testing

In a fixed model, we are again given three mean squares V1, V2, V3, with corresponding degrees of freedom, n l, n2, n 3, and expectations o~, 02, a3 z. The ANOVA table is given in Table 5.

Inference basedon conditionally specifiedANO VA models


Table 5 ANOVA table for a fixed model Source of variation Treatments Doubtful error Error Degrees of freedom n3 n, n2 Mean square V3 V, V2

427

Expected mean square o~ = a2(l + -~33) 02=02(1+ ~ ) 05

The experimenter is interested in testing Ho:o ~ = o 2 versuS Hi : 032>02.2 In Table 5, ~/> 0 is the noncentrality parameter of a noncentral X 2. The usual test statistic for It 0 is I/3/V 2 with n 3 and n 2 degrees of freedom. This is called the never-pool test. If h 1= 0, the investigator, from sound theoretical considerations, m a y pool V 1 with V2 for testing H 0 a n d use the test statistic V3/V with n 3 a n d nl+n 2 degrees of freedom, where V = (n 1 V 1 + n 2 V2)/(n 1+ n2). W h e n it is uncertain whether Xl = 0, a preliminary test is used to resolve the uncertainty, so we have a conditionally specified 2 2 model. The preliminary hypothesis is Hewo : o r = 02 z against HpT l : o 1 > o 2. If H v r 0 is accepted, V 1 and V2 are p o o l e d and the test statistic is V3/V; otherwise the test statistic V 3 / V 2 is used. This is called the sometimes-pool test which was studied by B a n c r o f t (1953), Bechhofer (1951), Bozivich, Bancroft, Hartley a n d H u n t s b e r g e r (1956), L e m u s (1955), Mead, Bancroft and H a n (1975). The admissibility of this test was studied b y C o h e n (1968, 1974). As an example, let us consider the model in a t w o - w a y classification with equal n u m b e r s a n d m o r e than one replication per cell. W e let y,j = + a, + g. + ( a B ) , j +

Z ai= Z ~j-~" Z (a~)iJ = Z (a~)iJ ~'~0,


i j i j

(33)

,jk--NID(0,4),
i=1,2,...,1,
j = l , 2 ..... J, k = l , 2 . . . . . K, for which the sum of squares can be partitioned as shown in the structure of the A N O V A in Table 5. Considering a as the row effect and fl as the c o l u m n effect, then we m a y view either a or fl as the " t r e a t m e n t " a n d (aft) the interaction m e a n square V 1 a "doubtful error". T h e CSI procedure is to reject H o : o 2 = 022, i.e. 2t3 = 0, if either or

(VI/V2>~F(al; nl, n2) a n d (V1/V2<F(al; nx, n2)


and

V3/V2>/F(a2; n3,n2) )

(34)

V3/V>/F(a3; n3,nl+n2) ).

428

T. A. Bancroft and Chien-PaiHan

The probability of rejecting H o is studied by Mead, Bancroft and Han (1975), the Patnaik's approximation for the noncentral X2 distribution is used to derive the recurrence fornmlas for the probability. The approxio, mate distribution of niV. is a~cix2(vi) where ~,i=n~+4?~?/(ni+4?ti) and c,.--1 + 2 ~ i / ( n ; + 2 ~ ) , i = 1,2,3, and the three mean squares are mutually independent. The probability of rejecting H 0 is the sum of the following two components corresponding to the two mutually exclusive regions in (34),

P, = P { V1/ V2 >~ F(a,; nl,n2) and V3/V2>~F(a2; n3,n2) ), (35) P2 = P ( g l / V2 < F ( a l ; nl,n2) and V3/V>~F(a3; n3,nlq-n2)}, (36)
and
P =: P1 + P2"

(3v)

The recurrence formula for P1 is found to be

Pl(a+l'b)=Pl(a'b)+[(a+l)B(a+l'ln2)]
1

- I ( I + x2 ] a+l

\ x--~]

Xl

Ix, a+l+-~nz, b+l ,

(38)

where

a=-~u3--1,
l+u I
Xl=

b=~vl--1,
2 X2 ~ _

l+Ul+U

l+u 2 _ l+u ~+u 2


n1 U 2 = -

n3 Ul= ---

n2c 3

F(a2; n3,n2),

n2c 1

F(al;

hi,n2).

The initial value is given by


1

xl
The recurrence formula for P2 is

x'\-2 nz'b + l

)
+

(39)

P:(a,b) = (1 - t)P2(a - 1,b) + tP2(a,b - 1)


t(1--q)
b

qz"~i~(b+ 1 n2, a

1)

(40)

Inference based on conditionally specified A N O V A models

429

where
q=(1
+ U2)

-1

t=(l+

,
n3

u3(1 + r.lU2) ]--1 xa= 1+

c3(l+uz)

u3

gll -~ H2 F(a3; n3,nl+n2)

The two initial values are


1

PE(0,b) =

E
-

t - x3(1 - q)

qx3

tb+l+~n~Ix; b + l , ~

t 'n2t
,

(41)

Pz(a,O)= Ixi( l n 2 , a + l)--ql"~Ix,( l n 2 , a + l)


1

--qx3 x 3- t

1 - t) ~+ 1+ ~,,~

where
tm ,X 1 - -

x3(1 -- q) t

t X 2 --

qtx3
t -X3(I --

q) '

t ( x 3 - t)
x;= (l_t)(t_x3(l_q)),

x3 - t
x'4 = -l---t "

Let us first consider the size of the sometimes-pool test, then compare the power with that of the never-pool test. The probability of a type I error is computed by setting a32= o22. The size of the test can be plotted to give size curves. Some graphs of size curves are shown in Mead et al. (1975) for a 2 = a 3 = 0 . 0 5 and ~1=0.10, 0.25, 0.50. The general behavior of the size __ 2 2_ curve under the fixed model is that it has its maximum at 012 : -ol /o 2 - 1; it decreases rather rapidly to 0.05 as 012 increases and usually stays above 0.05. Hence we may say that the sometimes-pool test in the fixed model is not conservative in type 1 error. (By conservative, here we mean that the type I error falls below the nominal level 0.05). It is seen that the behavior of the size curve under the fixed model is very different from that of the random model (see Section 3). As for the comparison of the power of the sometimes-pool test to that of the never-pool test under fixed model, Table 6 gives an illustrative example. When 012 is fixed, the power of the sometimes-pool test in Table 6 is always larger than that of the never-pool test except for 012---3.41. The

430

T. A. Bancroft arm Chien-Pai Hart

Table 6 Power comparison of the sometimes-pool test and the never-pool test under fixed model (nl,n2,n3)=(21,8,7), a I =0.25, a2= a3 =0.05
012 'rest 1.00 1.20 1.43 032 1.81 2.15 3.41

1.00 1.20 1.43 1.81 2.15 3.41

s.p. n.p. s.p. n.p. s.p. n.p. s.p. n.p. s.p. n.p. s.p. n.p.

0.099 0.099 0.074 0.074 0.066 0.066 0.058 0.058 0.055 0.055 0.052 0.052

0.155 0.146 0.131 0.112 0.110 0.100 0.093 0.089 0.087 0.085 0.081 0.081

0.215 0.204 0.197 0.160 0.168 0.145 0.142 0.130 0.129 0.124 0.118 0.119

0.342 0.305 0.305 0.247 0.280 0.227 0.232 0.206 0.215 0.198 0.189 0.190

0.450 0.395 0.402 0.328 0.376 0.304 0.319 0.279 0.292 0.269 0.255 0.259

0.772 0.678 0.728 0.605 0.714 0.576 0.674 0.544 0.645 0.531 0.559 0.51'7

difference of the two p o w e r s is the p o w e r gain. This c o m p a r i s o n is i m p o r t a n t to e x p e r i e n c e d i n v e s t i g a t o r s in p r o v i d i n g i n f o r m a t i o n o n p o w e r g a i n to b e e x p e c t e d for a r a n g e of values of 012. W h e n the i n v e s t i g a t o r has n o k n o w l e d g e of the value of 012, he m a y treat 012 as n o t fixed. T h e size of the s o m e t i m e s - p o o l test is e q u a l to the size p e a k at 012--1 a n d p o w e r c o m p a r i s o n m u s t b e m a d e at t h a t level. T h e p o w e r gain at 012 = 1 r e m a i n s the s a m e as before. But w h e n 012 > 1, the p o w e r gain b e c o m e s s m a l l e r b e c a u s e the comparisor~ is m a d e at a higher level of the n e v e r - p o o l test w h e n 012 is n o t fixed. I n T a b l e 6, the d o u b t f u l e r r o r degrees of f r e e d o m n 1 is larger t h a n the e r r o r degrees of f r e e d o m n2, this c a n o c c u r in designs with u n e q u a l subclass frequencies [see e.g. B a n c r o f t (1968)]. H o w e v e r in m o s t fixed models, n 1 is less t h a n n2; h e n c e the g a i n in degrees of f r e e d o m is s m a l l for the s o m e t i m e s - p o o l test. A t the s a m e time the size d i s t u r b a n c e m a y b e c o n s i d e r a b l e a n d there is little o r n o g a i n for the s o m e t i m e s - p o o l test. I n view of this, care m u s t b e t a k e n in the use of the s o m e t i m e s - p o o l test for the fixed m o d e l case. T h e g e n e r a l r e c o m m e n d a t i o n given b y M e a d et al. (1975) is as follows: W h e n the d o u b t f u l degrees of f r e e d o m are c o n s i d e r a b l y larger t h a n the error d e g r e e s of f r e e d o m , say n~ > 2n2, t h e level of the p r e l i m i n a r y test s h o u l d b e set a b o u t 0.25; w h e n n~ a n d n 2 a r e a b o u t equal o n e s h o u l d choose a I = 0 . 5 0 ; if n 1 is smaller t h a n n 2 a n d n 2 is r e a s o n a b l y large, the n e v e r - p o o l test p r o c e d u r e s h o u l d b e used. T h e a b o v e discussion c o n s i d e r e d the case of o n e d o u b t f u l error. I n m o r e c o m p l i c a t e d models, there m a y b e t w o or m o r e d o u b t f u l errors in the A N O V A table. S u p p o s e t h a t t h e r e a r e two d o u b t f u l error m e a n s q u a r e s V 1

Inference based on conditionally specified ANO VA models

431

and V2 which may be pooled with the error m e a n square V3. Let II4 be the treatment mean square, ni be the degrees of freedom associated with Vi and E(V~) = a~, i = 1,2, 3, 4. It is desired to test H0:042 = o2 versus H , : o42> o~. Whether to pool the doubtful error mean squares V~ a n d V2 with V3 can be made to depend on the outcomes of preliminary tests. One possible CSI procedure for testing H 0 is to reject H 0 if any one of the following mutually exclusive events occur:

{ V2/V3>~FI, V1/V3>~Fz and V4/V3>~I~3},


{ V2/V 3 )Fl, V I / V 3 < F z and V4/V13 ~>r4) ,

( V2/V3<Pl, V1/V23<

; and Vo/V123>

6),

(42)

{ V2/V3<r,. V /V23
where

and V./V2 >rT).

v,j = ( //,

+ nj Vj) / ,,ij,

V123 = (nl Vl '{- n2 V2 "~//3 V3)///123,


nij -~-n i -~- nj,

/2123~ nl +//2 +//3~


F l --- F ( a l ; nz, n3), F z --- F(a~; nl,n3), F 3 --- F(a3; no, n3), F4= F(a4; n4,nl3), F 5 = F ( a s ; nl, n23), F r = F ( a r ; n4,n123), F7= F(a7; n4,n23). G u p t a and Srivastava (1968), Saxena and Srivastava (1970) studied this sometimes-pool test procedure and obtained the size and power of the test by using Patnaik's approximation for the noncentral X2 distribution. In order to reduce the computations, further approximation to the power function is given by Saxena (197l). A different use of the CSI procedure in the fixed A N O V A model is to consider multiple comparisons following a significant F test. Usually when the treatments are declared to be different the investigator wishes to find the differences in specific treatment means. W h e n multiple comparison procedures are used after the F test of the treatment effect is significant, the F test is essentially a preliminary test. Then the effect of the preliminary test should not be neglected. The effects of the preliminary test on error rates are studied by Bernhardson (1975) and Smith (1974). Bernhard-

432

T. A, Bancroft and Chien-PaiHan

son employed a Monte Carlo study to evaluate ernperically the compario sonwise error rates and the experimentwise error rate of the combined use of the F test of A N O V A and five different pairwise comparison procedures. Smith derived an exact expression of the error rate when a contrast of treatment means from a one-way classification model is tested following a rejection of the test for the treatment effects.

4.2. Estimation after preliminary testing


Referring to the fixed A N O V A model in Table 5, we are interested in estimating the error variance o 2 with 02 as a doubtful error. A preliminary test estimator of 02 is defined as

v2

~= (n,V,+,,:Vg/(n,+n9

nl,n2), if VI/ Vz <F(a; nl,nz).


if E l l V 2 >~F ( a ;

(44)

In order to derive the expectation and MSE of 62, we need the joint distribution of V 1 and V2. These two r a n d o m variables are independently distributed with VI as a noncentral X2 variate and V2 as a central X2 variate. Using Patnaik's approximation, we have that nlV 1 is approximately distributed as O2clX2(Vl). The expected value and M S E of 62, given by Srivastava and G u p t a (1965), are

E(~2)=

=o211+ n,
.g
o,,

" ( Y ' Y +'

(45)

MSE('~)=~(1+ 2 ) + (n,+ ng~


o,

+2nln 2 1+ nl ]

( ~t,(xk-~-+l,~' "~+~) ~-

n' n2+2) n'+2n2) I x - ~ , -~n2 + 2 n2


where x= 1+ ......
n2c I

)11

(46)

nlF(a; npn2).

Inference based on conditionally specified A N O V A models

433

After studying the bias and M S E of 622 numerically, Srivastava and G u p t a (1965) recommended that the level of significance a of the preliminary tests should be 0.25 when n2>n l, but if n 1 and n 2 are small and n 1> n 2 , w e should choose a > 0.25. When there are two or more doubtful errors in the A N O V A and it is suspected that the expected m e a n squares are the same as the expected error mean square, the doubtful m e a n squares m a y be pooled with the error mean square by using preliminary tests. F o r example in completely crossed and nested factorial experiments with at least three factors, higher order interactions and factor effects m a y be doubtful errors. Suppose there are three mean squares V,. based on ni degrees of freedom and E(V/)= 0/2, i = 1,2,3. We assume o 2 <o2 and o 3 2 < o 2. 2 It is desired to estimate 02. The usual unbiased estimator of 02 is V3. However if o 2 = o~ a n d / o r o 2 = o 2, it is advantageous to pool V 1 a n d / o r V2 with V3 for estimating o3 z. A preliminary test estimator of o32 is defined as

62=

where V/j, V123 and the F values are defined in (43). G u p t a and Srivastava (1969) studied the bias and M S E of 62. In order to control the bias and MSE, they recommend that the level of significance, when all preliminary tests are made at the same level, should be in general greater than or equal to 0.25. Saxena (!975) studied a similar estimator involving two preliminary tests.

if V2/V3>>F l and V1/V3>~l~2, if V2/V3>~F ~ and V1/V3<F2, if V 2 / V 3 < F 1 and V1/V23>~Fs, if Vz/ V 3 < F l and V l / V23 <Fs, (47)

Vl3

Vz3 V~23

S. Conditionally specified regression models


It is well known that A N O V A models for classified data m a y be written as a linear model Y = X / 3 + e . In this section we consider the regression model with X of full rank. It is commonplace for a researcher to determine a more appropriate regression model by testing the regression coefficients of a preliminary assumed 'reasonable' model. Such tests are clearly preliminary tests when subsequent statistical inferences, e.g. prediction and estimation, are made, using the finally determined more appropriate model. The effects of these preliminary tests will be studied in this section, hence justifying the inclusion of such regression procedures as an important special case of CSI procedures using A N O V A .

434

T.A. Bancroft a n d Chien-Pai H a n

In applied statistics, the investigator may have a sample of n observations on k so-called independent variables or predictors, xi, and on Y for predicting the pertinent dependent variable Y. The usually assumed multiple regression model is then

rj~-,t~o--~lXij--~l~2x2j'qL . . . . ~kXkj"~Ej,

j = l , 2 ..... n

(48)

where the ej are normally and independently distributed with m e a n zero and identical variance 02, the x; are given constants. Based on the sample, the usual least squares estimator of the fli can be obtained, so the predicted value of Y can be readily computed for given x values. At some point in this procedure the investigator may decide that all the predictors he has included in his model are not really necessary to predict Y accurately. To obtain a more appropriate model objectively he tests the hypothesis that the coefficients of the doubtful predictors are all simultaneous equal to zero. Assume that the last k - m predictors are doubtful, then the hypothesis is H o ' f l m + l = f l m + 2 ..... f l k = O . If H o is accepted, he uses only the first m predictors in his model, if H 0 is rejected he retains the last k - m variables in his model. In such case his model is conditional on the outcome of the preliminary test, hence the inference procedure is a CSI procedure. The associated A N O V A table is given in Table 7. In the A N O V A table, R(X; y) and R(X~; y) are the reductions in sum of squares due to regression on all k predictors and the first m predictors respectively, and V is the unbiased estimator of o 2 from the full model. The test statistics for H 0 is F= R(X;y)- R(X.;y)

(k-m)V
(49)

The predictand of the CSI procedure described above is defined as

( Ym, ifF<F(a;k-m,n-k-1), Y= Yk, ifF>/F(a;k-m,n-k-1),


Table 7 A N O V A table for regression model Source of variation D u e to fitting full model D u e to fitting first m predictors Difference Error Degrees of freedom k+ 1 m+ 1
k- m n- k - 1

S u m of squares
R(X;y)

M e a n squares

R(X 1;y)
R(X;y)-

R(XI ;y )

R(X;y) - R(X l;y) ( k - m)


V

Y' Y- R(X;y)

Inference based on conditionally specified A N O VA models

435

where I7m and 17 k are predictands based on the first m predictors ^and all k predictors respectively. In order to derive the bias and MSE of Y, Larson and Bancroft (t963b) first transform the x variables so that they are all mutually orthogonal with m e a n zero and unit sums of squares. The expected value and MSE of ~I 7 are found as
k
i=1

i=m+l

B,x,,

(so)

MSE(I~) =

= 2 _1+
F/ i=1

xi2+h(O) E x~+[7(O)-2h(O)+l]
i=m+l

Xi i~m+l

(50
where
1 k

#//o2.

i=m+l

h(O)=P(r[>
7(0)=P

k--m r(a;k-rn, n - k - 1 ) ) k-mY2

( F ~ > k-m+4 k - m F(a;k-m,n-k-l)}

and F~ and F~ are distributed as the noncentral F distribution with noncentrality parameter 0 and degrees of freedom (k-m+2,n-k-1) and (k-m+4,n-k-1) respectively. It is seen that the bias and MSE of I~ depends on 0 as well as the values of the predictors. In the above selection procedure of the best subset of the predictors, the doubtful predictors were tested simultaneously and a single F test is used as the preliminary test. It often happens that the investigator selects the predictors individually and test the significance of the regression coefficients in a sequential manner. So he tests a particular coefficient first, if he rejects the hypothesis, the variable is retained in the regression; if he accepts the hypothesis, the variable is deleted. After testing one variable, he selects another variable for testing. The process goes on sequentially until all doubtful predictors are tested. The different processes used for testing the coefficients give different selection procedures. These procedures are usually referred to as particular kinds of stepwise regression. In some situations, the investigator has prior information about the predictors and is able to rank the predictors according to their order of importance. Without loss of generality we m a y let xk be the least important, x k_ 1 the next least important, etc. and xm+ 1 the most important

436

72. A. Bancroft and Chien-Pai Han

variable in the doubtful set. A natural ordering of the predictors is found in polynomial regression. In such a case the highest power m a y be judged as the least important variable. Given the ordering, Kennedy and Bancroft (1971) considered two model building procedures (1) Sequential deletion procedure: For the multiple regression model in (48), the investigator decides that the last k - m (O<m < k - 1 ) predictors are doubtful and ranks the doubtful predictors in order of importance with x k being the least important. H e first tests the hypothesis that t~ is zero. If he accepts this hypothesis he deletes x~ and tests that t ~ - t is zero. If he accepts this second hypothesis he deletes xk_ 1 and tests tk-2, etc. He continues deleting variables in this manner until he rejects a hypothesis that a coefficient is zero, or until he reaches tim, then he retains in his prediction equation the variable corresponding to that coefficient and all other variables whose coefficients he has not yet tested. So the predictand for the sequential deletion procedure is

}~*-~- I Yk-i

if event A i happens, i = 0, 1..... k - m - l, if e v e n t Ak_ m happens,

/
where

l~,~

(52)

A i = { Sequentially accept t k = 0, tk-1 = 0 . . . . . t~-i+l = 0 and reject t k _ i = 0},


Ak- m
= (Sequentially

accept fig = O,

t k _ l = 0 ..... tm+2=0, tm+l =0).


The associated A N O V A table for testing is similar to Table 7 with the sum of squares R(X;y)-R(X1;y ) further divided into k - r n terms, each term accounts for the reduction in sum of squares for each one of the doubtful predictors with one degree of freedom. Larson and Bancroft (1963a) considered the case when a 2 is known. Kennedy and Bancroft (1971) derived the bias and MSE of I~* when a 2 is unknown and the preliminary tests are made at the same a level, i.e. the hypothesis/3. = 0 is rejected if the corresponding F test statistic is larger than F ( a ; 1 , n - k - I ) . Let 7 = F ( a ; 1 , n - k - 1 ) / ( n - k - 1 ) , ~.=fli2/202, G,(z[~) denote the cumulative distribution function of the noncentral X2 with s degrees of freedom and noncentrality ~, g(y) denote the density function of the central X2 with n - k - 1 degrees of freedom a n d P(Ai) denote the probability of the event Ai. The bias and MSE of Y* are found

Inference based on conditionally specifiedANO VA models to be Bias(Y*)= k E


j=m+l k

437

[ k--j--|

fljxj

E
s=O

P(A~) +
2

H(Ak_y ) -- 1],
k k--j-- 1

(53)

MSE(1)*) =

Z j~m+l + o2

\i=m+l +
i=1

--2 P(Ak_j) + a -2~xj


i~l

[3ixi H(Ak_j)

+ fl/x2T(Ak_j)

+o2( +Z=xT)P(A,
where k-1
i=k

gx,)2,

(54)

gx,-O,

F,(yyl~.) ] [ 1 - F3(yy[2tk_i) ] g(y)dy,


j= --i+!

T(Ai) = f

j=k-i+l

F,(yy[2~) ]I 1 - Fs(yy[2~k_i) ] g(y)dy.

(2) Forward selection procedure: The investigator also decides that the last k - m predictors are doubtful and they are ranked as before. He first tests the hypothesis that flm+l is zero. If he rejects this hypothesis he adds x,,+l to the model and tests that tim+2 is zero. If he rejects this second hypothesis he adds xm+ 2 to the model and tests that tim+3 is zero, etc. He continues adding variables to his prediction equation in this manner until he arrives at a variable whose coefficient does not differ significantly from zero, at that point he does not add that variable to the model, nor does he add the variables whose coefficients he has not yet tested. So the predico rand of the forward selection procedure is given as ]~** = / I~m+i if event

B i happens,

i = O, 1..... k - m - 1,

Yk

if event B k_m happens.

(55)

438

T. A. Bancroft and Chien-Pai Han

where Bi = { Sequentially reject tim + i = 0,


/~m+2 = 0 . . . . .

[~m+i= 0

and accept/?m+g= 1= 0 } ,
0,

Bk- m

= { Sequentially

reject tim + 1 =

tim + 2 ~ 0 . . . . . / ~ k - 1 = 0 , flk = 0 } .

The bias and MSE of I)** are given in Kennedy and Bancroft (1971) and are omitted here. After an extensive numerical study of the bias and MSE, they conclude that the sequential deletion procedure is more efficient than the forward selection procedure, hence recommend the use of the sequential deletion procedure in practice. Further, based on the numerical results, the level a =0.25 for tests made in application of this procedure appears to be very appropriate. Prediction is an important use of multiple regression. In order to obtain a reliable prediction, the best subset in the regression model is selected so that the bias and mean square error of the predictand are made as small as possible. In the CSI procedure, we determine the significance levels to achieve the best selection. Another important use of regression is to determine a linear relationship between the dependent variable and the x variables, and we are interested in estimating the regression coefficients. If it is decided that some regression coefficients are zero, the corresponding variables are omitted in the regression for estimating the remaining regression coefficients. Bancroft (1944) is the first to consider the estimation of fll in the model
y = fllxl + 32x2 + e

(56)

after a preliminary test of the hypothesis that/32 is zero. If the hypothesis is accepted,/31 is estimated from the model y =/~lXl + e; if the hypothesis is rejected, /71 is estimated from equation (56). Bock, Yancey and Judge (1973) considered the general linear model

v=xt +

(57)

where Y is a n 1 vector of observations, X is a n k matrix of nonstochastic variables of rank k, ~ is a k x 1 vector of unknown parameters; and e ~ N ( O , o2I), I is the identity matrix. The unrestricted least squares estimator of I~ is b--(X'X)-~X'Y. Suppose the investigator has prior information but is not certain that the regression coefficients satisfy the following restriction

Rl~=r

or R l ~ - r = 0 ,

where R is a (d k) known matrix of rank d and r is a known vector. The

Inferencebasedon conditionallyspecifiedANO VA models


restricted least squares estimator is
b - M ( R b - r),

439

where M = S - I R O R S - 1 R ' ) -1 and S=X'X. Because the investigator is uncertain whether the restriction (58) holds, he may test the hypothesis that R ~ = r . Then he decides to use the unrestricted or the restricted least square estimator depending on the outcome of the preliminary test. The likelihood ratio test statistic for Ho: R ~ = r is

U= (n - k ) ( R b - r)'(RS- ' R ' ) - ' ( R b - r)


d ( Y - X b ) ' ( V - Xb)

(59)

which has a central F distribution with (d, n - k) degrees of freedom under the null hypothesis. Alternatively, if H 0 is incorrect, U has a noncentral F distribution with (d, n - k) degrees of freedom and noncentrality parameter
2~ = ( R ~ -- r ) ' ( R S - IR') - 1(11~ - r ) / ( 2 0 2 ) .

The preliminary test estimator is given as b-M(Rb-r) b if e < F ( a ; d , n - k ) , if U>F(a; d , n - k ) . (60)

/]=

The expected value and covariance matrix of /3 obtained by Bock et al. (1973) are E(]~) =ill - p I()k)M(R~ --r), Var(/]) = o2S -1 - or~I(a)MRS where
pi(x) = e <
1

(61)

+ [ 2p,(X) -p~(X) -pz(X) ] M(R~ - r) (RI3 - r)'M',

F(a; d , n - k ) }

and F~ is distributed as the noncentral F distribution with (d+2j, n - k ) degrees of freedom and noncentrality parameter )t. It is seen that the expected value and covariance matrix of ~ depends on X which in turn depends on the restriction in (58). So the selection of the significance level would also depend on the particular restrictions on the parameters specified by the investigator.

440

T. A. Bancroft and Chien-Pai Han

References
Bancroft, T. A. (1944). On biases in estimation due to the use of preliminary tests of significance. Ann. Math. Statist. 15, 190-204. Bancroft, T. A. (1953). Certain approximate formulas for the power and size of a general linear hypothesis incorporating a preliminary test of significance. Unpubl. preliminary report, Statistical Laboratory, Iowa State University. Bancroft, T. A. (1964). Analysis and inference for incompletely specified models involving the use of preliminary test(s) of significance. Biometrics 20, 427-442. Bancroft, T. A. (1965). Inference for incompletely specified models in the physical science (with discussion). Bulletin of the International Statistical Institute, Proceedings of the 35th Session, 41(1) 497-515. Bancroft, T. A. (1968). Topics in Intermediate Statistical Methods, Vol. 1o Iowa State University Press, Ames. Bancroft, T. A. (1972). Some recent advances in inference procedures using preliminary tests of significance. In: Statistical Papers in Honor of George W. Snedecor, Ch. 2, 19-30. Bancroft, T. A. (1975). Testimating testipredicting and testitesting as aids in using Snedecor and Cochran's Statistical Methods. Biometrics 31, 319-323. Bancroft, T. A. and C. P. Han (1977). Inference based on conditional specification: a note and a bibliography. Internat. Statist. Rev., 45, 117-127. Bechhofer, R. E. (1951). The effect of preliminary tests of significance on the size and power of certain tests of univariate linear hypotheses. Unpubl. Ph.D. thesis, Columbia University. Berhardson, C. S. (1975). Type I error rates when multiple comparison procedures follow a significant F test ANOVA. Biometrics 31, 229-232. Bock, M, E., T. A. Yancey and G. G. Judge (1973). The statistical consequence of preliminary test estimators in regression. J. Amer. Statist. Assoc. 68, 109-116. Bozivich, H., T. A. Bancroft and H. O. Hartley (1956). Power of analysis of variance test procedures for certain incompletely specified models, I. Ann. Math. Star. 27, 1017-1043o Bozivich, H., T. A. Bancroft, H. O. Hartley and David V. Huntsberger (1956). Analysis of variance: preliminary tests, pooling and linear models. WADC Technical Report, Volume I, 55-244. Cohen, A. (1968). A note on the admissibility of pooling in the analysis of variance. Ann. Math. Stat. 39, 1744-1746. Cohen, A. (1974). To pool or not to pool in hypothesis testing Jo Amer. Statist. Assoc. 69, 721-725. Dalenias, Tore (1976). Sample-dependent estimation in survey sampling. In: Contributions to Applied Statistics, 39-44. Fisher, R. A. (1920). A mathematical examination of the methods of determining the accuracy of an observation by the mean error and by the mean square error. Monthly Notices of the Royal Astronomical Soeiety, 758-770. Gupta, V. P. and S. R. Srivastava (1968). Inference for a linear hypothesis model using two preliminary tests of significance. Trabajos de Estadistica 19(3), 75-105. Gupta, V. P. and S. R. Srivastava (1969). Bias and mean square of an estimation procedure after two preliminary tests of significance in ANOVA Model I. Sankhy~ A 31(3), 319-332. Han, C. P. (1978). Nonnegative and preliminary test estimators of variance components. To appear in the J. Amer. Statist. Assoc. Han, C. P. and T. A. Bancroft (1968). On pooling means when variance is unknown. J. Amer. Statist. Assoc. 63, 1333-1342. Herbach, L. H. (1959). Properties of model II type analysis of variance tests. Ann. Math. Statist. 30, 939-959.

Inference based on conditiorgdly specified A NO IrA models

441

Jaln, R. C. and V. P. Gupta (1966). A note on bounds of the size of a sometimes pool test procedure in ANOVA Model II. Trabajos de Estadictica 17(2), 51-58. Kennedy, W. J. and T. A. Bancroft (1971). Model building for prediction in regression based upon repeated significance tests. Ann Math. Star. 42, 1273-1284. Kitagawa, T. (1963). Estimation after preliminary tests of significance. University of California Publications in Statistics 3, 14'7-186. Klotz, J. H., R. C. Milton and S. Zacks (1969). Mean square efficiency of estimators of variance components. J. Amer. Statist. Assoc. 64, 1383-1402. Larson, H. J. and T. A. Bancroft (1963a). Sequential model building for prediction in regression analysis, I. Ann. Math. Stat. 34, 462-479. Larson, H. J. and T. A. Bancroft (1963b). Biases in prediction by regression for certain incompletely specified models. Biometrika 50, 391-402. Lemus, F. (1955). Approximations to distributions in certain analysis of variance tests. Unpubl. M.S. thesis, Iowa State University. Mead, R., T. A. Bancroft and C. P. Han (1975). Power of analysis of variance test procedures for incompletely specified fixed models. Annals of Statistics 3, 797-808. Patnaik, P. B. (1949). The noncentral X2 and F distributions and their applications. Biometrika 36, 202-232. Panll, A. E. (1950). On a preliminary test for pooling mean squares in the analysis of variance. Ann. Math. Stat. 21, 539-556. Saxena, K. P. (1971). On power of a STPT procedure in ANOVA Model I using certain approximations. Estadistica 29, 44-53. Saxena, K. P. (1975). Estimation of variance using two preliminary tests in ANOVA Model-I. Biometrische Z. 17, 308-324. Saxena, K. P. and S. R. Srivastava (1970). Inference for a linear hypothesis model using two preliminary tests of significance. Bull Math. Stat. 14(1, 2) 83-102. Scheff~, H. (1959). The Analysis of Variance, John Wiley, New York. Searle, S. R. (1971). Topics in variance components estimation. Biometrics 27, 1-76. Searle, S.R. (1973). Univariate data for multi-variable situations: Estimating variance components. In: D. G. Kabe and R. P. Gupta eds., Multivariate Statistical Inference. North-HoUand, New York, 197-216. Sing,h, J. (1971). Pooling mean squares. J. Amer. Statist. Assoc. 66, 82-85. Smith, W. C. (1974). The combination of statistical tests of significance. Unpubl. Ph.D. thesis, Iowa State University. Snedecor, G. W. and Cochran, W. G. (1967). Statistical Methods. 6th edition, Iowa State Press, Ames. Srivastava, S. R. (1972). Pooling mean squares in ANOVA Model Ii. I. Amer. Statist. Assoc. 67, 676-679. Srivastava, S. R. and H. Bozivich (1962). Power of certain analysis of variance test procedures involving preliminary tests. Bulletin of the International Statistical Institute, Proceedings of the 33rd Session 39(3), 133-143. Srivastava, S. R. and V. P. Gupta (1965). Estimation after preliminary testing in ANOVA Model I. Biometrics 21, 752-758. Thompson, W. A. Jr. (1962). The problem of negative estimates of variance components. Ann. Math. Statist. 33, 273-289. Thompson, W. A. Jr. and Moore, J. R. (1963). Nonnegative estimates of variance components. Technometrics 5, 441-450. Wang, Y. Y. (1967). A comparison of several variance component estimators. Biometrika 54, 301-305.

P. R. Krishnaiah, ed., Handbook of Statistics, Vol. 1 North-Holland Publishing Company (1980) 443-469

| A 1

Quadratic Forms in Normal Variables

C . G, K h a t r i

1.

Introduction

In the theory of least squares, in variance c o m p o n e n t analysis, in estimation including M I N Q U E theory and testing of hypothesis, and in some problems in time series analysis, quadratic forms play an important role. In this Chapter, we shall study the distribution aspects of quadratic forms in normal variables only. In nonparametric analysis a n d in goodness of fit, the asymptotic distribution theory of goodness of fit Chi-square statistic is given in m a n y text books under the assumption of multinomial variates, see for example Rao (1973). Let x 1..... x n be independent observations with means/~1 ..... /~n and the same variance 02. If /~ ..... /~n are known functions of some unknown parameters to be estimated on the basis of x l , . . . , x n, we minimize the quadratic form ~ = X~= l(Xi- ~g)2. In the analysis of variance and covariance, we have (/~1. . . . . t ~ ) ' = p = A O where A is an n n known matrix depending on the structure of the design of experiment a n d possibly on concomitant variables in covariance analysis, and 0'=(01 ..... On) is an unknown vector depending on the effects due to treatments, blocks, interactions or regression effects due to concomitant variables. Then, q~ is decomposed into various sum of squares due to such effects (including residual effects due to error component e = x - p ) as ~=~kiffi~q i where ql . . . . . qk are quadratic forms. In M I N Q U E theory, we have the model

i=!

where V(e)=covariance matrix of e = x - l z , B l . . . . . B r are known matrices and olz..... o~ are unknown parameters to be estimated b y the quadratic forms x'Agx so that the n o r m of Ag is minimum subject to some restrictions
443

444

C. G. Khatri

on the A i. This part will not be discussed. In its place, some estimates of p are given under the above variance model of 0. Since, in the study of quadratic forms, we use the basic assumption of normality, one would like to know how far this basic assumption is valid when we know that a quadratic form is distributed as a Chi-square variate, or a linear function and a quadratic form are independently distributed, or the regression of a quadratic form on a linear function is constant. We give some well known results for these situations. In finding a structure of a quadratic form under the assumption of normality, we mention some 2 __ results on the conditions when R02= ~ ni=lt"X i - N~) - 2 is equal to R e - Y ' i n= l ( x i - ~ i ) 2 where ~ / = s o m e function of x l , . . . , x . and P'--(t~l ..... t2n) is a solution of the normal equations A ' A p = A ' x . These are mentioned after the distribution theory. W e do not mention the actual time series situations in the applications but the readers are referred to the work of Krishnaiah a n d Sen (1970), H a n n a n (1970), Liggett (1972) and Britlinger (1974). Further, we omit the discussion on multivariate g a m m a distribution and the joint distributions of correlated quadratic forms. For this and its applications, one can refer to Krishnaiah (1977). This will be found in a separate Chapter.

2.

Notations

A : n Xp means a matrix of order p n. A - , A', A* and f(A) indicate respectively a g-inverse of A satisfying A A - A = A , the transpose of A , the conjugate transpose of A, and the rank of A. A + will be denoted as the M o o r e - P e n r o s e inverse of A satisfying (i) A A +A = A , (ii) A +AA + = A +, (iii) A +A and A A + are symmetric for real matrices (or Hermitian for complex matrices). Kronecker product between A = (aij) and B is denoted by A B - - ( a o B ). 7ilae order of I (identity matrix) and 0 (null matrix) will be understood by their contexts. If there is a need, the identity matrix of order n n m a y be written as I n. By a spectral decomposition of a matrix A , we mean a representation given by
m

A= E
i=l

where wl, w 2. . . . . wm are distinct nonzero eigenvalues of A , Pi2=Pi and PiPg,=O for i i ' = 1,2 ..... m. We m a y observe that a spectral decomposition may not be possible for all matrices, but it does exist for real symmetric (or Hermitian) matrices and in this situation, Pi will be real

Quadratic.formsin normalvariables

445

symmetric (or Hermitian). If A is an n n matrix, then C h l A and C h . A mean respectively the m a x i m u m and the m i n i m u m eigenvalues of A, provided eigenvalues of A are real. We shall use the following short abbreviations: i.d.=independently distributed, i.i.d. = independently and identically distributed, r . v . - - r a n d o m variable, r.v.v. = r a n d o m vector variable, d.f. = degrees of freedom, mgf = m o m e n t generating function, p.d. =positive definite, p.s.d. = positive semidefinite, and ' ~ ' - - i s distributed as'. The notation x~N(IX, V) indicates that x is distributed as normai with mean vector ix and covariance matrix V. If X :p x n = (xl,..., Xn) and/~ :p X n (/~1..... IX,), then X~Np,n( ~, gl, I/2) ! I ! t , I ! indicates that x(*) - ( x l , x 2..... x,) ~ N ( # ( ) , V) with IX(')-(IXl..... IX,) and V = V2 V v If X~Ne,,,(I~, V,I,), then S = X X ' will be said to be distributed as non-central Wishart with n d.f. and non-central parameters /~/~' having a scale factor V and this will be denoted by S ~ W(n, V, I~t~') whose mgf is given by
, ~ !

lip - 2ZVI -"/2 etr((Ip - 2 Z V ) - I z a ) ,

(2.1)

where f~= N~', Z is any real symmetric matrix such that I - 2 Z V > 0 (i.e. I - 2 Z V is p.d.) and etr(. ) = exp(tr(. )). When p -- 1, then S = s is distributed as non-central Chi-square with n d.f. and noncentral parameter a = w having a scale factor V - - v and this will be denoted by s/v~xZ(n,,o/v). Its mgf is given by (2.1) by replacing p = 1, Z = z , V = v a n d f~= w. When ~2=0 (or/~=0), W(n, V,0)= W(n,V) and x2(n,O)--x2(n). If n<p, S has a pseudo-Wishart distribution and when f ( V ) < p , then S has a singular Wishart distribution. x = x l + X / ( - 1)x2 is said to be distributed as complex normal denoted by CNp(IX, V) where IX=IX1+ X / ( - 1)IX2 and V = V 1+ ~ / ( - 1) V2 provided

where V is Hermitian p.s.d., (see G o o d m a n (1961)). As in the real case, we can define the complex Wishart distribution and this will be the distribution of XX* in which X~CNp.n(t~,V, In). W h e n p = l , 2 X X * = 2 s ~ vx2(2n, w/v) where V = v and ~2= w. Let x and y be r.v. Then, x and y are said to be uncorrelated of order (r,s) iff p(i,j)=Cov(xi,yi)=O for all i = 1,2 . . . . . r and j = 1,2 ..... s. This condition is equivalent to X(i,j)=(i,j)th cumulant of ( x , y ) - - 0 for i = 1,2 ..... k and j = 1,2 . . . . . s (see Khatri (1961)).

446

C. G. Khatri

3,

Preliminary results

Let x ~ N n ( / ~ , V ) and let ~ ( V ) = r ( < n ) . Then, Anderson (1958) has shown that there exists a r a n d o m vector y : r 1 such that x = By + / , and y ~ Nr(0, It) where V = BB' a n d B is an n r matrix of rank r. Then, q = x'A x + 21'x + c =

y'A o)Y + 21'(l)y +

co),

(3.1)

where A is a symmetric n n matrix,

Ao)=B'AB , Ro)=B'(AI~+I )
Using f

and

co) = #'A/~ + 21'/L+ c. (3.2)

exp(-y'Py+b'y)dy=(2rr)r/Z[P[-1/2exp(b'P-lb/2)

(3.3)

y~R ~

with P > 0, the mgf of q =

x'A x + 21'x +

c is given by

]/~ _ 2tA O)[-1/2 exp(tco) + 2t21,(1)(/~ _

2tA(1)) --li(l))

(3.4)

for all real t for which 2 t C h l A o ) < 1. F r o m (3.4), we get

E(q) = trA
and

V + / , ' A ~ + 21'/~ + c

(3.5) (3.6)

V(q) = 2 tr(A V) 2 + 4 0 + A ~)' V(I + A/Qo Let the structural representation of AO) be given by

B'AB=A(o= ~ )tj),
j=l

Ej2=E), EiEj=O f o r i ~ j

(3.7)

where X~ >Xz > " " >Xm, Xj v~0 for all j, and the multiplicity of Xj= ~(Efl= trEj =fj, (say). Let E o = 1 - YTm=1Ej. Then, Eo2 = E o and Further, we observe that E o Ej = E j E o = 0 for a l l j = 1,2 . . . . . m. (3.8)

(I--2A(,)t)-'= Eo+ ~
j=l

(1--2t~)Ej

=Eo+

~ (1-2/2ta)-'Ey
j=l

Quadraticforms in normal variables

447

and
[ I - 2 t A ( , ) ] = H (l-2t?~j) ~.
j~l

Then, the mgf of q = x'Ax + 21'x + c is given by


m

exp(tc(2) + 2Pot2)j~=l [ ( 1 - 2 t X i ) - Y # 2 e x p ( X j t ( p j / X f ) / ( 1 - - 2 t h j ) ) ] (3.9) where for j = 0, 1,2 ..... m, pj = l'(a)Ejl(1) = (I + A # ) ' ( B E ~ B ' ) ( I + A#) j=l and

(3.10)

From (3.9), we get the following

L~M~ 1. Let ,,--N,O,, V). Vhen, q = x ' A x + 2 r x + c~XT=~82(~,~/X 9)


+ V where X~ ..... X~ and V are independently distributed, the X~ are non-central Cki-squares and U~N(c(2),4~o). Here, )~f s, f f s, ~' s and c(2) are defined in (3.7) and (3.10).

In the above lemma, we have used the decomposition of V as V= B B ' where f ( B ) = the number of columns of B, and the spectral decomposition of B ' A B given by (3.7). To obtain these quantities directly from the matrices V and A, we observe that Xl ..... )~m are the distinct nonzero eigenvalues of VA (or A V) with respective multiplicties fl,f2 . . . . . fro" Then, we require B E j B ' for calculating uj (j--0, 1,2 ..... m). For this, we can use the following LEMMA 2. Let A be a Hermitian matrix and V = BB* where B is an n r matrix of rank r ( = .~(V)). I f the spectral decomposition of B * A B is m IXj E,j, E 0 = I - Y T = mI E j and ~0=0, then B * A B = j=

BEiB*= and

V ]-[ ( A V - X j I j=0
j~i

(h i -

j~O

(AV-

i)=o

448

C. G. Khatri

The last portion of the above result was established by Khatri (1977b) and by Baldessari (1967) for nonsingular V. Khatri (1977b) has established

j ~a-i

We have mentioned above the importance of ~'s and f ' s . Suppose, it is easy to calculate trA/1 ) for i = 1,2 ..... 2m. Good (1969) has shown that if trAo)=trA~l)=f(Ao)), then A0)=A~I ) (that is, A0) is an idempotent matrix). For m = 2, Khatri (1977b) has stated that for X14=X2@0, trA i _ i +X~f2 O)--?tlfl for i = 1,2,3,4 and f(Ao) ) = f l + f 2 ,

iff ~1 and ~k2 a r e the distinct nonzero eigenvalues of A0) with multiplicties fl and f2- It appears that the condition ~(A0))=f 1+f2 should be replaced by the condition f [ A 0 ) ( A o ) - (?t1+ ?,2)I)] = f l +f2; otherwise the result may not hold. Now, let us consider a matrix random variable Q = X A X ' + L1X' + XL"2 + C where A is a given symmetric matrix and X ~ N p , n(Iz, V1, V ). If V 1= B1B ~ and V= BB' where the column vectors s and r of B l and B are linearly independent, then there exists a random matrix Y such that X -~ bt + B 1 YB', Y~N~,r(O, I s, It) and Q = B~ YA(1 )Y'B[ + L(1) Y'B~ + B 1 YL~2 ) + C(l ) where A (1)= B'AB, L(i ) = (L i + IzA)B and C(O = p.4/~' + L 1/z' +/LL~ + C. We observe that for any matrix Z and Z0= ( Z + Z')/2,

tr ZQ = tr( B; ZoB 1YAo) Y') + tr( B~ ZL(1) + B~ Z' L(2)) Y' + tr ZCo)
and if Y = (Yl..... yr) and y' = (y] ..... y'~), then y ~ Ns~(0, I). Further if A(1) B;ZoB 1= A(x), B~(ZLo) + Z'L(2)) = 201 .... , !~) and I' = (l'1..... !~), then tr ZQ = y'A (x)Y+ 21'y + tr ZC(1) and using (3.4), the mgf of Q can be written as [Irs - 2A (x)]- 1/2 etr(ZCo) ) exp(21'(I- 2A (~)) -11). (3.11)

Then, using the spectral decomposition of A0) as given in (3.7), (3.11) can be rewritten as { fi
j=l

[I-2~kjV1Z0[-g/2} etr I ZCo)+


L

~ (-l-2)kjViZo) -l
j=O

+ Z'L(2))'/ (3.12) J

Quadraticforms in normal variables

449

with X0 = 0. If L(~) = L(2 ) and Co) is symmetric, then the mgf of Q (see also Khatri (1963, 1964)) can be written as

etr(Z0C<=>+2 g ZoaoZo)

j=l

{ lZ- 2XjV,Zol
etr(( 1 .- 2~j Z o V~ ) - 1 Z o ~ j/)~j) } (3.13)

: where f~j = Lc1)EjL(I) = ( L I + p~t)(BEjB !)(L 1 ~ IM) t for j = 0 , 1 , 2 , o. , m and C(z) = C(1) - Y7 m ~(f~JXj.). Then, we get the following

LEMMA 3. Let X ~ Np, n( ~' VI' V) and let Q = X A X ' + L 1 X ' + XL~ + C be a symmetric matrix for all permissible values of X. Then, ( L l - L 2 ) V = O and CO) = tl~Al~' + L 1 ~ ' + I~L~ + C is symmetric. Further, Q m 1 Z~=lXj.W:(fj, V1, ~'~,j./ )tj2 ) + 5 ( Y + Y') where WI, . . . , W m and Y are i.d., the Wj are non-central Wisharts and Y ~ N : , p ( C(2), 4~2o, V~) or Nv,p( C(2), V~,4a0). Khatri (1964) has given the above result when L 1 ~ - L 2.

4. Necessary and sufficient conditions for Chi-squaredness and independence


The following is an immediate consequence of L e m m a 3 and it is an important one. Further, one can refer to Khatri (1977b) for the following THEOREM 1. Let X ~ N p , n ( t ~ , V i , V ) and Q = X A X ' + L I X ' + X L ' 2 + C where A is a symmetric matrix. Then Q~Y,]=l)kj Wj.(fj, Vl,f~j/~j2) for distinct nonzero Xj ( j = 1, 2 . . . . . m) iff (i) Xl,)t2 . . . . . A m are distinct nonzero eigenvalues of VA (or A V) with multiplicities fl . . . . . fro, (ii) L 1 V = L2V, (iii) (L 1+ t~A) V = L V A V for some matrix L and (iv) ~: = (L 1 + t L A ) ( B E j B ' ) ( L 1+ t~A)' for j = 1, 2 , . . . , m and /M/~' + L,/~' + /~L~ + C = ]~m:=l(~'~j/~j) = (L~ + t~A ) V( VA V ) - V( L l + l~A )'. Here V = B B ' , f ( B ) = the number of columns of B and the spectral decomposition of B ' A B is B ' A B = Y ~ . = I ) t j E j. We can use L e m m a 2 for calculating B E j B ' , for j = 1,2 ..... m. T h e condition (i) can be written as I I j m 0 ( B ' A B - ~ I ) = 0 and

j=o

~I (B'AB-XjI)~O

for i = 0 , 1,2,..o ,m.

450

C G. Khatri

See also Baldessari (1967). The conditions for central Wishart variates can be given by (i), (ii) and the following conditions: C 0 ) = 0 and (L 1+/~A) V = 0. The special case of m = 1 is useful to see whether Q is distributed as Wishart or not. Hence, the case for m = 1 is given by THEOREM 2. Q ~ 2 t W ( f , Vl,~]/2t 2) iff (i))~ is the nonzero eigenvalue of VA (or A V) repeated f times, (ii) L l V= L2V, (iii) (L 1+ ~A) V= LVA V for some matrix L and (iv) f~ = ( L I + t~A) V( L 1+ tzA)' and I~At~'+ L 1I~' + I~L~+ C = (L~ + ~A) V(L, + ~A)'/~. Q ~ W ( f , V~) iff (i) VA VA V=)tVA V, (ii) L~ V = L 2 V and (iii) (L 1+/~A) V = 0 =/LA/z' + L 1/~' +/~L'1+ C. This Theorem 2 when L 1= L 2 was established by Khatri (1963, 1964) and the comments on the condition (i) were given by various persons like Hogg (1964), Shanbhag (1968, 1970), G o o d (1969), Styan (1970), and Khatri (1978). We mention some equivalent conditions using L e m m a 2. These are 03 V A V ( A V - X I ) = O , (i") tr( VA)/)t = tr( VA)2/~ z = ~( VA V) = f and (i"') t r ( V A / ) g g = f for i = 1,2,3,4. It may be observed that the condition (i) was written by G o o d (1969), the condition (i') was given by Khatri (1964), the condition (i") was given by Shanbhag (1968). Khatri (1978) has given the following TI-IEOR~M 3. Let X ~ Np,n(bt, Vi, V), Q = X A X ' + L ~ X ' + X L ' z + C, (L 1- Lz) V=O, (L 1+ I~A) V = LVA V for some matrix L and I~AtL'+ L 1t~' + I~L~+ C = ( L 1+ tLA)V(L 1+ I~A)'/~. Then, (a) Q ~ W ( ~ ( A ) , V l , ~ / X z) iff A VA =hA or tr( VA /)t)i= ~(A) for i= 1,2; (b) Q ~ W ( ~ ( V ) , V ~ , ~ / 2 t 2) iff VAV=2tV or t r ( V A / ) 9 i = ~ ( V ) for i= 1,2; (e) Q ~ W(~( VA ), V 1, f~/2t z) iff ( VA /~)2 = ( VA /29 or tr( VA /2t) ~= ~( VA ) for i = 1,2 and (d) Q~2tW(~(A V), V,f~/~ 2) iff (A V/)t) 2 =(A V / h ) or tr(A V/)~)i= ~(A V) for i=1,2. The results (a), (c) and (d) are the same if V is nonsingular while they will differ in degrees of freedom if V is singular. Theorems 1, 2 and 3 are valid for complex normal variates by changing Np,,, by CNp,n, W by C W and real symmetric matrices by Hermitian matrices. Hence, there is no need of rewriting these results.

Quadraticforms in normal variables

451

The problem of finding the necessary and sufficient conditions for two quadratic forms to be i.d. is considered by a number of persons, like Craig (1943), Sakamoto (1944, 1949), Matusita (1949), Ogawa (1949), Aitken (1950), Carpenter (1950), Hotelling (1950) and Lancaster (1954). Craig's and Hotelling's proofs were in error, see for example, Ogawa (1949). The following result was given by Khatri (1963, 1964) and, for nonsingular covariance matrix V, by Laha (1956): THEOREM 4. Let x-~Nn(/~, V) and q~==x'Aix + 21~x + ci, i = 1,2, where A 1 and A 2 are symmetric matrices. Then, ql and q2 are independently distributed (i.d.) iff (i) VA I VA2V=O , (ii) VA 2 V(A 11 + 11) = VA 1 V(A2~ + 12) = ~ and (iii) (11 + A x/~)' V(I 2 + A2/~) = 0. If we know that 1~+A~p~=A i Vd~ for some vector d i for i = 1,2, then qj and q2 are i.d. iff V A I V A 2 V = O .
NOTE 1.

We observe that

A 1V A 2 V - -

if f ( V A I ) = f ( A 0 or ~ ( V A 1 V ) = f ( V A 1 ) , then V A 1 V A 2 V = O ~ 0, while (b) if ~(VA2)= f(A2) or ~ ( V A 2 V ) = ~(VA2), then V A I V A z V = O ~ VAIVA2=O. This shows that if either V is nonsingular, or f ( V A 1 V ) = ~( VA l) and f(VA 2 V) = ~(VA2), then VA l VAz V = O---~Al VA2 = O. The independence of two quadratic forms in terms of correlations of higher order was considered by Laha and Lukacs (1960). These results were extended by Khatri (1961) and they are given by

(a)

THEOREM 5. Let x ~ N ~ ( I ~ , V ) and qi=x'Aix+21~x+ct for i=1,2. Let V(12+A2tt0=0. Then, ql and q2 are i.d. iff they are uncorrelated of order (2,2), or of order (2, 1) if A 2 is p.s.d., or of order (1,2) and (2, 1) tfA l is p.s.d. Further, if V(I l + A l/t)= 0 and A 1 and A 2 are p.s.d., then ql and q2 are i.d. iff they are uncorrelated of order (1, 1). I f q z = m ' x + d , then ql and q2 are i.d. iff ql and q2 are uncorrelated of order (2, 2). Theorem 4 was extended for independence of random matrices Qi-X A i X ' + L i X ' + XL; + C i (i-- 1,2) when X~Np.,,(l~, Vl, V) and this result

was given by Khatri (1963, 1964). A similar version is given by

452

C. G. Khatri

THEOREM 6. Let X-~Np, n(I~,V1, V ) and Q i = X A i X ' + L l i X ' + X L ' z i + C i ( i = 1,2) where A 1 and A 2 are symmetric matrices. Then, Qx and Q2 are i.d. iff (i) VA, VA2V=O, (ii) (Lj, + I~A,) VAzV=(L:2 + I~A2)VA 1V=O for j = 1,2 and (iii) the coefficients of the elements of Z 1 and Z 2 from tr(Z~Lo2 ) + Z2L(22)) V(Z{L(,1) + ZIL(z,) )' are zero where LU0 = Lji + IxAi for i,j= 1,2. If Ql and Qz are symmetric, then ( L l i - L z i ) V = O for i = 1 , 2 and hence the condition (iii)becomes (LlI + l~A1)V(L12 + I~A2)'=O. Thus, if Lji + l~Ai = DjiVA i for some matrix Dji and for i , j = 1,2, then Q1 and Q2 are i.d. iff VA 1VA2V=O. We observe that Theorems 4, 5 and 6 are valid for complex normal variates and there is no need of rewriting them. We combine these results and for this, the following result is established by Graybill and Marsaglia (1957) and Khatri (1963, 1964). THEOREM 7. Let X~Np,n(ix , V1, V), Q i = X A i X ', i = 1,2 ..... k, # = T V for some matrix T and Q ---Y. Qi = X A X ' with A -]~i=lAi. - k Then, consider the following statements: (a) Q i ~ W(fi, V,, ~i) for i = 1,2 ..... k, (b) Qi and Qj are i.d. for all pairs i =/=j, i,j = 1,2 ..... k, (c) Q ~ W(f, V 1, f~) and

(d) ~(VA v) = ZI=,~(VA~ V).


Then (i), (c) and (d) imply all conditions and (ii) any two of (a), (b), (c) imply all conditions.

The result (i) of Theorem 7 is Cochran's Theorem and this result was generalised by Styan (1970) and Tan (1975, 1976). A generalization of these results were given by Khatri (1977b) which can be expressed as THEOREM 8. Let x be Nn(i~, V) and # = Vd for some d. Let q l, qz ..... qk and k q be quadratic forms such that x ' A x = q - ~k i=lqi=7i=lxAix. Then, consider the following conditions with distinct nonzero ~1,?~2..... )~m (m > 1): (a) qi~Y~7_,~jXs2(fo, vij) where the X~ 2 are i.d. as non-central Chi-squares (some f j may be zero); (b) qi and qi, are i.d. for all ivai ', i , i ' = 1,2 ..... k; (c) q--Y~.=,~jxf(fj, vj) where the Xi2 are independent non-central Chisquares, (d) ~( lira V) = ~,ki=1~( VAi V) and (e) ~ ( V A V) = ~-"ki= 1~m j= I~(B,Ai B ( I -- ( B ' A B - ;kjI)+(B'AB -- XjI)}) where V= BB', and ~(B) = number of columns of B. Then,

Quadratic forrm' in normal variables

453

(i) (a) and (b)=>all conditions, (ii) (a), (c) and ( d ) ~ a l l conditions, either for m = 2 if the Xj are of' the same sign or for m = 2 and 3 if the X i are of different signs, (iii) (b) and ( c ) ~ a l l conditions and (iv) (c), (d) and ( e ) ~ a l l conditions Hogg and Craig (1958) and Hogg (1963) have given the following THEOREM 9. Let X ~ N p n(Ix, V1, V), i x = D V for some D, Qi=XAi X ' i= 1,2 . . . . . k and Q=Y.ki=IQ i = X A X ' (say). Let us assume that Q ~ XW(f, V~,~) and Qj~Xwj(fj, V~,aj) for j = 1,2 ..... k - 1, and VAk V is p.s.d. Then, Q~, Q2..... Qk are i.d. as XWj(f;, V,, ~2j)for j = 1,2 ..... k with f = y k= lfj and ~2= s~ki=lf~i. NOTE 2. Theorem 9 is valid when the condition VA k V >/0 is replaced by the condition tr(VAk) = ?t~( VAt V). NOTE 3. Theorem 9 holds if we replace the condition VA k V ~ 0 by k--I k--I IO[ >>IEi=~O~l for all X and for some X, IEi=lQil> 0 (see Hogg (1963)). All the results of this section are valid for complex normal variates with proper modifications. The necessary and sufficient conditions for a number of quadratic forms x'Aix ( i = 1 , 2 ..... k) to be distributed as multivariate Chi-square distribution have been established by Khatri (1977c) and they are not mentioned. Untill now, we have assumed that the elements of A, L and C are fixed, Graybill and Milliken (1969) have given sufficient conditions when the elements of A are measurable functions of X. This result can be given by THEOREM 10. Let X~Np.n(t~,VI,I ). Let K 1 and K 2 be rl)<n and ran matrices such that KIK~=O and ( ( K O + ~ ( K 2 ) = n . Let A and B be n n matrices whose elements are Borel functions of K1X'. Then, X A X ' and X B X ' are i.d. non-central Wishart variates if A = K ~ A o K z and B = K ~ B o K 2 for some matrices A o and B o whose elements may be Borel functions of K1X', A 2 = A , B E = B , A B = B A = O and I~AI~' and lxB!L' do not depend on the elements of K1X'. NOTE 4. Taking B = 0, we can obtain the sufficient conditions of Wishartness. Graybill and Milliken (1969) have taken r2--n, Ao---A and Bo=B.

454

C. G. Khatri

Let X ' - ( X I ,' X 2 ') and VI~.~.( VII v~2 V12) V22 > 0.

If the elements of A are Borel functions of Xl, then X2AX/z~ W(m, 1/22V~2V11V12,f~ ) if A2=A, trA=m and f~=(l~2-flth+flX1)A(l~2-flth+ fiX1)' does not depend on X 1 where t~'=(/Z'l,/~;) and fi= V~2V1~1. So Exact distributiou of quadratic forms Let x ~ N n ( # , V ) and that
j~l

q=x'Ax+21'x+c.

We have seen in Theorem 1

q-- ~, 2tjXf.(fj,v,)
iff V(I+AI~)=VAVd for some d and I~'Al~+2|'l~+C=Y.j"~lXjV j. Here, ~kl' ~k2..... ~m are distinct nonzero eigenvalues of VA (or A V) with multiplicities fl,f2 ..... fm respectively. We shall assume in this section that 1, ~ and c satisfy the conditions mentioned above. Thus, we are trying to establish the distribution of a linear function of non-central Chi-square variates. Such a distribution have been established by a number of persons like Robbins (1948), Robbins and Pitman (1949), Gurland (1955, 1957), James Pachares (1955), Ruben (1960, 1962, 1963), Shah and Khatri (1961), Shah (1963) and Kotz, Johnson and Boyd (1967). We shall present the results of Kotz, Johnson and Boyd (1967). The computation aspect is an important one and it is not touched here, because the representation of Kotz et al. (1967) is such that a computer program for this can be developed for the general situation.

Case a
Let us assume that ~ > 0 for all j = 1,2 ..... m. Then, the mgf of q is given by

M(t)=[jH=I(I--2Xjt)-~/2]exp(tj~=I,.Vj/(1--2Xjt))
for all t such that 2hmax t < 1. Let us denote 0--- (1 - 2)kt) -1, 1 - 0 = qs,

(5.1)

flj~-(1--Xj/~t)

(5.2)
and

5=(1-k/hi)

Quadratic forms in normal variables

455

for j = 1,2 ..... m and for X>0. Then, (5.1) can be rewritten as

M(t) = (1 - 22,O-y/2g(O) = (1 -2Xt)-f/2g, (~a)


where

(5.3)

g(o)=[ jH= (I - Bjo) '/2]exp(-j~=l(xjvj/2X)O/(l-BjO) )


(5.4) and g l ( q 0 = a () 11I (1-ale)
LJ=l

-y#z exp ~Xvj./2)t~(l'ajeO)


J \j=l

l( m

(5.5)

with

f= ~fJj=l

and

a()=[j~__~l(X/~.)Y#2]exp(- ~vj/2)Oj=l

(5.6)

To get the distribution in terms of weighted function of Chi-square distributions, we have to expand g(O) and gl(~) in powers of 0 and q,. For this, one can use the following.
LEMMA 5. Let fll,fi2 . . . . . positive integers. Then

tim'

61..... 6m be real numbers and let fl ..... fm be a;Oj,

" ( 1 - ~ 0 ) -y#2 exp

6fl/(1-g.O)
1 '

= ~
j=O

where

\j=l

J ( j + 1)aj+,= ~]
a=O

a~bj_~ and b~= ~ fifli ~+' + ( a + 1) ~


i=l i=l

~i~i a,

ao = 1 for j = 0, 1.... and a = 0, 1,2 .... and


J~] .<< (1 - ~ p ) - j / 2 p _ j e x p ( a p / 0

- ~p)) ,=,f,
m and a=Ei=,la, I. If ai <O foF

for all p in O<p<e-', e==max, IBil, f = all i= 1,2 ..... m and 6=-YT%16j, then

~m

I~l < (1 -

~0)-J/2o

Jexp(6p/(1 + ep)).

The above lemma can be established directly using Cauchy's inequality or one can refer to Kotz et al. (1967). Further, we require the following two lemmas for inversion of Laplace transforms and their proofs can be obtained from Kotz et al. (1967).

456

C. G. Khatri

LEMMA 6. Let {h~(s)) be a sequence of measurable complex valued functions on {sis>O) and let { ck ) be a sequence of complex numbers such that
IckHhk(S)] <<. a exp(bs), k where a and b are real numbers. Define f ( s ) = ~, Ckhk(S ). k k and f exist for all Rez > 0 and f ( z ) = Then, the Laplace transforms k; ~ kCkl~k(Z). for almost all s > O,

LEMMA 7.

Let us define the density function

h(a,x)={V(~+ 1))-~x"exp(-x),

x>Oand l+a>0.

Then, /';(z)=(1 + z) -"-1. If L~(x) is the Laguerre polynomial of degree fi, then the generating function of L~(x), fi = O, 1.... is O l ~ L ~ ( x ) / f i ! = ( 1 - O ) -~ ' e x p ( - - x O / ( 1 - - O ) ) and for [01<1

fl=o ]L~(x)[ < (1 - R ) - ~ - ' R - e e x p ( x R / ( l + R ) )


for 0 < R < 1.

Further, for all z < 1,

fo e x p ( z x ) L ~ ( x ) g ( a , x) dx =
=(-zle(1- z)-~' - ~ r ( ~ + l + / ~ ) / ~ !r(~ + l).
Using Lemmas 5, 6 and 7 in (5.3), Kotz et al. (1967) has given two types of density functions of q = Y~xf(fj.,vj), (~ > O) and they are given by ( - l)Jaj { j ! r ( f / 2 ) / r ( j + f / 2 ) } j=o ( 2 a ) - ' h ( f - 1, q / 2 a ) L ) b- l(q/2X) and

OO

(5.7)

a(~~ a)~(r(f/2)/r(j + f / 2 ) }(2)Q- l ( q / 2 ~ ) J h ( f - 1, q/2)Q,


j=o

(5.8)

Quadratic forms in normal variables

45'1

where

ao= 1, (J+ l ) a j + l
t=l t~l

= Z biaj-1, i=1

(5.9)

2bi= ~ fl:+~- ~ fl/(X,vt/h)(i+ 1)


and a~)= l, 2b~)= ~
t=l

J ( j + Da () = ~ h ()-()
i=0

a/+~+ ~ a/(hvJh,)(i+ l)
t=l

(5.10)

with e, = maxylaj[, e =maxjl Bjl, O < p < , : - ' , O<p~ <ei-'


laj[<(1-eO)-Y/2p-Jexp

and

((m) ) [a()l<~(1--elPl)-f/2plJexp(~j~_l(Vj/~kj)Pl/2(1--Plel) ).
~ ~jvj p/2h(l+ep) ,
(5.11)
j=l

(5.12)

The above results can be computerized. The maximum error terms eN(q) and e~)(q) after N terms can be calculated (see Kotz et al. (1967)) and they are given by

eu(q) < ((N+ I)!F(f/2)/F(N+ 1 + f / 2 ) }


((1 - R)(1 -- ~o)} - z 2 ( o R ) N(PR -- 1)g, (5.13)

and

e~)(q) < { F(f/2)/F(N + 1 + f / 2 ) } ( 1- e~Ol)-y/2 exp(q/2Aol)gtO)


(5.14) where O < e < p - ~ < R < 1, O<pl<e1-1, gl = and
(2~k)

l h ( f - 1,q/2h)exp

(m hjvjp/2(l + ep)+qR/2h(1 + R ))

j 1

(5.15) g})= a()h(f-- l,q/2X)(2~)-lexp(Xj~__ l (Vj/~j)Pl/2(1-- elpO).


(5.16)

458

(2, G. Khatri

In order that (5.13) and (5.14) should be minimum, e and e 1 should be minimum. This will allow us the choice of )t in the two different situations. The best choice of 2, for the density of q given by (5.7) is
~t = (~krnax-1-Xmin)/2, (5.17)

while that for the density of q given by (5.8) is ~t= 2Qtmax 4- Xmin)- l)kmax~kmi n.

(5.18)

The above results are given by Kotz et al. (1967). They have also expanded in terms of non-central Chi-square variates, but for computational purpose they are not very convenient and hence, they are not given, but one can refer to. Kotz et al. (1967).

Case b
Let us assume that ~'1 > " " >Xp > 0 > ) ~ p + l > " " > ) b , . Then, it is easy to see that q ~ q l q2 w h e r e ql = Y T p = l A j x ) (2f j , vj) a n d q2 m j=p+ 1(--~')x2(fa, Vj.). The exact distributions of ql and q2 can be obtained using the results of Case a. If we use the density of (5.8), then the density of q is given by
=

X E alj (o,a~5) o hj,j,( q),


j = O j ' =O

oo

(5.19)

(a() (h w h e r e ,t,~(o) , u , a(Oh 1 p and ~ 2f, a 2 J are based on (61,)h,. .. ,~p,V 1..... Vp,fl . ... . fp) and (82, - X p + 1..... - A m , Vp+l ..... vm,fp+l, .,fro) respectively and they are of structures (5.10). Here, in place of 2~, 8 t and 82 are substituted a n d hj,j,(q) is the density function of q = 61Xlz(r,) - 62x2(r2) = x, (say) (5.20)

+fp+2j, r2=fp+l+-.. with q = f l + f 2 + . . . The density of x is given by

+ f m + 2 j ', 8 1 > 0 and 82>0.

f
0

i m ( X " [ - y ) 2rl

1 'y~r2-1exp(--(x+y)/261--Y/262)dY

for x > 0 (5.21)

and
C ( X - b y ) 2 q - l y 5r2

l e x p ( -- ( x - k y ) / 2 6 1 - - y / 2 6 2 ) d y

for x < O, (5.22)

-x

Quadratic formsin normalvariables


where e - ' =- (260r'/2(262)r#2F(rl/2)F(r2/2).

459

(5.23)

If q and r 2 are even integers, then the density (5.22) and (5.23) of x can be rewritten as

Cj= ~0

xJ

81 "~ ~2 ]
1) e x p ( - x / 2 6 , ) (5.24)

F((r 1+ r 2 ) - j forx>Oandforx<O

~: ' ( I
j=o j

.j[ 26162 "~(r,+r=)-j 1


1) e x p ( x / 2 8 , ) . (5.25)

F((r, + r2) - j -

Gurland (1955) has given the above distribution of x and q. For further results, one can refer Shah (1963). If we have quadratic forms V~= x'Aix (i = 1, 2 . . . . . k), then the distribution of V _E ~ k = ll,-V~ can be obtained using the above technique. For this one can refer to Krishnaiah and Waiker (1973). We shall not consider the exact distribution of

s - E xjw/.j, v, aj).
j=l
The reason is that it involves complicated functions like zonal polynomials, Laguerre polynomials in matrix arguments and some types of extension of zonal polynomials. For details, the readers are requested to study the work of Khatri (1966), (1971), (1975), (1977a), Crowther (1976) and Hayakawa (1966), (1968) and (t973).

6.
6.1

Asymptotic distribution of quadratic forms

Let us consider the situation of Case a in Section 5, namely the 2(f~, Di) with Xi > 0 for all i. Then distribution of q = ~..m i= ~',.X~

~min ~ i=1

X2(fii,Di)~q~max ~

i=1

X/2(f/,Di)

460

C. G. Khatri

Hence, if f = i~ i f/ and v = >2m

e(x2(f,/9)

-~<X/Xmax)

<<.P( q <x) <p(x2(f,v) <<.x/X,).

(6.1)

6.2
Taking q = Z~."_-l~kiXa2(f -,/9i), we find that

Eq=

j=l

~ )tj(fj+vj) and
(6.2)
j=l

V(q)=2 ~ (fj+2vflXf=2v (say).


Let us assume that

(fj + ivj)V
j=l

= o(v)

for i = 2, 3 .....

and v is very large, (This will be true if the ~ are finite given quantities.) Then, taking x=(q-E(q))/~/(2v), the mgf of x is given by
M 1(t) = M(t/~/(2/9)) exp( -- t ( E q ) / ~ / ( 2 v ) )

(6.3)

where

M(t)

is given by (5.1). Hence,

lnMl(t ) = , it 2 + ~
i=3

cit'/v~' '-~,
i

(6.4)

where
Ci = 2 (i- 2)/2
j=l

( fj "i- t/gj))kj / l

~j2(fj + 2t2j)
"=1

(6.5)

for i = 2 , 3 .... and by assumption ci=o(1) for all i > 2 . Hence, (6.3) gives for large v

Ml(t)=exP(t2)[ l +(c3t3/ V/9) +/9


Since f
--00

1(c4/4+

C2I6)
(6.6)

+ c,c4,7 + c;,9) + o(v-2)].


Ha(x) exp(_ i,x

2+tx)dx=tJ((2~r)-Sexp(gt2)), ~ 1

Quadratic forms in normal variables

461

we get the asymptotic density function of x = (q--E(q))/~/(2v) is (2~r)- ~ exp( - x2/2) { 1 + H3(x)c3v-5 + (c4H4(x) -I t- ~c3H6(x))I)
1 1 1 2 ---I

.1_ ( c5H5( x ) _1~C3C4HT( X ) .1_ I c3H9( x) ) ~ --3/2

+o(v-2)} where Hi(x) are Hermite polynomials o f j t h degree.

(6.7)

6.3
Let us assume approximately that

q~2tX2(r, 6),

(---~ = approximately distributed as)

where r, 6 and X will be determined by the first three cumulant relations, namely,

)ti(r+i6) = ~ (fj+ivj)hj j=l =Pi


(say).

fori=l,2,3,

Then, p~-plP3=X26 2 or 2t6=X/(p2-plP3) provided 8 > 0 Hence

and )t>0.

r ) t = p , - V~(P2-fflP3)
This gives

or

X(r+26)=pl~/(p~--p,p3).

X=P2/ ( Pl + X/(Pz-PlP3) },
and
2 2 r = ( p l - P 2 +plp3)/p2.

2 X 6 = ~/(P2--PlP3)/ ,

(6.8)

From these, we can calculate X, r and 8. We observe that r can be fractional, but can be taken as the greatest integer contained in the expression of r in (6.8). Such an approximation can be used, but it requires to find out its validity by computation.

6.4
Let X ~ Np,,( bt, V 1, V) and Q = X A X ' + LIX" + XL~ + C where A is symmetric, L 1V= L 2 V, (L 1+ pA) V= TVA V for some matrix T and pA/,' + L1 #'+ t~L~ + C= T V A V T ' = (L~ + p A ) V ( V A V ) - V ( L 1 + pay. Then, by

462

C G. Khatri

Lemma 4, Q ~ E j ~ ~?~jWj(fj, V1, ~2j) and its mgf is given by M(Z)= fi


j=l

{]I-2)~jV, ZI-M2etr((I-Z)~jZV1)-I)~jZ~2j)},

(6.9)

where Z is any symmetric matrix such that I - 2 A : VIZ has positive eigenvalues. Taking Vl=BlB~,~(VO=~(Bl)=number of columns of Bl-~S (say), Z 1--B~ZB 1 and ~2:= BI~j(I)B~, we can rewrite (6.7) as

M(z 1) ~= e t r
1

where for a = 1,2 .....

C~= 2~- I [ j~__I?~7(fj+ ~A,)) ] / ~.


Observe that

(6.11)

M(Z1) becomes the QI= BI-Q(BI-)', and C2>0. Let us Qz= P'(Q,- C~)P/ ~/wp

mgf of Q1 given by define

Q=B1QIB ~ or
(6.12)

where P is an orthogonal matrix such that

C2=PDwP',

Dw=diag(wpw 2..... w~), wl>>.wz>~...~>w~>0,


(6.13)

and assume that C~ = 0(%) for a--2, 3 ..... Then, the mgf of Q2 is given by

etr( ~=2C~(1)Z~/w~-l),

(6.14)

where Z2=P'Z~P and C,o)=P'C~P/w p a = 2 , 3 ...... Then, for large values of Cz or w~, it can be shown that the elements of Qz are asymptotic normal. Observing

i~l

i=lj~i+l

we find that if Q2=(qij), then q~N(O,2wi/wl) and qij~N(O,(wi+wv) /2w 0 for i~j, i,j= 1,2 ..... s and they are i.d. The better approximation can be obtained in taking higher powers Z z into considerations as done for

Quadratic forms in normal variables

463

p = 1 in Section 6.2. The details are omitted. Further, for the asymptotic distribution of IQI one can refer to Gupta, Chattopadhyay and Krishnaiah (1975).

7.

Characterization of the distributions

7.1 Let Xl,X 2 x n be i.i.d, and let ~ni=lX i and x ' A x be i.d.. If x~ is normally distributed, then from Theorem 4 (Note 2), it is easy to see that (Y.x;)2 and x'Ax are i.d. iff ~ . = l a i j = 0 for i = 1,2 ..... n. The characterization of the distribution of x~ on the basis of the independence of Z x i and x'Ax is given by
.....

THEOREM 11. Let x 1..... xn be i.i.d, such that the variance o f x 1 exists. Let n n = O f o r a l l i = 1,2 . . . . . n. Y~i_lxi and x'Ax be i.d. and Y ,~ i = l a i i_- 0 and Y,j=lasj Then, x 1 b normally distributed. For the proof, one can refer to Lukacs and Laha (1964). A non-normal distribution is characterized by the following theorem: THEOREM i2.
Let x ~ , x 2 , . . . , x n be i.i.d, such that the variance of x I exists. Then, Y~n n 2 are i.d. iff the distribution o f x 1 is F(x)--. t = l X i and ~i=laiixi pe(x-a)+(1-p)~(x + a ) with e ( x ) = 0 if x < 0 while e ( x ) = 1 if x ~ O , and O <<,p< 1. (See Lukacs and Laha (1964).)

In respect of the regression of q on Y~7=~xi, the following theorem has been given by Lukacs and Laha (1964). TIIEOREM 13. (i) L e t Xl,X 2. . . . . x n be i.i.d, such that variance of x 1 exists. Let q ~- Z i j ja i j ,x i x -I- Y~ibixi, B1 ~" zni= 1aii, B2 = ~ j , kajk, B 3 = ~,jbj and n E(q]~i=lXi)-c, a constant. Further let E ( x 0 = ct and V(xl) = 02. Then, (a) x I is normally distributed, if Bl ~=O, B 2 = 0 and B3=0. (b) x I has a G a m m a distribution if Bl v~O, B2vaO, B3=0 and BI(r2 + B2 a2
~-0.

(c) x I has a Poisson type distribution whose characteristic function is e x p ( ~ ( e x p ~ / ( - 1 ) p t ) - 1) + ~ / ( - l)/xt) if Blve0 , B 2 = 0 and B3=/=0. "' n 2 n (Xl) E(Y,i,j = laijxixj/(Y~i= lxi) I]~i= lXi) = constant and ~,j, kajk ~zn(~,,i= 1aii)
iff x 1 has a Gamma distribution.

464

C. G. Khatri

7.2 Let x ~ N ( A i x , o2I) and let q = x ' ~ I k ~ a 2 X 2 ( S ). Then by Theorem 2, T must be idempotent, t r T = s and TAI~=O. Considering some quadratic form of x to be distributed as Chi-square, we want to characterize the distribution of x. In this connection, the following results have been established by Geisser (t973), Ramachandran (1975), Block (1973), (1975) and Ruben (1974), (1976). THEOREM 14. (a) Let x and y be i.i.d, and w = ( x + y ) / ~ / 2 or w = (x-y)/~/2. 7hen w2~x2(1) iff x ~ N ( O , 1). (b) Let x and y be i.d. such that x 2 and y 2 are each distributed as xz(I). Then, w2= (ax + by)2/(a2 + b2)~X2(1)for some a,b 4=0 iff at least one of x and y is N(O, 1). (c) Let x and y be i.i.d, and w2= ( a x + by)2/(a2 + bZ)~x2(l) where [a Iv~ Iblva0. Then, x and y have entire characteristic functions o f order two at most. Further w 2 = ( a x + b y ) 2 / ( a 2 + b 2) and w Z = ( a x - b y ) 2 / ( a 2 + b 2) are each distributed as X2(1) iff x ~ N ( O , 1). THEOREM 15. Let x be a vector of n components such that x = A l~ + e where A is a known matrix of rank r. A s s u m e that the components of e are i.i.d. r.v.'s and the distribution of e 1 (the first component of e) is symmetric. Then x ' ( I - A ( A ' A ) - A ' ) x ~ k x 2 ( n -- r) for ~ > 0 iff e I a N ( O , ~). 7.3 For the characterization of the multivariate normal distribution, the following Theorem is given in the book of Kagan, Linnik and Rao (1973) and also by Khatri (1976).
x n be i.i.d, and let ~,=~n=lxl/n and S = THEOREM 16. Let x1,x 2 ~ 7 = l ( x i - K ) ( x i - i ) ' . I f either (a) i'K and l'Sl are i.d. for every non-null vector l, or (b) E(l'Sl[l'~)= constant for every non-null vector l, (constant may depend on l), then T and S are i.d. and xi~N(/t,~]). But if the structure
.....

on Ti and S is omitted and they are replaced by the statistics w and T with the following properties: (i) I'w and l' Tl are i.d. for every non-null vector 1, (ii) w is normally distributed and (iii) T is distributed as Wishart, then w and T may not be i.d..

xi~N(tt,

In the above theorem, we have seen that if E(S[~)=constant, then Y,), but if we take

Quadratic for~r~ in normal variables

465

where A , f f = ( f l 1..... tip) and ('~kk') are constants and X ' = ( X I . . . . . X p ) , then one can get other non-normal distributions such as Poisson, Negative-Binomial, G a m m a etc. For details, one can refer to Khatri (1972). Z4 In order to determine the structure of some statistics which are distributed as Chi-square, Khatri and Mitra (1977) have given the following theorem in the case of G a u s s - M a r k o f f Models: THEOREM 17. Let y = A l 3 + e be a vector of n components and e.~N(O, o2I). Let ~(A) be r ( < n ) and R 2 ( ~ ) = ( y - A ~ ) ' ( y - AI3). Let [~y be a function of y such that either (a) E { R 2Q3y)/(n - r)} = 0 2 for every q3, o) E ~, ~ being the

parameter space, or (b) R2(13y)/o2~X 2 for every ([t,o)Efa or (c) A ~ y ~ N(A~,A) and E[R=(13y)/o2[Al3y] is constant for every ( ~ , o ) ~ f L Then R 2 ( ~ y ) = y ' ( I - A ( A ' A ) - A ' ) y i.e. ~y satisfies the normal equations A ' A ~ = A'y.

8.

Estimation of fixed effects

In the analysis of variance with mixed effects model, we come across the following problem. Let

y = A O + Bq)+e,
where e and ~b are n x 1 and m 1 independent vectors, e~N(O, oZIm) and ~ N(0, o 2D a), D = dlag()t llm ..... ?t, I m ), Y.q i- lmi m and ai2 = h io 2 , A and B are known n p and n x m matrices 0,o~,o~ ..... % are unknown. The problem is to estimate O. If D x is known, then the minimum variance unbiased linear estimate of O is given by the solution of
=

A'V-1AO=A'V-Iy

with

V=I+BDaB'.

If D x is unknown, then an appropriate estimate of the Xj are used and this will be the ratio of two quadratic forms in u = Ty where T is such that TA=O and f ( T ) = n - f ( A ) . Let V be an estimate of V obtained by substituting the estimate of D x. Then, the estimate 0 of O given by

Oc=(A'l~-~A)-lA'l~-ly
is considered by Khatri and Shah (1977) who showed that if h'O is an estimable parametric function, then h'Oc is an unbiased estimate of h'0.

466

C. G. Khatri

For m = l, Khatri and Shah (1974, 1975) and Brown and Cohen (1974) have studied the problem in some detail. For example, in the simplified situation, it can be described in the following way: Let
xi~'--N ( Izi, aio2), yi~N( ~i, bio~), i = 1 , 2 . . . . . t,

sj2~oj2x2(mj) , j = 1,2,
and let them be i.d. Then, the estimate of/~ is
~L i (1 -- Wi)( X i -- Y i ) + Yi

where
Wi"~" wi( bis2/
2 2 2 j = 1,2 ..... t) aisl,bi( xj-- yfl 2/ a,si,

and in particular, wi is taken as

cilais I -Ij=l

where the cij are constants. Khatri and Shah (1974, 1975) have given the exact variance of/~ by using the exact distribution of the quadratic forms given in (5.A) and they compared with the various estimates by choosing different values of cij. For details, one can refer to Khatri and Shah (1975). The other applications of quadratic forms are omitted and one can refer to the Chapters on M I N Q U E , Time Series Analysis, and analysis of design of experiments.

References
Aitken, A. C. (1950). On the statistical independence of quadratic forms in normal variates. Biometrika 37, 93-96. Anderson, T. W. (1958). An Introduction to Multivariate Statistical Analysis. Wiley, New York. Baldessari, B. (1967). The distribution of a quadratic form of nornlal random variables. Ann. Math. Statist. 38, 1700-1704. Bhat, B. R. (1962). On the distribution of certain quadratic forms in normal variates. J. Roy. Statist. Soc. Set. B 24, 148-151. Block, H. W. (1973). A characterization concerning random variables whose squares are Chi-square. lnst. Math. Statist. Bull. 2, 34. Block, H. W. (1975). Characterizations concerning random variables whose absolute powers have specified distributions. Sankhy& Ser. A 37, 405-415. Brillinger, D. R. (1974). Times Series: Data Analysis and Theory. Holt, Rinehart and Winston, New York.

Quadratic forms in normal variables

46"7

Brown, L. D. and Cohen, A. (1974). Point and confidence estimation of a common mean and recovery of Interblock information. Ann. Statist. 2, 963-976. Carpenter, O. (1950). Note on the extension of Craig's theorem to non-central variates. Ann. Math. Statist. 21, 455-457. Cochran, W. G. (t934). The distribution of quadratic forms in a normal system, with applications to the analysis of covariance. Proc. Cambridge Philosoph. Soc. 30, 178-191. Craing, A. T. (1943). Note on the independence of certain quadratic forms. Ann. Math. Statist. 14, 195-197. Crowther, C. G. (1975). ~Fne exact non-central distribution of a quadratic form in normal vectors. South African Statist. J. 9, 27-36. Geisser, S. (1973). Normal characterizations via the squares of random variables. Sankhyd Ser. A 35, 492-494. Good, I. J. (1963). On the independence of quadratic expressions (with an appendix by L. R. Welch). J. Roy. Statist. Soc. Set. B 25, 377-382. Corrigenda (1966). J. Roy. Statist. Soc. Ser. B 28, 584. Good, I. J. (1969). Conditions for a quadratic form to have a Chi-squared distribution. Biometrika 56, 215-216, Corregenda (1970). Biometrika 57, 225. Goodman, N. R. (1963). Statistical analysis based on a certain multivariate complex G a ~ s i a n distribution (an introduction). Ann. Math. Statist. 34, 152-176. Grad, A. and Solomon, H. (1955). Distribution of quadratic forms and some applications. Ann. Math. Statist. 26, 464-477. Graybill, F. A. and Marsaglia, G. (1957). Idempotent matrices and quadratic forms in general linear hypothesis. Ann. Math. Statist. 28, 678-686. Graybill, F. A. and Milliken G. (1969). Quadratic forms and idempotent matrices with random elements. Ann. Math. Statist. 40, 1430-1438. Gupta, A. K., Chattopadhyay, A. K. and Krishuaiah, P. R. (1975). Asymptotic distributions of the determinants of some random matrices. Comm. Statist. 4, 33-47. Gurland, J. (1955). Distribution of definite and indefinite quadratic forms. Ann. Math. Statist. 26, 122-127. Corrections (1962). Ann. Math. Statist. 33, 813. Gurland, J. (1957). Quadratic forms in normally distributed variables. Sankhy~ 17, 37-50. Hannah, E. J. (1970). Multiple Time Series. Wiley, New York. Hayakawa, T. (1966). On the distribution of a quadratic form in a multivariate normal sample. Ann. Inst. Statist. Math. 18, 191-201. Hayakawa, T. (1969). On the distribution of the latent roots of positive definite random symmetric matrix. I. Ann. Inst. Statist. Math. 21, 1-21. Hayakawa, T. (1973). On the distribution of the multivariate quadratic form in multivariate normal samples. Ann. lnst. Statist. Math. 25, 205-230. Hogg, R. V. (1963). On the independence of certain Wishart variables. Ann. Math. Statist. 34, 935-939. Hogg, R. V. and Craig, A. T. (1958). On the decomposition of certain Chi-square variables. Ann. Math. Statist. 29, 608-610. Hotel.ling, H. (1944). Note on a matrix theorem of A. T. Craig. Ann. Math. Statist. 15, 427 -429. Kagan, A. M., Linnik, Yn. V. and Rao, C. R. (1973). Characterization Problems in Mathematical Statistics. Wiley, New York. Kawada, Y. (1950). Independence of quadratic forms in normally correlated variables. Ann. Math. Statist. 21, 614-615. Khatri, C. G. (1959). On conditions for the forms of the type X A X ' to be distributed independently or to obey Wishart distribution. Bull. Calcutta Statist. Assoc. 8, 162-168. Khatri, C. G. (1961). Cnmulants and higher order correlation of certain functions of normal variates. Bull. Calcutta Statist. Assoc. 10, 93-98.

468

C. G. Khatri

K_hatri, C. G. (1962). Conditions for Wishartness and independence of second degree polynomials in normal vectors. Ann. Math. Statist. 33, 1002-1007. Khatri, C. G. (1963), Further contributions to Wishartness and independence of second degree polynomials in normal vectors. J. lndian Statist. Assoc 1, 61-70. Khatri, C. G. (1966). On certain distribution problems based on positive definite quadratic forms in normal vectors. Ann. Math. Statist. 37, 468-479. Khatri, C. G. (1968). Some results for the singular normal multivariate regression models. SankhyK Ser. A 30, 267-280 Khatri, C. G. (1971a). Mathematics of Matrices. Gujarat University, Ahmedabad (in Gujarati). Khatri, C. G. (1971b). Series representations of distributions of quadratic form in the normal vectors and generalised variance. J. Multivariate Anal. 1, 199-214. Khatri, C. G. (1972). On characterization of the distribution from the regression of sample covariance matrix on the sample mean vector. Sankhyd Ser. A 34, 235-242. Khatri, C. G. (1975). Distribution of a quadratic form in normal vectors (multivariate non-central case). In: G. P. Patil, S. Kotz and J. K. Ord eds., Statistical Distributions in Scientific Work, Vol. 1, D. Reidel, Dordrecht, Netherlands, 345-354. Khatri, C. G. (1976). A note on the independence of the mean vector and the covariance matrix. Gujarat Statist. Rev. 3, 21-23. Khatri, C. G. (1977a). Distribution Of a quadratic form in non-central normal vectors using generalized Laguerre polynomials. South African Statist. J. 11, 167-179. Khatri, C. G. (1977b). Quadratic forms and extension of Cochran's Theorem to normal vector variables. In: P. R. Krishnaiah ed., Multivariate Analysis, Vol IV, 79-94. Khatri, C. G. (1977c). The necessary and sufficient conditions for dependent quadratic forms to be distributed as multivariate Chi-square. (Unpublished.) Khatri, C. G. (1978). A remark on the necessary and sufficient conditions of a quadratic form to be distributed as Chi-square. Biometrika 65, 239-40. Khatri, C. G., Krishnaiah, P. R. and Sen, P. K. (1977). A note on the joint distribution of correlated quadratic forms. J. Statist. Planning and Inference 1, 299-307. Khatri, C. G. and Mitra, S. K. (1977). Some converses of the Gauss-Markov theorem. Gujarat Statist. Rev. 4, 51-59. Khatri, C. G. and Shah, K. R. (1974). Estimation of location parameters from two linear models under normality. Comm. Statist. 9, 647-663. Khatri, C. G. and Shah, K. R. (1975). Exact variance of combined inter- and intra-block estimates in incomplete block designs. J. Amer. Statist. Assoc. 70, 402-406. Khatri, C. G. and Shah, K. R. (1977). On the estimation of fixed effects in a mixed model. Proceedings of the 41st session Internat. Statist. Inst., (contributed papers) 284-87. Krishnaiah, P. R. (1977). On the generalized gamma type distributions and their applications in reliability. In: C. P. Tsokos and I, N. Shinii, eds., The Theory and Applications of Reliability, Vol. 1,475-494. Krislmaiah, P. R. and Sen, P. K. (1970). Some asymptotic simultaneous tests for multivariate moving average processes. Sankhyd Ser, A 32, 81-90. Krishnaiah, P. R. and Waiker, V. B. (1973). On the distribution of a linear combination of correlated quadratic forms. Comm. Statist. 1, 371-380. Kotz, S., Johnson, N. L. and Boyd, D. W. (1967). Series representations of distributions of quadratic forms in normal variables, I: Central case, and II: non-central case, Ann. Math. SRttist. 38, 283-837 and 838-848. Laha, R. G. (1956). 
On the stochastic independence of two second degree polynomial statistics in normally distributed variates. Ann. Math. Statist. 27, 790-796. Laha, R. G. and Lukacs, E. (1960). On certain functions of normal variates which are uncorrelated of a higher order. Biometrika 47, 175-176.

Quadratic forms in normal variables'

469

Lancaster, H. O. (1954). Traces and cumulants of quadratic forms in normal variables. J. Roy. Statist. Soc. Ser. B 16, 247-254. Liggett, Jr., W. S. (1972). Passive sonar: fitting models to multiple time series. Proceedings of the NATO Advanced Study Institute on Signal Processing. Academic press, New York. Lukacs, E. and Laha, R. G. (1964). Applications of Characteristic Functions. Charles Griffin, London. Matusita, K. (1949). Note on the independence of certain statistics. Ann. Inst. Statist. Math. ~, 79-82. Ogawa, J. (1949). On the independence of bilinear and quadratic forms of a random sample from a normal population. Ann. Inst. Statist. Math. 1, 83-108. Pachares, J. (1955). Note on the distribution of a definite quadratic form. Ann. Math. Statist. 26, 728-731. Ramachandran, B. (1975). On a conjecture of Geisser's. Sankhy~ Ser. A 37, 423-427. Rao, C. R. (1973). Linear Statistical Inference and Its Applications (second edition). Wiley, New York. Robbins, H. (1948). The distribution of a definite quadratic form. Ann. Math. Statist. 19, 266-270. Robbins, H. and Pitman, E. J. G. (1949). Application of the method of mixtures of quadratic forms in normal variables. Ann. Math. Statist. 20, 552-560. Ruben, H. (1960). Probability content of regions under spherical normal distributions. I. Ann. Math. Statist. 31, 598-618. Ruben, H. (1962). Probability content of regions under spherical normal distributions IV: The distribution of homogeneous and non-homogeneous quadratic forms of normal variables. Ann. Math. Statist. 33, 542-570. Ruben, H. (1963). A new result on the distribution of the quadratic forms. Ann. Math. Statistics. 34, 1582-1584. Ruben, H. (1974). A new characterization of the normal distribution through the sample variance. Sankhy~ Ser. A 36, 379-388. Ruben, H. (1976). A characterization of normality through the general linear model. Sankhy~ Set. A 38, 186-189. Sakamoto, H. (1944). On the independence of Statistics (in Japanese). Res. Memories lnst. Statist. Math. 1, 1-25. Shah, B. K. (1963). Distribution of definite and indefinite quadratic forms from a non-central normal distribution. Ann. Math. Statist. 34, 186-190. Shah, B. K. (1970). Distribution theory of positive definite quadratic form with matrix argument. Ann. Math. Statist. 41, 692-697. Shah, B. K. and Khatri, C. G. (1961). Distribution of a definite quadratic form for non-central normal variates. Ann. Math. Statist. 32, 883-887. Corrections (1963). Ann. Math. Statist. 34, 673. Shanbhag, D. N. (1968). Some remarks concerning Khatri's result on quadratic forms. Biometrika 55, 593-595. Shanbhag, D. N. (1970). On the distribution of a quadratic form. Biometrika 57, 222-223. Styan, G. P. H. (1970). Notes on the distribution of quadratic forms in singular normal variables. Biometrika 57, 567-572. Tan, W. Y. (1975). Some matrix results and extensions of Cochran's theorem. S l A M J. Appl. Math. 28 (3) 547-554. Corrections (1976). S l A M J. Appl. Math. 30, 608-610.

P. K. Krishnaiah, ed., Handbook of Statistics, Vol. i North-Holland Publishing Company (1980) 471-512

1q
1[ J

Generalized Inverse of Matrices and Applications to Linear Models


Sujit Kumar Mitra

PART 1:
1

GENERALIZED INVERSE OF M A T R I C E S

Introduction

E. H. Moore in 1920 introduced an unique inverse (called general reciprocal) for every finite complex matrix, square or rectangular. Though concepts of generalized inverse of integral and differential operators were known by that time, this appears to be the first published record of a generalized inverse of a finite matrix. There has been a fresh revival of interest in this area since Bjerhammer in 1951 observed the wide diversity of generalized inverses, the role that this concept plays in solutions of linear equations and also the least squares property possessed by some members of this class. In 1955 Rao constructed an inverse of a singular matrix that occurs in normal equations of least squares theory, named it a pseudoinverse and described its use in solving normal equations and in computing standard errors of least squares estimators. Bose in his lecture notes on Analysis of variance introduces the conditional inverse and mentions similar properties. Penrose in 1955 arrived at the Moore inverse through a different approach and indicated its use in obtaining best approximate solutions of (possibly) inconsistent systems of linear equations. This unique inverse on account of its many interesting properties has been extensively studied and applied and is known in contemporary literature as the Moore Penrose inverse. In 1962, Rao showed that an inverse satisfying only one condition of Penrose (which he called g-inverse, generalized inverse) is sufficient for obtaining solutions of matrix equations and answering most of the problems of the least squares theory. In a later paper in 1967, Rao studied further properties of g-inverse, gave an explicit expression for the projec471

472

Sujit Kumar Mitra

tion operator in terms of a g-inverse and defined several classes of g-inverses useful in optimization theory. This work was later pursued by the author in 1968 who introduced some new classes of g-inverses, for the first time gave an explicit representation of the Moore Penrose inverse and demonstrated the usefulness of the g-inverse in solving certain nonlinear matrix equations connected with the distribution of quadratic forms. It is interesting to note that m a n y developments in this area arose out of statistical considerations and a substantial n u m b e r from the study of linear models. It seems therefore most appropriate that this volume should contain a chapter devoted to generalized inverse of matrices. A nonsingular matrix has a unique inverse to suit all purposes. A singular matrix on the other hand has different inverses to satisfy different needs. A forced application of an inverse to a situation for which it was not originally meant m a y have disastrous consequences. In this chapter we shall m a k e no attempts to be exhaustive. On the contrary we shall confine our attention only to those inverses which have been found useful in the study of linear models. F o r example group inverses which play a useful role in the study of finite Markov chains will not be considered in this chapter. Emphasis is placed more on the presentation of the various g-inverses, a n d on description of their uses rather than on formal mathematics. Only those theorems are stated that aid in our understanding of these inverses. Proofs are invariably omitted as they are available in standard texts [ 1, 8, 11, 44, 57].

2.

Generalized inverse of a matrix

Let A be a real matrix of order m n and x , y denote column vectors in R n and R m respectively. DEFINITION 2.1. A matrix G is said to be a generalized inverse of A, if x - - Gy gives a solution of the equation A x = y , whenever this equation is consistent that is, if AGy = y Vy ~9IL(A), the column span of A.

This condition is easily seen to be equivalent to the condition AGA = A (2.1)

Accordingly every matrix G satisfying (2.1) will be called a generalized inverse (or briefly a g-inverse) of A and denoted by the symbol A - . Other sets of equivalent conditions are as follows: (a) (A G)2 = A G, RankA G = Rank A (2.2)

Generalized inverse of matrices and applications to linear models


or

473

(b)

(GA)2= GA,

RankGA =RankA

(2.3)

THEOREM 2.1. L e t A be a real matrix o f order m n a n d A - be any g-inverse o f A . Put H = A - A . Then the following hold: (a) A general solution o f the homogeneous equation A x = 0 is x = ( I - H ) z where z is an arbitrary column vector in R ~. (b) A general solution o f the consistent nonhomogeneous equation A x = y is.
x=A-y+(I-H)z, where z E R n and is arbitrary otherwise. (c) Q ' x has a unique value f o r all solutions o f A x = y i f a n d only i f H'Q=Q or

(2.4)

)]L(Q)caJL(A').

(2.5)

(d) A necessary and sufficient condition that A x = y is consistent is


A A --y = y .

For a proof of this theorem tile reader is referred to [57]. Let A, B and C be real matrices of order m n, p q and m q respectively. The following theorem describes the use of g-inverses in solving an important system of linear matrix equations. THEOREM 2.2 (a)
Ois
X = Z -

The general solution to the homogeneous equation A X B =

A -A ZBB -,

(2.6)

where Z is an arbitrary real m a t r i x o f order n p. (b) A necessary and sufficient condition f o r the equation A X B = C to have a solution is that A A - C B - B = C, in which case a particular solution is X = A - C B X = A -CB -

(2.7)
and the general solution is

+ Z - A -AZBB -,

(2.8)

where Z is" an arbitrary real m a t r i x o f order n p.

Consider a real matrix A of order m n and rank r and let the columns of a matrix L of order m r form a basis of 6J)]L(A). L could as well be

474

Sujit Kumar Mitra

formed by picking r linearly independent columns of A. We express A as


A = LR

and observe that R of order r n is also of rank r. It is easily seen that

R'(RR')- ' ( L ' L )

- 'L'

is one choice of A - . In terms of any such particular solution A - , using Theorem 2.2 we obtain the general solution to a g-inverse of A , as
G = A - + Z - A -A ZAA -,

(2.9)
equation A X B = 0 (2.10)

where Z is an arbitrary real matrix of order n m. Note 1. A general solution to the homogeneous could alternatively be represented as

V(I--BB--)

in terms of two arbitrary matrices U and V of order n p each, leading to corresponding changes in expressions (2.8) and (2.9). N o t e 2. A general solution to the consistent nonhomogeneous equation A x - - y , could, in addition to (2.4) be alternatively expressed as
x = A -y,

where A - is an arbitrary g-inverse of A as determined in (2.9). The following theorem gives an interesting result concerning the invariance of certain expressions with respect to choice of a g-inverse. THEOREM 2.3. L e t A , B and C be real matrices" o f order m n, m q and p n respectively and B, C be non-null. The expression C A - B is invariant under choice of A - i f and only i f
31L(B)c~L(A) and ~(C')cSflL(A').

Using Theorem 2.3 it is seen that the matrix


P=A(A'A)-A' A(A'A)-

(2.11)

is unique with respect to choice of g-inverse. Hence since ( A ' A ) - A ' and ( A ' A ) - A ' A [ ( A ' A ) - ] ' are also choices of g-inverse of A ' A
p=A(A'A)-A'A(A'A)-A'=p 2

=A(A'A)-A'A[(A'A)- I'A'
Thus P is idempotent and symmetric (in fact n.n.d.), which shows that P is an orthogonal projector under the innerproduct ( u , v ) = v ' u . R a n k ( A ) =

Generalized inverse of matrices and applications to linear models

475

Rank(A'A)~=>~L(A')=gZ(A'A)c=>A'=A'AU for some U ~ P A = A(A'A)-A'A = U'A'A(A'A)-A'A = U ' A ' A = A ~ C , ~ L ( P ) = 9]L(A). Hence P=A(A'A)-A' is an explicit representation of the orthogonal projector onto the column span of A. We conclude this section with an example illustrating the numerical computation of a g-inverse of a matrix A. Numerical computation o f A We postfix an unit matrix to A as shown below and carry out pivotal condensation on the rows of A by the sweep out method.
]'able 2.1 Showing the numerical computation of a g-inverse by the method of sweep
out

A 2 4 5 6 2 2 2 0
1

I 4 6 7 6
2
l ~

Row no. (1)

Row operation

1 1 1 1

(2) (3) (4)


(5)=(1)/2

-2 -3

-2 -3

-2 -'

1
1

(6)=(2)-4x(5)
(7) = (3)5 (5)

-6

-6
1

-3
- ~
1

1
2
1 1

(8)=(4)-6x(5)
(9) = (5) - ( 1 0 )

1
,

-5 -3

(10)=(6)/-2
1 (11)=(7)+3x(10)

(12)=(8)+6x(10)

The matrix under A in the third block is already in the ttermite canonical form. (Recall that a square matrix is in the Hermite canonical form if its principal diagonal elements are either 0 or 1 and all subdiagonal elements are 0 such that if the diagonal element is 0 the entire row consists of 0's and if-the diagonal element is 1 the rest of the elements in the same column are 0.) For this purpose we may choose either rows (9), (10) and (11) or rows (9), (10) and (12). In either case the corresponding rows under I gives one choice of A - and those under A gives A -A. We have thus -L A-=
1

l
_L 2 . 1
A -A ~--

(1 ,)
1 1

-3

4'76

Sujit Kumar Mitra

It is seen that A is of rank 2 and a g-inverse of the same rank can be obtained by replacing the last row which has a zero pivot in the Hermite canonical form by a null row.

30

Reflexive generalized inverse A matrix G is said to be a reflexive g-inverse of A if A~(G-}.

DEFINITION 3.1.

G~{A-},

(3.1)

A reflexive g-inverse of A is denoted by the symbol A Z and the entire class by {A Z }. The following theorem is due to Bjerhammer [10]. THEOREM 3.1. Condition (3.2) is equivalent to (3.!). Rank G = RankA. (3.2)

G e {A - },

A method of computing A~- was described in the concluding lines of the previous section. A general solution to A~- is G = A - A A - where A - is an arbitrary g-inverse of A.

4.

Minimum seminorm g-inverse

DEFINITION 4.1. A matrix G is said to be a minimum seminorm g-inverse of A if for any y such that the equation A x = y is consistent x = Gy is a solution with the least seminorm. In particular, if the seminorm is defined by [Ix[[n (x'Nx) 1/2 where N is n.n.d., the g-inverse is represented by the symbol A,,~(sv ) and {A,,~(N)) represents the class of all such g-inverses 1. We have the following theorem:
=

THEOREM 4.1.

A matrix G is Am(iV) if and only if and (GA)'N=NGA. (4.1)

AGA=A

If GO is a particular solution of (4.1), a general solution is given by

G = 6o + W ( I - A 6o) + ( I - GA0) V,
1The subscript (N) in A,~(t)is usually suppressed when N = 1.

(4.2)

Generalized inverse of matrices and appfications to linear models where W is arbitrary and V is an arbitrary solution of N ( I - GoA ) V=0. The matrix ( N + A ' A ) - A ' I A ( N + A ' A ) - A' ] is one choice of G0. When ~ C ( A ' ) C cAlL(N), G o can be taken to be N-A'[AN A']-.

477

(4.3)

(4.4)

(4.5)

For a proof of this theorem the reader is referred to [57]. Remark: Though with an arbitrarily computed N - the formula N - A ' [ A N - A ' ] may not necessarily provide A,T(N ) unless 9L(A')ccA1L(N), a choice of N - which will always work is given by

(X+ W)
where W is n.n.d, and is such that GYL(N) and UfL(W) are virtually disjoint and 9IL(A') c ~(~(N+ W) = ~YfC(N: W). A numerical example illustrating the computation of such a g-inverse is given in Section 11.

5.

Semileast squares inverse

For the equation A x = y (possibly inconsistent) and a given n.n.d, matrix M, 2 is a M semileast squares solution of A x = y if Vx

y) < ( A x - y ) M ( A x - y).
DEFINITION 5.1. A matrix G is said to be a M semileast squares inverse of A if Vy, x = Gy is M-semileast squares solution of the equation A x =y. A M-semileast squares inverse of A is denoted 2 by the symbol At(M) and the class of such inverses by (At(M)) . We have the following theorem THEOREM 5.1. A matrix G is Al(M) if and only if

A'MAG=A'M 2The subscript (M) in Al~-M)is usually suppressed when M ~ I.

(5.1)

478

Sujit Kumar Mitra

or equivalently MAGA = MA, (AG)'M= MAG

(5.2)

I f G O is a particular solution of (5.1) a general solution is given by Go+ [ I - ( A ' M A ) - A ' M A ]U

(5.3)

where U is arbitrary. The matrix (A'~tvlA) A ' M is one choice of G o.

For a proof of this theorem the reader is referred to [57]. It is seen that At(M) is not necessarily a g-inverse of A unless R a n k ( M A ) = RankA, which is satisfied if M is positive definite. However {AI(M~ ) N (A - ) is nonempty. A matrix G in this intersection is denoted by AI~-M~.The following result is proved in [39]. THEOREM 5.2. AI(M~ exists. I f G o is one choice of Al(M) a general solution is
given by Go+ [ I - ( A ' M A ) - A ' M A ] U,

(5.4)

where U is a general solution of the equation [ A - A ( A ' M A ) - A ' M A ] ~ A = O. The matrix A - + (A'MA)-A'M(I is one choice of G o. The duality theorem - AA -) (5.5)

(5.6)

Rao and Mitra [5"7] and Sibuya [63] established the following duality relationship between minimum norm and least squares inverses and indicated the key role it plays in the Gauss-Markov theory of linear estimation. THEOREM 5.3.
I f M and A are p.d. and M A = I, then

{ (A

= { (A,-

9')"

(5.7)

Various ramifications of this result when M and A are only positive semidefinite are discussed in [39]. For our purpose we shall be interested only in the following result.

Generalizedinverse of matrices and applications to linear models THEOREM 5.4. Then

479

Let M, A be positive semidefinite matrices of the same order.

( ( A , - ( , ) ) ' } C ( (A')~ (A)) if and only if one of the following conditions is true. (i) (ii) 0]L (A) C 0iL(A). Rank(A ' M A ) = RankA, A'MAQ=O.

(5.8)

(5.9) (5.10) (5.11)

where Q is such that ~qL(Q)= ~)L(A'), the null space of A'. f o r a given M if (5.10) is true a general n.n.d, solution A of (5.11) is given by A = A0+ ( I - H ) ' A I ( I - H ) , where A o and A l are arbitrary n.n.d, matrices such that ~ ( A 0 ) C 9iL(A) and H= MA(A'MA)-A'. For a given A if (5.9) holds, (5.8) is true for arbitrary n.n.d, matrices M. If (5.9) is untrue a general n.n.d, solution M of (5.10) and (5.11) is given by M = E - I A Q Q ' A U, A Q Q ' A + A A ' U2AA' ]( E - )' +(I- E-E)U3(IE - E )' (5.12)

where E = A Q Q ' A + A A ' , E - is an arbitrary g-inverse of E, U 1 and U3 are arbitrary n.n.d, matrices and U2 is arbitrary p.d. Choosing for UI a n.n.d. g-inverse of A Q Q ' A , for U2 a p.d. g-inverse of A A ' and putting U3 = 0, it is seen that E - is a valid choice of M. Projection operators under seminorms In Section 2 we have given an explicit representation of the orthogonal projector onto the column span of a matrix A. The same can also be expressed as P = A G where G ~ ( A Z } which is obviously unique with respect to choice of G in this class. When the inner product is induced by a p.d. matrix M, the unique orthogonal projector can again be obtained as A G where G ~ (Ate-M)}. We give below the definition of a projector applicable for the situation where M is n.n.d. This allows projections to be studied under a wider generality. Let A be a real matrix of order m n and M be real n.n.d, of order m m.

480

Sujit Kumar Mitra

DEFINITION 5.2. A matrix PA(M) is said to be aprojector into ~ ( A ) respect to seminorm defined by
IlYllM=Cy'My) 1/2

with

if V x E R n, y E R "

ilY - PAYIIM < [[Y --AxIIM.

(5.13)

Comparing this with Definition 5.1 it is seen that PA(M) is a projector into 9L(A) with respect to seminorm induced by M iff P A ( M ) = A G for some G E {At(M)). We denote PA(M) simply by PA when the seminorm is understood with reference to context. The following properties of such projectors are a consequence of Definition 5.2 THEOREM 5.5.
The matrix P o f order m m is a projector onto GJiIC(p) iff

(a) P'MP= MP, or equivalently (a') (Me)'= Me,


M e 2= M P .

(5.14)

(5.15)

THEOREM 5.6. (i) (ii) (iii)

P is a projector into ~IL(A) iff

~'YlL(P) C ~L(A),
P'MP = Me, MPA = MA.

(5.16) (5.17) (5.18)

Note: Here (iii) could be replaced by (iii)' Rank M P = Rank M A . (5.19)

THEOREM 5.7. For i = 1,2 let Pi be a projector into ~L(Pi). Then (a) P1 -t- P2 is a projector iff M P 1 P 2 = M P 2 P 1= O, (b) P1 - P2 is a projector iff M P 1 P 2 = M P 2 P 1 = M P 2, (c) PIP2 is a projector iff M P 1 P 2 = M P z P v
THEOREM 5.8.
are

I f P and P are two choices of a projector into cY%(A), then so and

Pff

~,P+ (1 - ~ ) / v

(5.20)

Generalizedinverseof matrices and applications to finearmodels

481

for any real number ~. Further

Mp2= MP2= MP~= MPP= M e = Mb-.

(5.21)

Proofs of all these propositions on projectors under seminorms are given in [38].

6.

Minimum seminorm semileast square inverse

DEFINITION 6.1. A matrix G is said to be a minimum N seminorm M semileast squares inverse of A if Vy, x = Gy is a M semileast squares solution of the equation A x = y (possibly inconsistent) and further has the least N seminorm in this class. A minimum N seminorm M semileast squares inverse of A is denoted by the symbol AMN and the class by (AMN). We have here the following theorem. THEOREM 6.1.
A matrix G & AMN ~f and only if
(A O ) ' M = M A C,

(a')
and

M A OA = M A ,

(6.1)

N G A G = NG,

( G A ) ' N = NGA,

(b')

Uf~(NGA) c ~ ( A ' M A ) .

(6.3)

For a proof of this theorem the reader is referred to [38]. In terms of projectors under seminorms introduced in Section 5 the above conditions could be equivalently stated as (a") (b')
AGE{Pa), GAE(P~)

(6.4) (6.5)

9 L ( N G A ) C_2iIL(A' M A )

Some other properties of a minimum N seminorm M semileast squares inverse are stated in the following theorem also proved in [38]. THEOREM 6.2. The following statements are true (i) NGI = N G 2 if G 1 and G 2 are two choices Of AMN. (ii) AMN=AMNo if No= N + A ' M A . (iii) Go= N o A ' M A ( A ' M A N o A ' M A ) - A ' M is one choice Of AMN. (iv) G = Go + ( I - N o N o ) U where U is arbitrary is a general solution to AMN.

482

Sujit Kurnar Mitra

(v) AMN is' unique if and only if N O= N + A'MA is positive definite. (vi) If G ~ AMN, then ~f~[N { I - (A 'MA)-A'MA )] = 9]L[N(I- GA)]. AMu is not necessarily a g-inverse of A. Further the set (AMN) f) (A - } is nonempty if and only if

glL(N) A 9]L(A ') C 6"~(A 'MA).

(6.6)

It is interesting to observe that when this condition is satisfied tile inverse AM~ v is defined just by condition (a') of Theorem 6.1 or equivalently by + condition (a"). We use the notation AMN to denote a matrix in the intersection {AMu } A {A~-} when one exists. The following result which holds when M and N are p.d. is quite interesting.
THEOREM 6.3 + ' --- (M ' )N + -Ira (AMN)
1"

7.

Optimal

inverse

If the intention is to make both A x - y and/ x smallx in some sense an alternative approach would be to consider [ A x - y ] a vector in the
\ X ]

product space and minimize a suitable norm (or seminorm) of this vector not necessarily a product norm (or seminorm). If the seminorm is induced by the n.n.d, matrix A an optimal approximate solution (OAS) of Ax =y in this sense would require projection under this seminorm of the vector ( 0 ) into the column space of A I ) " From the explicit representation of the projector a OAS is seen to be given by

x=A~y,
where A~=(A'AI1A +A'AI2+A'12A +A22 ) (A'Axl+A'12) and A = ( Ax' A'2 t

A'12 A22}
is the appropriate partitioned form of A. DEFINITION 7.1. A matrix G is said to be a A-optimal inverse of A if 2= GY is a A optimal approximate solution of the equation A x = y (possibly inconsistent) in the sense that Vx

Generalized inverse o f matrices a n d applications to linear models

483

Such a matrix G is denoted by the symbol AtA. The following theorem is proved in [31].

THEOREM 7.1.
(a) (b) (c) (d)

IfG~{A~} A l l A G + A l z G is unique and n.n.d. A'lzAG+ A22G is unique. A l l - A I I A G - - A I z G is n.n.d. For any y ~ R m min

xER"

IAx-Y 2_,,(A -X - - J \~11 A l l A G - A I a G ) y .

(7.2)

(e) A'12(AGA - A ) + Az2GA A'lz(AGA - A). (f) For any u ~ R" min

is n.n.d, and so also is A 2 2 - A a 2 G A -

xER n

Ax u 2=u'{AI2(AGA-A)+A22GA}u.

(7.3)

We also have the following result. THEOREM 7.2.(a) For a matrix G to be a A-optimal inverse of A it is necessary and sufficient that 2x22G=(A'A11A-kA'A~2+A'~2A+A22)G=A'AH+AI2. (b) A particular solution to a A-optimal inverse of A is Go = A~(A'All + A'12) where A~2 is any g-inverse of A22. (c) A general solution is G = GO+ ( I - z~22A22) U~ where U is arbitrary. (d) A A-optimal inverse of A is unique if and only if A22 /'5"p.d. The special case A = M fi3 N is interesting as Theorems 7.1 and 7.2 when restated for this special case show striking similarity with corresponding results for the minimum seminorm semileast squares inverse. When M and N are p.d., similar to Theorem 6.3 we have the result (7.6) (7.4)

(7.5)

484

Sujit Kumar Mitra

THEOREM 7.3.

(A~ aN)' = (A')*u ~ ~-'

(7.7)

The following theorem shows that a m i n i m u m seminorm semileast squares inverse can be viewed as the limit of a properly chosen sequence of optimal inverses. THEOREM 7.4. lim ( A M A +NN)+A'M E (AMN)
--.-',0 +

(7.8)

Proofs of all the stated theorems on optimal inverse will be found in [31]o

8.

Constrained Inverse

Following the work of Bott and Duffin [13], R a o and Mitra [58] defined certain classes of constrained inverses of a matrix and described applications of this concept. The general motivation for introducing constraints of different types is as follows: If A is a nonsingular matrix, then there exists a matrix G such that A G - - G A = I. If A is rectangular or square singular, no such G exists. However, we may look for a matrix G such that A G and GA behave like identity matrices in certain specified operations. For instance, we m a y d e m a n d that e'GAf= e'f b'AGc=b'c for all e E q 6 l , f E ~ for all b ~ 5 " 2 , c ~ , l, (8.1) (8.2)

where 6~6~1,62b~ z, ~l, ~ are specified subspaces in vector spaces of appropriate dimensions. We describe the conditions such as (8.1) b y saying that GA is an identity for (621fl, ~l). For example condition (5.11) can be described by saying that M A is an identity for ( ~ L ( A ) , 9L(A')). We m a y took upon a m x n matrix A ,as a transformation A x = y mapping vectors in R" into a subspace ~ of R m. In what sense can we provide an inverse transformation through a matrix G? The m a p p i n g A x = y is, in general, m a n y to one in which case Gy = x gives only one choice out of several alternatives. We m a y then choose G such that the inverse transformation leads to vectors in a specified subspace of R". Then a general type of condition we m a y impose on G in addition to conditions of the type (8.1) (8.2) is as follows: G maps vectors of 6"6 a into ~ l , G' maps vectors of QL1 into ~2,

(8.3)
(8.4)

Generalized inverse of matrices and applications to linear models

485

where %1, ~1~,W1, W2 are specified subspaces of R" or R m. A detailed study of constrained inverses was done in [58] where it was shown that with the help of this concept, by a judicious choice of constraints, the various g-inverses and pseudoinverses known in literature can be brought under a common classification scheme.

9.

Generalized inverse of partitioned matrices

In this section we present explicit expressions for g-inverses of partitioned matrices which have been found useful in many practical applications. We first consider the simple partitioned matrix (A :a) where A is m n matrix and a is m X 1. Theorem 9.1 shows that a g-inverse of (A : a) can be expressed in the form

X = ( G-db't'b'

(9.1)

where G is a g-inverse of A, d = Ga and b is suitably defined.


THEOREM 9.1. Let A be m n matrix, a be a column vector (a m-tuple) and X be as in (9.1).

Case 1.

Let a ~ ~ ( A ) and b = c / c ' a, c = ( I - A G ) ' ( I - A G ) a . XE{(A :a)-}


{(A :a) 7 } E {(A : a)7 ~} {(A : a ) / )

Then:

if c. e { A - } ,
if GE{AT},

if a E{A.7}, if a e{A,-},
if G=A +
(9.2)

=(A :a) + Case 2.

Let a ~ CAlL(A). Then X~{(A:a)-} E{(A:a)r}


E{(A:a)t } ~{(A:a)m }

if if if if if

GE{A-} G~{Ar) G~{AI- } GE(A~,} G=A +

and arbitrary b, and b = G' a (a arbitrary), and arbitrary b, and b = and b = G'Ga l+a'G'Ga' G'Ga l+a'G'Ga "
(9.3)

=(A:a) +

486 THEOREM 9.2.

Sujit KumarMitra Let X = ( G) be a g-inverse of (A :a).

Case 1. a ~ ~ ( A ) . Then
GE(A-) ~(A~-} E{A,,~)

if X~.((A:a) ), if XE((A:a)r ) if X~{(A:a)m}, if X E ( ( A : a ) / ) if X=(A :a) + andaEC~(A'), anda~CYC(A').


(9.4)

andGa=O,

~-(Az- )
=A +

Further in Case 1." G ( I - a b ' ) E { A - ) if X ~ { ( A : a ) - ) , G(I--ab')E{A; ) if X~((A:a)~-), ( A'ab'G' ) A , E { A Z ) GG' I+ 1 --b'G'-~A--'a


A + = G I--b-;-~

if X ~ ( ( A : a ) ; - ) ,
(9.5)

if X=(A:a) +.

Case 2. a E ~6 (A) and b' a=/=l. Let Y= G(l+(ab'/1 - b ' a)). Then YE(A ) if X E ( ( A : a )
),

~(Ar- )
E(A,,T}

if XE((A:a)/-}, if XE((A:a)~), if X E ( ( A : a ) ? ) , if X=(A:a) +.


(9.6)

E(A l-)
=A +

Proof of these two theorems are given in [32]. The following theorem is due to Rohde [60]. THEOREM 9.3.

Let
C'

Generalized inverse of matrices and applications to linear models

487

be a n.n.d, matrix and D = B - C'A -C, then G = ( A-+A-D-C'ACD-C'A-is a g-inverse of M.


Theorem 9.4 is the version due to Rao [49] of an earlier result due to Khatri [21]. Let V be a n.n.d, matrix of order n n and X be of order n m. Further let

--~CD-)

by any choice of g-inverse. THEOREM 9.4. hold:

Let V, C l, C2, C3, C4 be as defined in (9.7). Then the following

(i)

S'

C;

64

is another choice of g-inverse.


(ii)

XC3X = X,

XC~X = X,

(9.9)

i.e., C3 and C~ are g-inverses of X.


(iii)

X' C~X=O,

VC,X=O,

X' C1V=O.

(9.10) (9.11) (9.12) (9.13)

(iv)
(v) (vi)

V C 2 X t = X C ~ V = X C 4 X ' = X C ~ X t~. VC~X'=XC3V. VC, VC1V= VC, V, TrVC,=R(V:X)-R(X).

C3 is a g-inverse o f ( V : X ) .

Note that C~ and C 2 are in fact minimum V seminorm g-inverses of X'.

10.

Intersection of vector subspaces

The generalized inverse has been used in several ways to provide explicit expressions for the intersection of two vector subspaces. The known results in this direction are summarized in Theorems 10.1 and 10.2.

488

Sujit Kumar Mitra

Let A and B be real matrices of order m n and m s respectively. We seek expressions for matrices J and K such that GNL(J) = 9]L(A) r~ 9]L(B), and e)L(K) = ~;)L(A')r-i 9iL(B ). (10.2)

(lO.1)

THEOREM 10.1. The following are alternative choices of a matrix J satisfying (lO.1) (i)
where J=AF', F= I- WWand M)-M, M = BB'. W ' = A - BB A.

(10.3)

(ii)
where

J-A(A+
A = AA'

(10.4)

and

(iii)

J = M(A')~, (M)A',

(10.5)

where M = BB'.

PROOF. Formula (10.3) was proposed in [36]. Expression (10.4) is the parallel sum P(A,M) of n.n.d, matrices A and M as defined by Anderson and Duffin [3]. It was shown in [3] that the column span of P(A, M) is the intersection of 9IL (A) and 6 ~ (M). (10.1) follows since ~ (A) = ~ (A) and 6J[C(M)--sJiL(B). For arbitrary g-inverses A - and M - , A - + M - is a g-inverse of P(A,M) (see [57, p 189] for a proof). That formula (10.5) would have the required property is shown in [35] wherein it is also shown that a n arbitrary g-inverse M - of M is also a g-inverse of the matrix J as determined here. THEOREM 10.2. The following are alternative choices of a matrix K satisfy ing (10.2): (i) (ii)
K=B[I-B'A(B'A)-]', K = M - MAAt(M),

(lO.6)
(10.7)

where M = BB'.

Generalizedinverseof matricesand applications to linearmodels

489

PROOF. Formula (10.6) was proposed in [36]. T h a t formula (10.7) would have the required property is shown in [35] wherein it is also shown that an arbitrary g-inverse M - of M is also a g-inverse of the matrix K as determined here.

P A R T 2:

S T A T I S T I C A L A N A L Y S I S OF A L I N E A R M O D E L

11.

Linear estimation in a general Gauss-Markov modal

We consider a vector valued r a n d o m variable Y such that the expectation and dispersion matrix are given by

E(Y)=Xfi,

D(Y)=X,

where X is a given n x m matrix, [3, a m-tuple, is a vector of unknown parameters and X m a y be partly known. Unless explicitly sta ~ \q therwise the parameter space ~1 of/3 will be assumed to be R m (the m c,mensional real Euclidean space). DEFINITION 1 1.1. A linear functional p'[3 is said to be estimable if it has an unbiased estimator linear in Y, that is if there exists a linear functional b ' Y such that

E(b'V)--p'[3,

flea,.

(ll.1)

The following result is easily established. If f~l is sufficiently rich so that the linear space spanned b y X[3,/3 E a I is identical with ~ ( X ) .

(ll.1)

X'b=p

(11.2)

Here p'[3 is estimable if and only if

pE
Estimability of p'[3 can be checked by applying the criterion

(11.3)

X ' ( X ' ) - p =p,


or equivalently p ' X - X = p ' . It is however not necessary to compute a

490

Sujit Kumar Mitra

g-inverse of X just for checking estimability. It can be done as a by-prod.o uct of routine computations that will any way be necessary.

Note 1. If f~ can be embedded in a hyperplane of dimension (m --1) or less defined for example by the equation Afi=a
it is seen that additional linear functionals could have linear unbiased estimators. One way to demonstrate this fact is to regard a as an observation on a random vector Iio (supplementing the observation on Y) where Y0 has expectation Aft and a null dispersion matrix. If a is nonnull a linear functional in Y ) will often lead to an estimator, which is nonhomogeY0

neous linear in Y. Allowing such estimators however one is able to estimate unbiasedly P'B whenever p ~ C ( X ' :A'). This is a larger collection than 3E(X') unless 62qL(A') C 9]L(X'). DEFINITION 10.2. b*'Y is said to be B L U E (best linear unbiased estimator) of an estimable p'fl if it satisfies (11.1) and in addition has the least variance in the class of linear unbiased estimators of p'fl. Since Var(b' Y) = b'~b, the problem of computing a B L U E reduces to that of finding a solution to (1 1.2) for which b'Eb is a minimum. Computation of B L U E will thus require further knowledge about the nature of dispersion matrices E that are admissible in this context. If E is completely arbitrary or more precisely if the parameter space ~ of Z contains n(n + 1)/2 linearly independent matrices it is not hard to see that no linear estimator other than a constant can possibly be the B L U E of an estimable parametric functional P'B. W e shall examine some special cases.

Case 1. Z = o21, a 2 > 0 (possibly unknown). Here b is obtained as a m i n i m u m norm solution of (1 1.2) we have

b*=( X')2p.
Hence B L U E of p'fl is given by

b*' r =p'[ (X')m ]' r =p'X,- r


=p'/~, where fl = X t- Y is a least squares solution of the equation Y = Xfl (possibly

Generalized inverse of matrices and applications to linear models

491

inconsistent) and hence a solution of the normal equations

Cfl= Q,
where C=X'X If p'fi is estimable Vat(p@) = p ' [ (X'),, ]'(X '),,~oo 2
= p ' C -1)o2.

(11.4) and Q=X'.

(11.5)

(11.6)

If further q'fi is also estimable Cov(p'/?, q@) = p ' [ (X')~ ]'(X')m q2

=p' C Case 2.
Here

qo 2=

q' C

p o 2.

(11.7)

Y,=oZv, V known positive definite, 0 2 > 0 (possibly unknown).

b*-- (X')2,(~.
BLUE of p'fl=p'[(X')m(V)]'Y=p'X,(-v bY=p'[3, where j~ is a V -~ least squares solution of the equation Y=Xfl or a solution of the normal equation (11.4) where we have now

C=X'V-~X

and

Q=X'V-1y.

(11.8)

The formula for variances and covariances given above in terms of C remains valid.

Case 3.
Here

= o2V, V known positive semidefinite, 02 > 0 (possibly unknown).

b =(x')L(v~p.
BLUE of p'fi=p'[(x')~(J'Y-pXl(v+cxx,) Y=pfi, where fi is ( V + cXX')- least squares solution of the equation Y = XB or a solution of the normal equation (11.4) where we have now
.-! t ~

C=X'(V+eXX') X

and

Q = X ' ( V 4 eXX')- Y,

(11.9)

and c is an arbitrary positive constant.

492

Sujit KumarMitra

If p'fl and q'fl are estimable V a r ( p ' ~ ) ~=p'[ (X')m(v)]' V(X')m(V)pO2

=p'(C - -- cI)po 2
and Cov(p'/~, q'/~) =p'(C-- cI)qo 2= q'(C - - cI)po 2

(11.1o)

(11.11)

Rao [51] showed that the most general form of C and Q in (10.9) are

C= X ' ( V + X U X ' ) - x ,

Q= X'( V + XUX')

where U is arbitrary subject to the condition Rank ( V + X U X ) - = R ( V : X )

Note 2. Since for all the cases outlined above and for the choice of C as indicated 9]L(C)--9]L(X'), estimability of p'fi can be determined by checking if CC p =p.
It was pointed out in [30] that in some cases there may be some advantages in using V - in place of ( V + c X X ' ) - as the weight matrix for the generalized least squares procedure. This will require the g-inverse V to be specially computed for this purpose. We reproduce below a method of computation as illustrated in [30]. We denote by X~ a matrix formed by linearly independent columns of X such that

v: x ) = % ( v ) 9rc(x 0.
Instead of ( V + cXX')- as suggested above an alternative would be to use M-= ( V + X1X{)- which infact is a g-inverse of V. For a numerical illustra~ tion consider 4 V: 4 2 -2 2 X': 2 0 4 5 4 -1 3 1 2 2 4 5 1 3 0 3 -2 -1 1 3 0 2 -2

Generalized inverse of matrices and applications to linear models

493

The sweepout operations in a square root reduction of V (see R a o and Mitra [57, p 214]), when extended to the rows of X ' , reduce these matrices to 2 0 0 0 0 0 0 2 1 0 0 0 0 0 1 2 0 0 0 1 -1 (0) -1 1 0 1 0 4 -4 (0)

Observe that the second and third columns of X are alternative choices for X 1 and that R a n k ( V : X ) -- 3 + 1 = 4. Keeping in view subsequent computations that are required to be done we choose the third column. It is important to keep track of the positions which the columns of X~ occupied in the original matrix X. Let T denote the upper triangular matrix contained in the reduced form of V given above with the null rows suitably replaced by the rows of X~. We carry out the usual steps of pivotal condensation for computing a g-inverse of T' as shown in Table 11.1. Then the matrix C of normal equation is given by
C=X'MX=S'S,

and Q = X ' M Y = S ' W.

Table 11.1 Showing the computations necessary for setting up the normal equations T' 2 2 1 -1 1 0 0 0 0 1 2 1 0 1 0 0 0 2 3 -2 0 0 1 0 0 0 0 1 0 0 0 1 2 3 3 0 1 1 0 0 X 2 1 0 2 1 1 - 1 0 (T')-X= S 0 2 3 -2 0 0 1 0 Y 15.0 19.1 17.7 5.2 7.5 8.1 -2.0 0.6 ( T ' ) - Y= W

494

Sujit KumarMitra

Table 11.2 Showing the solution of normal equations Row


1 2

C
2 0

Q
15.6 1

1' 0 0 1/~/2 - 1
-1

2'
0

3'
0

2 3 !.1 2.1 3.1

2 [0 X/2

3 -1 X/2 1

- 1 t 0 - 1 0

t7.6 -2.01A 7.8"k/2 2.0 0

t 0 0 1
1

0 1 0 0
1

T h e n o r m a l equations can be solved by any convenient method. T h e square root m e t h o d is illustrated in T a b l e 11.2. W e have

~, =(0')(1')=7.8W ~-1 )+z0(-1)+0(-1)=5.8 /32= (0') (2') = 7.8 v ~


Also if
cll = (1')(1') = ~V/~--

(0) + 2.0(1) + 0(1) = 2.0

/~3 = (0')(3') = 7 . 8 V ~ (0) + 2.0(0) + 0(1) = 0

1 )2

+(--

1)2+(--

1)2= ~

c n = cZl-- (1')(2') = - 2 ,
C23~ C32 ~-'~1, C3 3 = 1.

C13 ~= C3 1 = - - 1 ,

C22 ~ 2

then
2

(c)=

5 --2
1

-2

2
1

-1] 1
1

is one choice of C -

R~= Y'[ M - MX(X'MX) X'M] Y = W ' W - Q't~= 1 2 6 . 2 2 - 125.68=0.54


on r a n k ( V : X ) - r a n k X = r a n k ( V : X ) - - r a n k C = 4 - 2 = 2 d.f. If M = (V+ XUX')- and p'fl and q'fl are estimable it was shown b y R a o [49] that

Generalized inverse of matrices and applications to linear models

495

the following formulas are valid Var(p'/3) = ~ A p , where A=C--U. Here since X 1 is formed by the third column of X, X1X ~= X U X ' where the matrix U consists exclusively of O's except for u33 which is 1. This gives C o v ( p ' L q'/~ ) = o ~ ' a q , (11.12)

A=

-2
--1

2
1

"

I f p ' f l = X ' Q and q'i3=l~' Q we also have here

var(p'/

)=

2(p'X-p'

Up)

Cov(p'/~, q' t~ ) = o2(p' l~ - p' Uq) = o2(EPil~i- P3q3). The known information about/~: We recall once again that X~ is formed by the third column of X and mark the corresponding row of Table 11.2. It provides sure information about/3, namely
/~2 - / 3 3 = 2.

For a proof one has only to check that Var(Q3)=0. Case 3 (continued). The inverse partitioned matrix method Rao [49] gives an interesting method of computing BLUE when V is singular. Let

where C~ is a matrix of order n n. Using Theorem 9.4 it is seen that ifp'/~ is estimable a BLUE of p'fl is given by p'/3 where /3= C3Y or (C~)Y. Further Var(p'/~)=p'C4po 2. If q'fl is also estimable Cov(p'fl, q'~ ) = p' C4qo 2 = q' C4po 2

496

SujitKumarMitra

Unbiased estimate of o 2 is given by

Y'C 1Y/f
where f = rank( V : X) - r a n k X = rank(

V + XX') -

r a n k X = tr VC l

XB)'M(Y--X~) a n d f i s as defined above. M = 2, - - ( V + cXX')- or V - in Case 3 depending

Unbiased estimate of o 2 in general is given by R~/f where R ~ = (Y-I in Case 1, = V -1 in Case upon which ever is used in the least squares procedure. /? refers to the corresponding least squares solution. Observe that f - - n - r a n k X in Cases 1 and 2. Note. Albert [2] has given the expression

XX + [ I - (QVQ) + QV]' Y.
for the BLUE of Xfi in Case 3. For other alternative expressions the reader is referred to Khatri [21] and Rao [52, 55].

Case 4. The general case


We consider the model (Y, Xfl, Vo; ~l, ~2) where the admissible dispersion matrices V, belong to a subset cV = { Vo : o E f~2} of the linear space of real symmetric matrices of order n n. The index parameter o will be assumed to be a real p - t u p l e . For a subset S of a vector space its linear span L(S) is the smallest vector subspace in which the whole of S can be embedded. To be more specific we shall assume that (/3,t r) belongs to f~l X~22 the cartesian product of ~1 and ~22 and that L{Xfl: (~'~1) and L(C~) are of dimension equal to rank X a n d p respectively. Let V1, V2... Vp be linearly independent non-negative definite matrices spanning L(~V). Put

v0= vl+..-+Vp
M= ( Vo + X X ' ) -

K= I - MX(X'MX)-X' W= X'M( i~__ 1V~KK' V~)MX


The following theorem given in [34] is a generalization of a similar theorem due to Seely and Zyskind [62].

Generalizedinverseof matricesand applications to linearmodels


THEOREM 1 1.1. Under the above model Xfl has a BLUE, if and only if

497

X'MV~K=O,

Vi.

(11.13)

If (11.13) is satisfied the B L U E of Xfi is X/~ where

fl=(X'MX)- X'MY.
We give below a generalization of an unpublished result due to R. Zmyslony THEOREM 1 1.2. W=O

Condition (11.13) is equivalent to or


trW=O. (11.14)

If (11.14) is not satisfiedp'fi has a BLUE if and only i f p ' belongs to the row span of

(I- WW-)X'MX,
or equivalently

p ' ( X ' M X ) - W=O,

(11.15)

in which casep'/~ is a B L U E ofp'fl where/3 is as defined in Theorem 11.1. It is interesting to note that if VI, V2..... Vp are members of ~ , for every estimable p'fl the estimator p'fl so constructed is at least an admissible linear unbiased estimator of p'fl if the B L U E does not exist.

Identifiability and estimability Let ~ , ~ denote the probability distribution of the r a n d o m variable Y when the true parameter point is (fl, Y.). The linear parametric functional P'fl is said to be identifiable by distribution if
~B, ~ = ~Bo,~0==>p'fl = P' flo" (11.16)

Consider the situation where ~3B, x depends on fl only through Xfl ( = E ( Y ) ) . This is true for example when Y has a multivariate normal distribution. Here condition (11.16) can be restated as

X fl = X flo~p' fl = P' flo"

(11.17)

By Theorem 2.1(c) this is true iff p E ~ C ( X ' ) . Using (11.3) it is seen therefore that identifiability by distribution is equivalent to (linear) estimability. Assume now that p'fl has an unbiased estimator b ( Y ) not neces-

498

Sujit Kumar Mitra

sarily linear. Then E { b ( r ) l B, y.} =p'/~. If s]3~,~:and sJ3a0,~ are equal (11.18) can be satisfied only if (11.!8 )

p'B=p%.
We are thus led to the following interesting theorem due to Bunke and Bunke [ 141. THEOREM 11.3. Under the assumptions' stated above the following statements are equivalent. (i) p'fl is estimable. (ii) p' fl has an unbiased estimator (not necessarily linear). (iii) p ~ s91L(X'). (iv) p'fl is identifiable by distribution.

12.

Tests of linear hypotheses

We shall confine our attention to Cases 1 to 3 only. We shall further assume that Y has a multivariate normal distribution. Let p'fl be an estimable functional. Then the hypothesis p'fi = a can be tested by computing
(p,/~

- a ) / ~ / V a r ( p fi).

"~"

, ^

(12.1)

where Var(p@) is estimated replacing o 2 by its unbiased estimate s2= R 2 / f , and noting that under the hypothesis this is distributed as t on f d.f. To test this hypothesis against a one sided alternative such as p'fl >a or p ' f l < a , the computed value of the test statistic is compared against the appropriate one sided critical value of t. A hypothesis specifying the values of several estimable linear functionals, e.g. H 0 : Pfl = a can be tested as follows. If the dispersion matrix of P/~ be o2D, then we compute

u'D - u / ks 2,

(12.2)

and refer to the critical values (upper tail) of F distribution with k d.f. for numerator and f d.f. for denominator where u - - P f l - a and k = rank D. In Case 3 however on account of the singularity of V certain linear relations between the parameters ill, f12. . . . . tim may be known with certainty as soon

Generalized inverse of matrices and applications to linear models

499

as observations are available. A hypothesis H 0 which contradicts such sure information can be rejected a f o r t i o r i without further statistical tests. This is examined by checking if DD--u = u. A statistical test is necessary only when DD - u = u. Sometimes particularly in Cases 1 and 2 it is more convenient to compute u ' D - u by the fornmla 2o - Ro, 2 u' D -u = R ~ where R~o= rain( Y - Xfl ) ' M ( Y - Xfl ), subject to PB = a, (12.4) (12.3)

andM=Iin Case 1 and V 1 in Case 2. With a choice of V - f o r M i n Case 3 as illustrated in the numerical example the same formula also works if the hypothesis is suitably reformulated using the sure information. The necessary steps are illustrated in [30]. For a proof the interested reader is referred to Rao [50].

Ordered tests for the general linear hypothesis We shall consider briefly the problem of testing the hypothesis P]3 = a against a one sided alternative of the type Pfl >1a (where the inequality holds coordinate wise and is strict at least for one coordinate). We shall assume that the various coordinates of Pfl are linearly independent and that after suitable reparametrization if necessary the hypothesis is reformulated as Pfl = 0 and the alternative as Pfl/> 0. A natural extension of the above mentioned test statistic (12.2) or rather of its equivalent Beta distributed version ( R~0 - Ro]/2"~ / R~o2 is
2 2 2 (R.oR,)/R.

(12.5)

where R ,2 = rain( Y-- X f l ) ' M ( Y - Xfl), subject to (12.6)

Pfi >10.

The null distribution of the statistic (12.5) is known to be a weighted linear combination of Beta distributions ([6], p 179). The weights however depend heavily on matrices X and P and are explicitly available only for some special cases. Minor modifications of the test statistic which take into account an embedding of tile cone ( f i : P f i >1a) in an appropriate circular

500

Sujit Kumar Mitra

cone is considered in [7] and [43]. The distribution under the hypothesis of any such statistic is again a weighted linear combination of Beta distributions but the weights can be analytically determined. Note 1. When Pfl is not estimable, it was shown in [36], that if one applies the formula (12.3) without checking abinitio the estimability of Pfl, the procedure actually leads to a correct test of the testable part of the hypothesis, that is of the subhypothesis

LPfl = La,

(12.7)

where 3L(P'L')=gL(P')AgIL(X'), provided of course the degrees of freedom is correctly taken as rankD(LPfl), not rankD(P/3). Formula (12.2) fails as u = P t ~ - a is not even invariant under the choice of the solution/~ of normal equations. Correct application of (12.2) requires the identification of subhypothesis (12.7), computation of u b y the formula L ( P f l - a ) and of D defined by o21)= D(LPfl). One could use any one of three formulas proposed in Theorem 10.1 to obtain the intersection s3IL(P') 71 s31L(X'). However for Cases 1 and 2 of Section 1 1, the following choice of L, dictated by (10.5) has some advantages. We recommend

L = CPm(c)

(12.8)

where C is the matrix of the appropriate normal equations. This is because

D( LPfl ) = o2jC - J = a2J.


where J = CP~(c)P and C - is one choice of D - . Note that here D = J. We also recommend checking the condition DD -u = u since this takes care of possible inconsistencies in formulation of the hypothesis.

13.

Bayes linear and minimax linear estimators

In other approaches to linear estimation one visualizes a loss function which in the case of several linear functionals f/=p;fl (i = 1,2 ..... k) within the framework of quadratic loss takes the form

L(f,f)=(f-f)'A(f-f),
where A is a given non-negative definite matrix, '= ( f l J 2 ..... A) and f is the linear estimator proposed for f.

(13.1)

03.2)

Generalizedinverseof matricesand applicationsto linearmodels

501

One could then either take a purely Bayesian approach, consider a prior distribution of the parameters and minimize the average risk (with respect to the chosen prior) or minimize alternatively the maximum risk (the minimax criterion) [16, 23, 24, 25, 49, 53]. The risk function is given by r(f,f;/3, a 2 ) = E ( L ( ~ f ) } . We illustrate these ideas considering the Gauss-Markov model (Y, Xfl, o2V). DEFINITION 13.1o C , Y + d, is called a Bayes Linear Estimator (BLE) of f if E r ( C . V + d . , f ; f l , a2)<Er(CY+d,f;/3,a 2) VC, d, (13.3) where the expectation is taken with respect to the chosen prior distribution of fl and o 2. The Bayes homogeneous linear estimator (BHLE) is defined analogously when one restricts attention to the class of homogeneous linear estimators of the form CY. We have the theorem. THEOREM 13.1. BLE

of f = P f l

under the Gauss-Markov

model
(13.4)

( Y, Xfl, o2V) is given by


f-- C, Y + d,,

where

c.=e[(x%,w]', d.=[e- C.X]E(/3)


and W= D(fl)/E(oZ). Note 1. An alternative expression for BLE of Pfl is P/? where
/~= [ (X')tv~ rv]' Y + [ I - ( (x')tvew }'X] E(/3).

(13.s)
(13.6)

This theorem can be proved on the same lines as Theorem 2.1 in [31].

(13.7)

Note 2. The expressions for BLE do not explicitly or implicitly depend upon the matrix A of the loss function L(f,f). Note 3. The j t h coordinate of/~ is a minimum expected mean square error estimator of ~ (thejth coordinate of/3). Similarly BHLE of f = Pfl is given by P[(X')tvew]'Y,
where

W=E(flfl')/E(a2),

(13.8)

and remarks similar to those in Notes 2 and 3 above are applicable.

502

Sujit Kumar Mitra

* ' - -X~w-'~v ' (Theorem 7.3). When V and W are invertible [(X ' )v,w] Here clearly [(X')tv~w]'Y is seen to be B L U E of its expectation. Even otherwise it is clear from the definition that it has to be so in general. We present below alternative expressions for the m i n i m u m expected mean square error estimator of a single parametric functional p'fi which shows explicitly the extent to which information on a priori distribution of the parameters is used in its computation. Let (13.9) then M E M S E E ofp'fl is b' Y + d where b = ( V + Wll ) - W1:, and (13.10)

d= E(p'3) - b'E(X3).

(13.11)

Let X 1 be a matrix formed by r linearly independent rows of X where r = R a n k X . Let X l/3 be B L U E of X 1 ft. Put

D(X13)=oaS,

(13.12)

Then M E M S E E of p'B is C'X 1t~+ e, where

C = ( S + Tll )- T12
and

(13.14)
(13.15)

e= E(p' fl ) - C'E( X, fl ).

This formula requires inversion of a matrix of order r r in place of one of order n n but can be used only if necessary computations for determination of BLUES, their variances and covariances are otherwise available. DEFINITION 13.2. The Minimax Homogeneous Linear Estimator ( M I H L E ) of f is C o Y given by sup r(CoY,f;fi, o2)=irff where a represents the parameter space. sup

r(CY,f;~,o2),

(13.16)

Generalized inverse o f matrices a n d applications o linear models'

503

For the special case of k = 1 the M I H L E of p'/3 is I~Y given by sup E(I~Y-p'B) 2= inf ~sup E(I'Y-p'/3)2
B, o2 ~ i2 l [3, ~2 E ~2

(13.17)

The definitions are analogously extended to the case where one includes nonhomogeneous linear estimators of the type CY+d. These are called minimax linear estimators (MILE). For our computations we shall consider the case where a = { B, o2: B ' M -'B .<62, 0<02 < < } , and 32,0~ are given positive numbers sup E(I' y_p,fl)2= oi l, VI+ 3 2 ( X ' l - p ) ' M ( X ' l - p ) . (13.19) 03.18)

B,o2~
This attains its minimum when /0= (X')t0 2M, vp, (13.20)

where 02= o~/32. Hence when V is invertible tile M I H L E ofp'fl is


t t XV-,.O2M--,Y. t l~Y=p

(13.21)

It is not difficult to see that sup E ( l ' Y + d - p ' t ~ ) 2


B,o2 ~

= 02,l' VZ+ [ { 8 2 ( X ' L - p ) ' M ( X ' L - - p ) } ' / 2 +

lal]

(13.22)

which implies in particular that l6 Y as obtained above is in fact the M I L E of p'fl. This result could alternatively be expressed in the following form due to Kuks [23]: /3(m) = [ ( X ' ) ; - 2 M ( g V ] ' Y is the M I L E of/3 when A =pp',

and in fact when A is a nonnegative definite matrix of rank 1. Consider the matrix valued loss function

L(~f) = (f-f)(f-f)'

(13.23)

and the associated risk R(~f;B, o2) = ELOC]f). The class of risk matrices

504

Sujit Kumar Mitra

R ( C Y , fl;/~,O 2) as fl, o 2 varies over f~ has an unique maximal d e m e n t


R*( C Y ) = ,,~.CVC' + 8 2( C X _ X)M( CX - I)'
in the sense that R * ( C Y ) - - R ( C Y , fl;fl, a 2) is n.n.d, for every (fl, a 2 ) ~ a . The fact that for every p,p,fl(m) is a minimax estimator of p'fl implies that

R*( C Y ) -- R*( fl("))


is n.n.d, for every C. Thus fl(m) is minimax with respect to the matrix valued loss function as introduced above in the class of homogeneous linear estimators of fl (Bunke [15]). Lauter [25] gives an explicit expression for the M I L E of Pfl for the special case where P = I, M = I and A = I. We reproduce his result in Theorem 12.2. THEOREM 12.2.

I f R a n k X = rn, V is invertible and


(13.24)

a = {( B, o2): B'B <8 2, 0 <o2 <o~),


then the shrunken least squares estimator

(x,v-lx)-~x,v-~y
I + 0 2 t r ( X ' V - 1 X ) -1 is a M I L E for fl under the loss function

(13.25)

c(L/~)=(~-/~),(~-/~)

(13.26)

For the more general case where R a n k X = m, V is invertible, A is arbitrary n.n.d, and P is arbitrary, Lauter also gives an expression for a M I L E for Pfl in terms of a square matrix of order m m and a scalar which have to be determined to satisfy certain conditions. In the absence of an algorithm, his formula is not useful for computational purposes. In certain situations one could use an iterative procedure for computing M I L E suggested by Kuks and Olman [24]. Note 1. As 6 2 - - ~ , 02---~0+. Using Theorem 7.4 therefore we have
02-->0+

lim P ' Xv-,+O2M-IY=p "~ XV-IM-IY

which in a sense could be interpreted as a M I L E of p'fl when the parameter space for fl is u n b o u n d e d in every direction. When p'fi is

Generalized inverse of matrices and applications to linear models

505

estimable this estimator is infact the B L U E o f p ' f l - - i t does not depend on M and it is not hard to see even otherwise that when f~-- {(fl, o 2) :fl E R m, 0<02 <0~}, f = Pfl is estimable and the loss function is

L(f,f)=(f-f)'A(f- f)

(13.1)

then the B L U E P j of P13 is also the M I L E of PB. One reaches an identical conclusion when a 2 is also unbounded if the loss function is

L(Z,/)=(Z-/)'A(Z-/)Io

(13.2)

When p'fl is nonestimable the dependence of the estimator on M is perhaps a warning to the effect that with an unbounded parameter space for 13, it makes no sense to estimating p'fl. In [53], Rao has proved some general theorems on admissibility of estimators extending earlier works of Cohen and Shinozaki.

14.

Best linear minimum bias estimator (BLIMBE)

Towards the end of the previous section we have raised an interesting philosophical question on the wisdom of estimating a nonestimable parametric function. The worst thing to happen would be when the probability distribution of the estimator we propose does not depend on the value assumed by the nonestimable parametric function. And this indeed would be the case if we have no prior information on/3 and the parameter space for fl is R", with a translation parameter family of distributions for Y. With a bounded parameter space for fl, for any estimator .b'Y if its probability distribution depends on fl at all, it will generally depend onp'fl on account of restrictions imposed by boundedness. Consider the parameter space

a= { (fl, oz);fi'M-afl <62, 0 < 0 2 < oe }.


Bias of an estimator

(14.1)

b'Y ofp'13 is given by


(14.2)

(b'X -p')fl.
Maximum bias for variations of fl in [2 is

~/ { 62(X'b - p ) ' M ( X ' b - p ) },


which attains its minimum value if

(14.3)

b = (X')? (M)P.

(14.4)

506

Sujit Kumar Mitra

A m o n g such minimax bias estimators the one with the least variance is called the best linear minimum bias estimator (BLIMBE). T h e B L I M B E is obtained for the choice of

b ---t ~ X , ~}MVP, +

(14.5)

which gives b . g .- p . .{(X )MV} + ' Y = P ' XV-'M + 'Y, if V and M are invertible [16, 17, 49, 57]. Since each minimax bias estimator has exactly the same bias the BLIMBE also minimizes the mean square error a m o n g such estimators. Before we leave this section it seems appropriate that we discuss the conditionally unbiased estimators of Scheffe [61] in the same context. Consider a least squares solution/~ = X l (v ,)Y E ( f l ) = X t ~ v ,)X/3. Observe that this is equal to fl if /3 ~gIL(Xt~v I)X)=6~(H) versely suppose H is a matrix of order m p such that rank H = r a n k X H = rankX, (14.6)

say. Con-(14.7)

and there is a priori evidence to suggest that fl possibly belongs to ~ ( H ) though not strong enough to dictate a reformulation of the linear model, then it is possible to find a least squares solution which will lead to unbiased estimators of estimable parametric function unconditionally and of nonestimable parametric functions conditionally if fl Eg]-C(H). The unique choice of X ~ v-,) to achieve this is given by

X[-(v ~)= H ( X H ) 7 ( v
When H = X '

,).

(14.8)

and V = I one has thus to use least squares solution ]~=

X+y.
IS Improved estimation: Hoerl-Kennard and James-Stein estimators

Consider the estimator flc=Xv ,@cM-IYfor fi which for the choice of c = 0 2= a~/8 2 gives fl(") as defined in Section 13. This estimator reminds us of the ridge regression estimator of Hoerl and K e n n a r d [18, 19] which in our notation can be written as XT.cIY. Consider the case where r a n k X - m. Consider the loss function as introduced in (13.1) and the corresponding risk function r(.~f; fl, o2). The following theorem determines an improvement region in which tic has uniformly a lower risk than the least

Generalized inverse of matrices and applications o linear models

507

squares estimator/? = ( X ' V - 1 X ) - : X ' V - 1 y . It can be proved on the same lines as Theorem 11.2 in Bibby and Toutenburg [9]. TrtEOReM 15.1.

For

0 <c -<2o2//3'M 13
r(/3c,/3;/9,o 2) -<r(/~,/3; t , o2).

(15.1)
(15.2)

We had assumed earlier that o 2 is bounded above by o 2. If it is also bounded below by o 2 > 0 , there is thus a choice of c for which 3c is uniformly better than/3 for every (/3, o 2) in the parameter space

a = {(/3, o2); B ' v -'B < 82, ~,2 -<02 -< Ov~}

(15.3)

Another type of improvement of conventional estimators was considered by James and Stein [20], for a special case of linear models under specific distributional assumptions. These authors consider the case where Y is distributed as Nm(fl, o2I). The data also includes an estimate of o 2 based on S 2, where S 2 / o 2 is distributed as X2 on fd.f. independently of Y. Consider the estimators

rte~= 1

(/+2)y.y/2

(m-2)S 2

(15.4)

of/3 i, i = 1,2 ..... m. Then o _ 2 E Z ( T t ~ ) _ fli)2= f ( m - 2)2 E ( 2 K l + m - 2 ) -1 f+2 (15.5)

where K 1 is a Poisson variable with mean fl'/3/202. This shows that m > 3, T(le) is uniformly better than Y as an estimator of fl under the c o m p o u n d quadratic loss as considered above. The following table from R a o and Shinozaki [59] gives the upper b o u n d to v2= f l ' f l / m o z for which, for every a, a' T(~e) as an estimator of a'fl has an uniformly smaller m e a n square error compared to a' Y.
m: 3 4 5 6 7 8 9

10 0.12

maxv2:

0.47

0.33

0.26

0.21

0.18

0.15

0.13

The following modification of the above estimator due to Rao [54] is on

508

Sujit Kumar Mitra

the same lines as proposed by Lindley [26]. Consider the estimators

T(Ee)=~+

E(yi--)

2 (y-~7)

(15.6)

of fli, i = 1,2 ..... m. Then

o-2EE(T(2e)- fli) 2= m

f(m -2 f+ 2) 2 E ( 2 K 2 4 m - - 3 ) - l '

(15.7)

where K 2 is a Poisson variable with mean >2(/3i -/3)2/2o2. This shows that if m > 4, T(2e) is uniformly better than Y as an estimator o f / 3 under the compound quadratic loss as considered above. The same table as given above with 62= Z ( / 3 i - fl)2/P 2 replacing v2 and p replaced by p - 1 gives upper bounds to 62 for a' T2 (e) as an estimator of a'/3 to have an uniformly smaller mean square error compared to a' Y.

16. Specification errors in the dispersion matrix D(Y)--robustness of BLUE While considering linear models with a singular dispersion matrix D ( Y ) = o2V we have made use of the fact that if b' Y is BLUE of p'fl under the model (Y, Xfl, a2V) it is also B L U E under the model (Y, Xfl, a2(V+ cXX')) for arbitrary c > 0 and vice versa. This shows that certain errors in specifying the dispersion matrix do not materially affect the expressions for BLUE. The following theorem summarizes some known results in this connection. THEOREM 16.1. If for every estimable parametric function p' fl the B L U E under the model (Y, Xfl, o2I) is also the B L U E under the model (Y, Xfl, V) then it is necessary and sufficient that one of the following equivalent conditions holds. (i) X ' VZ = O, where 9]L ( Z ) = % (X'). (ii) V = X A I X ' + Z A 2 Z ' where A 1 and A 2 are arbitrary nonnegative definite matrices of appropriate order. (iii) For every vector u @ 9]L(X), Vu ~ ~ L ( X ) . (iv) 9]L(X) is spanned by a subset of eigenvectors of V. (v) ~,Pi=lRank(X'Qi)=Rank(X), where the columns of matrices QI,QE ..... Qp provide eigenvectors of V corresponding to the p distinct eigenvalues and ~ = 1Rank Qi = n.

Generalized inverse of matrices and applications to linear models

509

(vi) Px V is symmetric where Px represents the orthogonal projector onto 9]L(X) under the innerproducl ( u , v ) = v' u. Conditions (i) and (ii) are due to Rao [481, (iii), (iv), (v) and (vi) are respectively due to Kruskal [22], Zyskind [65] Anderson [25] and Zyskind [65]. For a proof of this proposition the reader is referred to Rao and Mitra [57] which also reports a similar theorem due to Mitra and Rao [36] describing robustness of BLUEs under specification errors affecting the X matrix. Generalizations covering the case of the singular dispersion matrix are given in Mitra and Moore [33]. The following theorem due to Baksalary and Kala [5] gives an upperbound to the Euclidean norm of the difference between the simple least squares estimator and the B L U E of X/? under the model (Y, Xfi, V). Let P = X ( X ' V - I X ) - 1Xt g - 1 denote the orthogonal projector onto 9]L(X) under the inner product induced by V-1. To avoid triviality we assume that X is nonnull and has a rank strictly less than n, the number of rows in X. THEOREM 16.2. Let ~ and (t be respectively the simple least squares and the best linear unbiased estimator of Ix = X~, under the model (Y, X~, V). Then

II~ - ~11 < (V~/Xr)ll Y - ~11,


where Pl is the largest eigenvalue of P V - 1(1- P ) V - 1p and 2~ is the smallest nonnull eigenvalue of P V - ip.
Theorem 16.2 is an improvement over an earlier result due to Haberman.

References
[1] Albert, A. (1972). Regression and the Moore-Penrose Pseudoinverse. Academic Press, New York. [2] Albert, A. (1973). The G a u s s - M a r k o v theorem for regression models with possibly singular covariances. S l A M J. Appl. Math., 24, 182-187. [3] Anderson, W. N., Jr. and Duffin, R. J. (1969). Series and parallel addition of matrices, J. Math. Anal, Appl. 40, 576-594. [4] Anderson, T. W. (1972). Efficient estimation of regression coefficients in time series. Proc. Sixth Berkeley symposium on Math. Slat. and Prob. 1, University of California Press, 471-482. [5] Baksalary, J. K. and Kala, R. (1978). A bound for the Euclidean n o r m of the difference between the least squares and the best linear unbiased estimators, Ann. Statist. 6, 1390-1393.

510

Sujit Kumar Mitra

[6] Barlow, R. E., Bartholomew, D. J., Bremner, J. M. and Brunk, H. D. (1972)o Statistical Inference under Order Restrictions. Wiley, New York. [7] Bohrer, R. (1973). A multivariate t probability integral, Biometrika 60, 647-654. [8] Ben-Israel, A. and Greville, T. N. E. (1974). Generalized Inverses--Theory and Applications. Wiley, New York. [9] Bibby, J. and Toutenburg, H. (1977). Prediction and ltr~roved Estimation in Linear Models. Wiley, New York. [10] Bjerhammer, A. (1951). Application of calculus of matrices to method of least squares with special reference to geodetic calculations, Kungl. Tekn. Hogsk, Handl., Stockholm 49, 1-86. [11] Bjerhammer, A. (1973). Theory of Errors and Generalized Matrix Inverses. Elsevier Scientific Publishing Company, Amsterdam. [12] Bose, R. C. (1959). Unpublished lecture notes on analysis of variance. University of North Carolina, Chapel Hill. [13] Bott, R. and Duffin, R. J. (1953). On the algebra of networks. Trans. Amer. Math. Soc. 74, 99-109. [14] Bunke, H. and Bunke, O. (1974). ldentifiability and estimability Math. Operations-forsch. Statist. 5, 223-233. [15] Bunke, O. (1975). Minimax linear, ridge and shrunken estimators for linear parameters, Math. Operationsforsch Statist. 6, 817-829. [16] Chipman, J. S. (1964). On least squares with insufficient observations. J. Amer. Statist. Assoc. 59, 1078-1111. [17] Drygas, H. (1969). Gauss-Markov estimation and best linear minimum bias estimation. Unpublished Technical Report, University of Heidelberg. [18] Horel, A. E. and Kennard, R. W. (1970). Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12, 55-67. [19] Horel, A. E. and Kennard, R. W. (1970). Ridge regression: applications to nonorthogonal problems. Technometrics 12, 69-82. [20] James, W. and Stein, C. (1961). Estimation with quadratic loss. Proc. Fourth Berkeley Symposium on Math. Stat. and Prob. 1. University of California Press, Berkeley, 361-379. [21] Khatri, C. G. (1968). Some results for the singular multivariate regression models. Sankhy~ Ser. A 30, 268-280. [22] Kruskal, W. (1968). When are Gauss-Markov and least squares estimates identical? A coordinate free approach. Ann. Math. Statist. 39, 70-75. [23] Kuks, J. (1972). A minimax estimator of regression coefficients (in Russian). lzv. Akad. Nauk Eston SSR 21, 73-78. [24] Kuks, J. and Olman, W. (1972). Minimax linear esttmation of regression coefficients (in Russian). Izv. Akad. Nauk Eston SSR 21, 66-72. [25] Lauter, H. (1975). A minimax linear estimator for linear parameters under restrictions in form of inequalities. Math. Operationsforsch. Statist. 5, 689-696. [26] Lindley, D. V. (1962). Discussions in 'Confidence sets for the mean of a multivariate normal distribution' by C. Stein. J. Royal Statist. Soc. Ser. B 24, 265-296. [27] Meyer, C. D., Jr. (1975). The role of the group generalized inverse in the theory of finite Markov Chains. S l A M Rev. 17, 443-464. [28] Mitra, S. K. (1968). On a generalized inverse of a matrix and applications. Sankhy~ Set. A 30, 107-114. [29] Mitra, S. K. (1968). A new class of g-inverse of square matrices. Sankhyd Ser. A 30, 323-330. [30] Mitra, S. K. (1973). Unified least squares approach to linear estimation in a general Gauss-Markov model. S l A M J. Appl. Math. 25, 671-680.

Generalized inverse of matrices and applications to linear models

511

[31] Mitra, S. K. (19'/5). Optimal inverse of a matrix. Sankhyd Ser. A 37, 550-563. [32] Mifra, S. K. and Bimasankaram, P. (1971). Generalized inverses of partitioned matrices and recalculation of least squares estimates for data or model changes. Sankhy8 Set. A 33, 395-410. [33] Mitra, S. K. and Moore, B. J. (1973). Gauss-Markov estimation with an incorrect dispersion matrix. Sankhy~ Set. A 35, 139-152. [34] Mitra, S. K. and Moore Thorne, B. J. (1976). Linear estimation in general linear models. In: S. Ikeda et al. eds., Essays in Probability and Statistics, Shinko Tsusho, Tokyo, 189-201. [35] Mitra, S. K. and Puff, M. L. (1979). Shorted operators and generalized inverses of matrices. Linear Algebra Appl. 25, 45-56. [36] Mitra, S. K. and Rao, C. R. (1968). Some results in estimation and tests of linear hypothesis under the Gauss-Markov model. Sankhy8 Ser. A 30, 281-290. [37] Mitra, S. K. and Rao, C. R. (1969). Conditions for optimality and validity of simple least squares theory. Ann. Math. Statist. 40, 1617-1624. [38] Mitra, S. K. and Rao, C. R. (1974). Projections under seminorms and generalized Moore-Penrose inverses. Linear Algebra Appl. 9, 155--167. [39] Mitra, S. K. and Rao, C. R. (1975). Extensions of a duality theorem concerning g-inverses of matrices, Sankhy~ Ser. A 37, 439-445. [40] Moore, E. H. (1920). On the reciprocal of a general algebraic matrix (Abstract). Bull. Amer. Math. Soc. 26, 394-395. [41] Penrose, R. (1955). A generalized inverse for matrices. Proc. Cambridge Philos. Soe. 51, 406-413. [42] Penrose, R. (1956). On best approximate solutions of linear matrix equations. Proc. Cambridge Philos. Soc. 52, 17-19. [43] Pincus, R. (1975). Testing linear hypotheses under restricted alternatives. Math. Operationsforsch. Statist. 6, 733-751. [44] Pringle, R. M. and Rayner, A. A. (1971). Generalized Inverse Matrices with Applications to Statistics. Griffin, London. [45] Rao, C. R. (1955). Analysis of dispersion for multiply classified data with unequal numbers in cells. Sankhy8 15, 253-280. [46] Rao, C. R. (1962). A note on a generalized inverse of a matrix with applications to problems in mathematical statistics. J. Roy. Statistics Soc. Ser. B 24, 152-158. [47] Rao, C. R. (1967). Calculus of generalized inverses of matrices: Part I--general theory. Sankhy~ 29, 317-350. [48] Rao, C. R. (1967). Least squares theory using an estimated dispersion matrix and its application to measurement of signals. Proc. Fifth Berkeley Symposium on Math. Stat. and Prob. 1, University of California Press, Berkeley, 355--372. [49] Rao, C. R. (1971). Unified theory of linear estimation. Sankhy~ Set'. A 33, 371-394. [50] Rao, C. R. (1962). Some recent results in linear estimation. Sankhy~ Ser. B 34, 369-377. [51] Rao, C. R. (1973). Unified theory of least squares. Comm. ,Statist. 1, 1-8. [52] Rao, C. R. (1973). Representation of best linear unbiased estimators in the GaussMarkov model with a singular dispersion matrix. J. Multivariate Anal. 3, 276-292. [53] Rao, C. R. (1976). Estimation of parameters in a linear model, Ann. Statist. 4~ 1023-1037. [54] Rao, C. R. (1976). Characterization of prior distribution and solution to a compound decision problem. Ann. Statist. 4, 823-835. [55] Rao, C. R. (1978). On the choice of best estimators in singxdar linear models. Comm. Statist. Theor. Math. A 7. [56] Rao, C. R. and Mitra, S. K. (t971). Further contributions to the theory of generalized inverse of matrices and its applications. Sankhy~ Ser. A 33, 289-300.

512

Sujit Kumar Mitra

[57] Rao, C. R. and Mitra, S. K. (1971). Generalized Inverse of Matrices and its Applications. Wiley, New York. [58] Rao, C. R. and Mitra, S. K. (1973). Theory and application of constrained inverse of matrices. S l A M J. Appl. Math. 24, 473-488. [59] Rao, C. R. and Shinozaki, N. (1978). Precision of individual estimators in simultaneous estimation of parameters. Biometrika as, 23-30. [60] Rohde, C. A. (1965). Generalized inverses of partitioned matrices. S I A M J . AppL Math. 13, 1033-1035. [61] Scheffe, H. (1959). Analysis of Variance. Wiley, New York. [62] Seely, J. and Zyskind, G. (1971). Linear spaces and minimum variance. Unbiased estimation. Ann. Math. Statist. 42, 691-703. [63] Sibuya, M. (1970). Subclasses of generalized inverse of matrices. Ann. Inst. Statist. Math. 22, 543-556. [64] Zmyslony, R. (1978). A characterization of best linear unbiased estimators in the general linear model. Prepfint No. 159, Institute of Mathematics, Polish Academy of Sciences, Warsaw. [65] Zyskind, G. (1967). On canonical forms, negative covariance matrices and best and simple least square linear estimator in linear models. Ann. Math. Statist. 38, 1092-1110. [66] Zyskind, G. and Martin, F. B. (1969). On best linear estimation and a general Gauss-Markov theorem in linear models with arbitrary negative covariance structure. S I A M J. AppL Math. 17, 1190-1202.

P. R~ Krishnaiah, ed., Handbook of Statistics, Vol. I North-H011and Publishing Company (1980) 513-570

1 t1~
It

Likelihood Ratio 'rests for Mean Vectors and Covariance Matrices


P. R. Krishnaiah* and Jack C. Lee**
1. Introduction

Likelihood ratio tests play an important role in testing various hypotheses under A N O V A and M A N O V A models. A comprehensive review of the literature until 1957 on the likelihood ratio procedures for testing certain hypotheses on mean vectors and covariance matrices was given by Anderson (1958). In this chapter, we describe various likelihood ratio tests on mean vectors and covariance matrices and discuss computations of the critical values associated with the above tests. Throughout this chapter, we assume that the underlying distribution is multivariate normal. In Section 2, we discuss the likelihood ratio test for testing for the equality of mean vectors of several multivariate normal populations. Also, the test specifying the mean vector is discussed. The approximation of Rao (1951) for the distribution of the determinant of the multivariate beta matrix is also discussed in this section. Likelihood ratio test for multiple independence of several sets of variables is discussed in Section 3. In Section 4, we discuss the likelihood ratio tests for sphericity, specifying the covariance matrix as well as the test for the equality of the covariance matrices. Likelihood ratio test for the multiple homogeneity of the covariance matrices is also discussed in this section. I n Section 5, we discuss the likelihood ratio procedure specifying the mean vector and covariance matrix simultaneously. Also, likelihood ratio test for the equality of the mean vectors and the equality of covariance matrices simultaneously is *The work of this author is supported in part by the Air Force Officeof Scientific Research under Contract F49620-79-C-0161. Reproduction in whole or in part is permitted for any purpose of the United States Government. **The work of this author is supported in part by the National Science Foundation under Grant MCS79-02024. Part of the work of this author was also done at the Air Force Flight Dynamics Laboratory under Contract F33615-76-C-3145.

513

514

P. R. Krishnaiahand Jack C. Lee

discussed in this section. Likelihood ratio test for the equality of the means, equality of the variances and the equality of the covariance matrices simultaneously is discussed in Section 6. In Section 7, we discuss the likelihood ratio test for c o m p o u n d symmetry whereas Section 8 is devoted to the likelihood ratio tests for certain linear structures on the covariance matrices. Applications of the tests on linear structures in the area of components of variance are also discussed in Section 8. Tables which are useful in the application of most of the tests discussed in this chapter are given at the end of the chapter (before References). Most of these tables are constructed by Chang, Krishnaiah and Lee b y approximating certain powers of the likelihood ratio test statistics with Pearson's type I distribution. In discussing the distributions associated with the likelihood ratio tests, we emphasized on approximations partly because the exact expressions are complicated and partly because the accuracy of the above approximations is good. F o r a detailed discussion of some of the developments on exact distributions, the reader is referred to Mathai and Saxena (1973).

2.

T e s t s o n m e a n vectors

In a number of situations, it is of interest to test the hypotheses on m e a n vectors of multivariate normal populations. For example, an experimenter m a y be interested in testing whether the means of various variables are equal to specified values. Or, he m a y be interested in finding out as to whether there is significant difference between the populations when the covariance matrices are equal. In this section, we discuss the likelihood ratio tests on mean vectors. We will first discuss certain distributions useful in the application of these procedures.

2.1.

Hotelling's T 2 statistic and determinant of multivariate beta matrix

Let y be distributed as a p-variate normal with mean vector p and covariance matrix Z. Also, let A be distributed independently of y as the central Wishart distribution with n degrees of freedom a n d E ( A ) - - n E . Then, T 2= ny'A-~y is known to be Hotelling's T 2 statistic. W h e n p =0, Hotelling (1931) showed that T Z ( n - p + 1)/pn is distributed as the central F distribution with (p,n - p + 1) degrees of freedom. It is known (e.g., see Wijsman, 1957; Bowker, 1961) that T Z ( n - p + 1)/pn is distributed as the noncentral F distribution with ( p , n - p + 1) degrees of freedom a n d with p ' E - ~ p as the noncentrality parameter.

Likefihood ratio tests f o r mean vectors and covariance matrices

515

Next, let

Bp,q,n"-[A~+A2 [

IAI[

(2.1)

where A l : p p and A2: p p are independently distributed as central Wishart matrices with n and q degrees of freedom respectively and E ( A i / n ) = E ( A z / q ) = E. The matrix A l ( A 1 -t-A2)- 1 is someth-nes referred to in the literature as a multivariate beta matrix. Schatzoff (1966), Pillai and Gupta (1969), Mathai (1971) and Lee (1972) considered the evaluation of the exact distribution of Bp,q, n and computed percentage points for some values of the parameters. We will now discuss approximations to the distribution of Bp,q, n. Rao (1951) approximated the distribution of

Cms+2X~l \ 11,

-- Dp,q,n)/ZHJp,q, n

-i/s ~-- hi/,

with F distribution with (2r, ms + 2~) degrees of freedom where 2~--- - (pq I - 2), r = 5pq, m = n - (p + q + 1) and s = [(p2q2_ 4 ) / ( p 2 + q 2 _ 5)]1/2. Here, we note that n ( 1 - Bl,q,n)/qBl,q, n is the F distribution with (q,n) degrees of freedom. Also, it is known (see, Wilks, 1935) that I"n \
D/1 1\ R 1/2 ~/.~

U 2 , q , n / / "lU2,q,n

1/2

is distributed as F distribution with 2q and 2(n - 1) degrees of freedom. Table 1 gives a comparison of the exact values with the values obtained by using approximation suggested by Rao. The exact values are taken from Lee (1972).

Table 1 Comparison of Rao's approximation with the exact distribution of Bp.q,,, for a =0.05 p=5,q=7 n 5 8 9 10 12 Exact 0.06920 0.0014212 0.0032991 0.0062212 0.0153692 Rao 0.053394 0.0014755 0.0033610 0.0062627 0.0152890

516

P. R. Krishnaiah and Jack (Z Lee

When the F approximation is used, it is important to keep in m i n d that the denominator degree of freedom (that is ms + 2X) could be noninteger, and hence careful interpolation is in order. The entries under "Exact" give the exact values of c for a =0.05 where
e [ Bp, q,n < C ] = ( ] - o O .

(2.2)

The entries in the last column are approximate values of c obtained by using the approximation of Rao. Here we note that a n u m b e r of likelihood ratio test statistics including Bp, q,n can be expressed as a product of beta variates. Tukey and Wilks (1946) approximated the distribution of the product of beta variables with a single beta variate. Using the m e t h o d of R a o (1951), Roy (1951) obtained an asymptotic expansion for the product of beta variates; the first term in this series is beta distribution. Using the first four moments, Lee et al. (1977) approximated the 1/b distribution of Bp, q,n with Pearson Type I distribution where b is a properly chosen constant. They chose b to be equal to 8 or 2 according as M < 5 or M > 30 where M - - n - p + 1; for the remaining values of M, they chose b to be equal to 4. Using the approximation described above, Lee et al. (1977) computed the values of c 1 for a--0.05, 0.01, M - - 1 ( 1 ) 20, 24, 30, 60, 120 and certain values of (p, q) where P[ C, < c, ] = ( 1 - a), C I = - { n - -1i ( p (2.3)

q+

where loga denotes the natural logarithm of a. Tile above values are given in Table 7 at the end of this chapter. Also given in Table 7 are some values of c I computed by Schatzoff (1966) f o r p and q less than 11. For tables for some other values of p, q and n, the reader is referred to Pillai and G u p t a Table 2 Comparison of LKC approximation with exact values for a = 0.05
p=3,q=3
M cI Exact

p=5,q=7
cI Exact

1 5 10 16 20 40

1.354 1.034 1.010 1.005 1.003 1.001

1.359 1.035 1.011 1.005 1.003 1.001

1.530 1.090 1.035 1.017 1.012 1.003

1.535 1.090 1.035 1.017 1.012 1.003

Likeflhood ratio tests for mean vectors and covariance matrices

517

(1969), Mathai (1971) and Lee (1972). The above authors computed the tables by using methods different from the one used by Lee, Krishnaiah and Chang (LKC). Here we note that the distribution of Bp, q, n is the same as the distribution of Bq,p,n+q_ p. Table 2 gives a comparison of the exact values of c~ for a = 0.05 with the corresponding values obtained using L K C approximation. The above table is taken from Lee et al. (1977). We now discuss the likelihood ratio tests on mean vectors.
2.2. Test specifying m e a n vector

Let x l , . . . , x u be distributed independently as multivariate normal with mean v e c t o r / t and covariance matrix Y,. Also, let Hi:/~=/~o where/~o is known. In addition, let
h 1= N ( N -

1)(2-.- I%)'A - ' ( ~ -

l~o)

(2.4)

where N:~ = Y.~=i a9 and A = G~Y=l(xj - Y.)(xj - ~.)'. Then, it is known that the likelihood ratio test for H 1 involves accepting or rejecting H l according as

where P[~I < F~IIHI ] = (1 -- a). (2.5)

The statistic ;k1 is Hotelling's T 2 statistic and (n - p + l ) ; k l / n p is distributed as central F distribution with (p, n - p + 1) degrees of freedom when H I is true, where n = N - 1. As an illustration, we use Example 4.1 of Morrison* (1967, p. 122) where p = 2 , N = 101, :~. = (55.24, 34.97)' A = 100(210.54 126.99 126.99) 119.68

and/~0=(60,50)'.

The test statistic in this case is 126.99 ] - l ( 55.24-60 ] 119.68 k 34.97-50 ]

X~=101(55.24-60,34.97-50)[ 210.54126.99 =357.43.

*From Multivariate Statistical Methods by D. F. Morrison. Copyright 1967, by McGraw-Hill, Inc. Used with permission of McGraw-Hill Company, Inc.

518

P. R. Krishnaiah and Jack C. Lee

Suppose a = 0 . 0 1 . T h e n f r o m the F table, F2,99(0.01)=4.8 where F~,b(a ) denotes u p p e r 100a% value of F distribution with ( a , b ) degrees of freedom. H e n c e F0.01' 1= 9.7. Since h I > 9.7, the null hypothesis that ~' = (60, 50) is rejected.
2.3. Test for the equality o f m e a n vectors

F o r i = 1,2 ..... q, let x~l . . . . . xm, be distribated independently as multivariate n o r m a l with m e a n vector ~ and covariance matrix E. Also, let Hz: ~l . . . . . ~q. Then, the likelihood ratio test statistic (WiNs, 1932) for testing H 2 is given b y X2= ~ where N = N.q - _ iv, . 1Ni, . .N.2. = Z q_ l ~ t iv ~ l X i t , Nixi.--~,t=lXit ,
q N~a = Z
i=1

(2.6)

n=N-q,

Ni Z ( x i t - .2i.)(xit- xi.) t,
t=l

q
i~l

iv,
t~l

N :o = E E (xi,- .2
W h e n H 2 is true, the statistic h2 is distributed a s Bp,q ~ l,n where Be,q, r was defined b y (2.1). N o w , let C 2 = - ( n l ( p _ q+2))log)t2/Xff(q_l)(a ). Then, we accept or reject H 2 according as C2X c 2 where

e[ C2

(2.7)

As an illustration we give an artificial example. Suppose there are four three-dimensional populations with samples of sizes N i --5, i = 1, 2, 3, 4 a n d the null hypothesis H 2 is to be tested at 0.05 level. If [Zal--2.45 a n d [~o,[ =3.12, then )~2 :=0.79. In our case N = 2 0 , p = 3 , q = 4 , n= N-q=20-4 = 16. In order to complete the test, we have to c o m p u t e C 2 = - { 1 6 - 3(3 1 -4+2))1og0.79/16.9190=0.2160. Now M=16-3+1=14, and we have c 2 = 1.006 from the table in Lee (1972). Thus the null hypothesis that the p o p u l a t i o n m e a n vectors are equal is accepted. W e will now consider testing t h e hypothesis H 2 when q - - 2 . In this case, let
F = N'N2(NN - 2) (.21. - .2z.)'A - 1(.2 L -- -~2.)

(2.8)

= 16. In order to complete the test, we have to c o m p u t e

-{ 16-

Likelihood ratio tests for mean vectors and covariance matrices

519

where

N,.~,.--.= ~JY, xlj, N2.~2. = N,,)vz, x2y and


2
i=1

Ni

A = ~,, ~ ( x , - i.)(xi,- ~.)'.


t=l

(2.9)

~I1ae hypothesis H 2 is accepted or rejected according as

FX G2
where

?[F<&,21H2]=(1-o0.

(2.1o)

When H 2 is true, ( N - p - - I ) F / p ( N - 2 ) is distributed as central F with (p,N-p-1) degrees of freedom. The above test procedure is due to Hotelling and it is equivalent to the likelihood ratio test. As an illustration, we use Example 4.2 of Morrison* (1967, p. 126) where p = 4 , N l =37, N z = 12, ~1 =(12.57, 9.57, 11.49, 7.97)', :g2.= (8.75, 5.33, 8.50, 4.75)' and [11.2553 4 I A = 7 9.4042 7.1489 [ 3.3830 9.4042 13.5318 7.3833 2.5532 7.1489 7.3833 11.5744 2.6170 3.3830] 2.5532 / 2.6170|" / 5.8085J

F r o m the data, we have F = 22.05. Suppose a = 0.01. Then from F table, F0.m,2~3.8. Since {(N - - 2 ) p F o . o l , 2 / ( N - - p - 1)} ~- 15.6 which is less than 22.05, the null hypothesis that two mean vectors/L 1 and ~2 are equal is rejected at a = 0.01 level.

3
3.1.

Test on independence of sets of variates


The likelihood ratio test f o r independence
.....

i Let x j - ( x tv ..... x/) for j = l , 2 . . . . . N. We assume that x 1 x N are distributed independently as multivariate normal with m e a n vector # and

*From Multivariate Statistical Methods by D.F. Morrison. Copyright , 1967, by McGraw-Hill Company, Inc. Used with the permission of McGraw-Hill, Company, Inc.

520

P R. Krishnaiah and J a c k C. Lee

covariance matrix X. Also, let E ( x i ) = t t i and E ((Xi--l, gi)(Xj--l.tj) t )=X/j where x i is of order Pi 1 a n d s = P l + " " " +Pq. I n this section, we wish to test the hypothesis H 3 where H3: X o = O (3.1)

for i vaj = 1..... q. T h e p r o b l e m of testing H 3 is of interest in testing certain hypotheses on m e a n vectors W h e n p l . . . . . pq = p and H 3 is true, the test procedure in Section 2.3 can be used to test the h y p o t h e s i s / t 1. . . . . /~q. N o w , let

All
A21 A -Aql

. .

Alq

..

A2q

Aqq

where
N Aim = E j=l ( X U -- X i . ) ( X m j - Xm.) t , N

Nxc=

X j=l

xij,

n=N-1.

Wilks (1935) derived the likelihood ratio test for testing H a. A c c o r d i n g to this test, we accept or reject H 3 according as
- 2 logX 3 X c a (3.2)

where
P [ - 2 l o g ~ 3 < c 3IH 3 ] = (1 - ,x)

(3.3)

and 3 = qIAI (3.4)

II IA.I
i=l

W h e n q = 2, the distribution of ~k3 u n d e r H 3 is the same as the distribution of Bpl,P2,n_p2. Box (1949) o b t a i n e d an a s y m p t o t i c expression for a class of likelihood ratio test statistics as a linear c o m b i n a t i o n of chi-square variables This class includes test statistics for the equality of m e a n vectors, multiple

Likelihood ratio tests for mean vectors and covariance matrices

521

independence of sets of variables, sphericity, homogeneity of the covariance matrices, etc. Here we note that Wald and Brookner (1941) and Rao (1948) approximated the distribution function of the determinant of multivariate beta matrix with a linear combination of the distribution functions of chi-square variables. The number of terms given by Box for the asymptotic expression is not sufficient to get the accuracy desired in several practical situations. So, Lee et al. (1976) obtained higher order terms. Consul (1969) and Mathai and Rathie (1971) obtained exact expressions for the distribution of X3. Lee et al. (1977) approximated the distribution of ?t]/4 with Pearson's Type I distribution. The following table gives a comparison of the values obtained by approximation of Lee, Chang and Krishnaiah (LCK) with the corresponding values obtained using Box's expansion by taking terms up to order n - 13 The entries in Table 3 are taken from Lee et al. (1977). In Table 3, a 1 and a 2 are the values of a if we use L C K approximation and Box's asymptotic expression respectively where P[ V3 <c31H3]--(1- a) and V3 = -21og?~ 3. Table 8 gives upper 5% and 1% points of the distribution of V3 for some values of the parameters. For the parameter sets not given in the table, an F approximation developed by Box (1949) can be used. In order to use this approximation, we compute V3 = - N log )t3,

ao=
D1 = ~ ( 2 G 3
1

-Ep;, j=l
+ 3 C2),
- G2),
1 fl = "~ G2"

1 D2= 12N~(G4+2G3

We then check the sign of D 2 - D~. If D 2 - D~ is positive, then F = V 3 / b is distributed approximately as an F distribution with fl and f2 degrees of
Table 3 Comparison of LCK approximation with asymptotic expression of order n - 13 q 11 10 15 20 30 3 1.913 1.187 0.860 0.555 3 ~1 0.05 0.05 0.05 0.05 1x2 0.0499 0.0500 0.0500 0.0500 3 4.978 2.947 2.099 1.333 5 ~i 0.05 0.05 0.05 0.05 ~2 0.0488 0.0497 0.0499 0.0500

522

P. IL Krishnaiah and Jack C. Lee

freedom where f2 = ( f l + 2 ) / ( D z - D~) and b = f l / ( 1 - D 1 - ( f l / f 2 ) ) . On the other hand, if D E - D1z is negative, then F = f 2 V 3 / f l ( b V3) is distributed approximately as an F distribution with fl and f2 degrees of freedom where f2--(fl + 2 ) / ( D ~ - 0 2 ) , b = j ~ / ( 1 - 01 +(2/f2) ). The null hypothesis is re o jected at a level if F >Fy,,f2(a) where FA,A(a ) is the upper 100a% value of the F distribution with fl and f2 degrees of freedom. Recently, Mathai and Katiyar (1977) computed the percentage points of )t3 by using an exact expression. To illustrate the likelihood ratio test for mutual independence of variables, we use the data of Abruzzi (1950) on an industrial time study with a sample of size 33 instead of 76. Data was collected o n the length of time taken by several operators on the following variables: (i) pick up and position garment, (ii) press and repress short dart, (iii) repositio n garment on ironing board, (iv) press three quarters of length of long dart, (v) press balance of long dart, (vi) hang garment on rack. The following summaI2 of the data on sample mean vector and covariance matrix based upon the first three variables are taken from Anderson* (1958, pp. 240-241). f 9.47] 57= 125.561' [ 13.25j [ 2.57 s = / 0.85 [ 1.56 0.85 37.00 3.34 1.56] 3.34/. 8.44j

In this problem we have s = 3 , p i = l , q = 3 , X3=0.8555 and hence -21og)~3=0.31214. From Table 8 with a = 0 . 0 5 we find that c3=0.518 as M = N - s - 4 = 33 - 3 - 4 = 26. Thus the null hypothesis of mutual independence is accepted at level of significance a = 0.05.

4.

Tests

on covariance

matrices

The test discussed in Section 2.3 for the equality of mean vectors is valid when the covariance matrices of the populations are equal. So, it is of interest to test for the homogeneity of the covariance matrices. It is also of interest to test for the structure of the covariance matrix in single sample case since we can take advantage of the structure in testing the hypothesis on mean vector. In this section we shall discuss tests concerning the covarianee matrix of a single population as well as the covariance matrices of q independent populations. For the single population case we shall discuss the sphericity test where the covariance matrix is proportional to a given matrix, and shall also discuss the test of the hypothesis that the
*From An Introduction to Multivariate Statistical Analysis by T. W. Anderson. Copyright 1958 by John Wiley and Sons, Inc. Used with permission of the publisher.

Likelihood

ratio tests Jbr mean vectors and covariance matrices

523

covariance matrix is equal to a given matrix. F o r the case of several populations, we shall discuss the homogeneity of covariance matrices across the populations, as well as the multiple homogeneity of covariance matrices where several subgroups of homogeneous covariance matrices are established.
4.1. T h e sphericity test

The likelihood ratio test of the null hypothesis


H 4 : ~] = 02~]0

(0 2 is unknown, :E0 known)

(4.1)

on the basis of the r a n d o m sample x~ . . . . . x N from a p-variate normal distribution with m e a n vector/L and covariance matrix E is based on the statistic

)t4 =

IA~o'l

(trAZol/p)"

(4.2)

N o and trB denotes the trace of where A _ - N, = 1( x t - x , ) ( x t - ~ .), ' N 2 . =Y.N=lx square matrix B. The hypothesis H 4 is accepted or rejected according as X4<>c4 where P [ X 4 / > e 4 I H 4 ] = ( 1 - a ) . The above statistic ~4 for testing H 4 was derived by Mauchly (1940). The null hypothesis (4.1) is equivalent to the canonical f o r m H4: ~ ] - o 2 I ( I being an identity matrix) as we can transform x I..... x u to Yl . . . . . YN by .l~ = Gxj where G is a matrix such that G Z o G ' = I . Thus, the null hypothesis H 4 is equivalent to the hypothesis that we have a set of p independent r a n d o m variables with a c o m m o n variance v 2. Consul (1969), Mathai and Rathie (1970) and Nagarsenker and Pillai (1973a) obtained expressions for the exact distribution of ~4. Table 9 gives the values of c 4 computed by Nagarsenker and Pillai f o r p = 4(1)10 and a = 0.01, 0.05 where e4 is defined

by
e [ ~ 4 i> c41H4] = (I - ~).
(4.3)

Lee et al. (1977) approximated the distribution of ~t41/4with Pearson's Type I distribution. Table 4 gives a comparison of the values of c 4 obtained by using L C K approximation with the corresponding values obtained by Nagarsenker and Pillai using exact expression. F o r the parameter sets not listed in Table 4, we can also use the following asymptotic expression of

524

P. 1L Krishnaiah and Jack C. Lee

Table 4 Comparison of LCK approximation with exact values for a ~ 0.05 p n 6 10 15 21 33 41 LCK 0.0169 0.1297 0.2812 0.4173 0.5833 0.6507 4 Exact 0.0169 0.1297 0.2812 0.4173 0.5833 0.6508 LCK 0.0013 0.0492 0.1608 0.2877 0.4663 0.5453 5 Exact 0.0013 0.0492 0.1608 0.2876 0.4663 0.5453 LCK 0.0029 0.0368 0.1111 0.2665 0.3515 7 Exact 0.0030 0.0368 0.1111 0.2665 0.3515

Box (1949):
P[ - nplogX 4<z] =

= P { d ~z)'~(.,02(P (x~+4~z ~ - P ( ~ < z ) )


+ O ( n -3) (4.4)

where X~ denotes the X2 random variable with f degrees of freedom,

f=p(p+ l ) - l, n = N - 1 ,
0=1
~2

2p2+p+2

6pn
(p + 2)(p - 1)(p - 2)(2p 3 + 6p 2 + 3p + 2)
288p2rt2p 2

(4.5) (4.6)

The null hypothesis is rejected at level a if the computed value of -nplog~ 4 (used as z in (4.4)) provides probability greater than 1 - a. Of course, a rougher approximation will be to approximate X2= -nplog~ 4 as a X2 distribution with f degrees of freedom, with the error of the remaining terms of the order n -a. The null hypothesis is rejected at level a if X2 >X~(a), where X~(a) is the upper 1004% value of X2 distribution with f degrees of freedom. For illustration we use the data described in Section 3.2 by taking all six variables. In this case, ~.=(9.47, 25.56, 13.25, 31.44, 27.29, 8.80)', [ 2.57 [0.85 ,,/1.56 A--,511.79 0.85 37.00 3.34 13.47 7.59 0.52 1.56 3.34 8.44 5.77 2.00 0.50 1.79 13.47 5.77 34.01 10.50 1.77 1.33 7.59 2.00 10.50 23.01 3.43 0.42 ] 0.52[ 0.50 / 1.77 / 3.43[ 4.59J

/1.33
L0.42

Likelihood ratio tests for mean vectors and covariance matrices

525

and we w a n t to test the null hypothesis that the p o p u l a t i o n covariance matrix Z has the f o r m ~ = o2I. Substituting into (4.2) with Z o = I we have ~4--0.0366 which is m u c h smaller t h a n 0.6641 (for N = 8 0 ) and hence the null hypothesis that Z = o2I is rejected at level of significance a = 0.05. 4.2. Test specifying the covariance matrix

Consider the p r o b l e m of testing the null hypothesis Hs: Y~= Z 0 (~o is known), (4.7)

on the basis of a r a n d o m sample x~ ..... x u f r o m a p-variate n o r m a l distribution with m e a n vector # a n d covariance matrix E. T h e likelihood ratio test for testing H 5 was derived b y A n d e r s o n (1958). The modified likelihood ratio test statistic (obtained b y c h a n g i n g N to n = N - 1 in the likelihood ratio statistic) is ~5=(e/n)Pn/ZlAZol,n/2etr(_ 1 AZ0_ i) (4.8)

where n = N - 1 a n d A = Y,tu= l(xt - ~ ) ( x t -- ~.)'. Since the null hypothesis (4.7) is a special case of (4.1) with 0 2 = 1, we see that (4.7) is equivalent to the hypothesis that we have a set of p independent r a n d o m variables each with unit variance. A c c o r d i n g to the likelihood ratio test, we accept or reject H 5 according as -21og~sXc 5 where P [ - 2 1 o g ~ 5 < c51H5 ] = (1 - a). (4.9)

K o r i n (1968) and N a g a r s e n k e r and Pillai (1973b) gave values of c 5 for p = 2 ( 1 ) 1 0 and a = 0 . 0 1 , 0.05. Lee et al. (1977) a p p r o x i m a t e d the distribution of 2~ 1/34 with Pearson's T y p e I distribution. Table 5 taken f r o m the Table 5 Comparison of LCK approximation with exact expression for a = 0.05 p n 6 7 10 11 13 LCK 25.76 24.06 21.75 21.35 20.77 4 Exact 25.76 24.06 21.75 21.35 20.77 n 8 9 10 15 20 25 6 LCK 49.24 45.82 43.62 38.71 36,87 35.89 Exact 49.25 45.83 43.63 38.71 36.86 35.88 n 12 13 14 15 20 25 10 LCK 119.08 111.15 105.76 101.83 91.28 86.51 Exact 119.07 111.15 105.76 101.82 91.28 86.52

526

P. tL Krishnaiah and Jack C. Lee

above paper gives a comparison of the exact values with the values obtained by using L C K approximation for a = 0.05. Table 10 gives the values of c 5 for a = 0 . 0 1 , 0.05; these values are taken from Nagarsenker and Pillai (1973b). When n exceeds the values given in the table, either X2 or F approximation can be used. These approximations are given in Korin (1968). (a) X2 approximation. The statistic X2= - 2 ( 1 - D 1 ) l o g ) ~ 5 is approximated as a X2 distribution with fl = P(P + 1) degrees of freedom, where Dim- ( 2 p + 1 - 2 / ( p + 1)}/6n (b) F approximation. The statistic F = - - 2 1 o g 2 t s / b is approximated as an F distribution with fl and f2 degrees of freedom where f~+2 D2- D 2 ' D 2 = (p - 1)(p + 2 ) / 6 n 2, b=

f,
1-D,-(f~/f2) "

Again in the approximation, the null hypothesis is rejected if the computed value is too large, i.e., X2 >X:,(a) 2 or F>Fy,,:2(a ), where X~(a) is the upper 100a% values of X2 distribution with fl degrees of freedom and Fy,,A(a ) is the upper 100a% value of F distribution with fl and f2 degrees of freedom.

4.3.

Test for the homogeneity of covariance matrices

As in Section 2.3, let Xil..... XgN, ( i = 1,2 ..... q) be distributed independently as multivariate normal with mean vector/~; and covariance matrix Z i. We are interested in testing the hypothesis H 6 where
H6: X 1..... Xq.

(4.10)

Wilks (1932) derived the likelihood ratio test for 1-f6. The modified likelihood ratio statistic (obtained by interchanging N; with n i = N i - 1 in the likelihood ratio statistic) is
q

r[ IA.l~/~n ~n/~
~6 :=: i=l q q g=l

(4.11)

IE
i=1

Aii] n/2 ~[ npng/2

Likelihood ratio tests f o r mean vectors and covariance matrices

527

where
q Ni Ni

n=
i=1

hi,

Ni2"c = ~ x,) and


j=l

Aii= ~_, (xg-2"L)(x~j--~c-/.)'o


j=l

Lee et aL (1977) gave values of c6 for ni=no=(p+l)(1)20(5)30 , p = 2(1)6, q--2(1)10 and a=0.01, 0.05, 0.10, where c6 is defined by
P [ - - 21og~ 6 ~c61H6] == 1 - 0~.

(4.12)

Values of c6 for some values of the parameters are given in Table 11. We reject the null hypothesis at level a when the computed value of -21og2t 6 is larger than c6. For the parameter sets not given in the table, either X: or F approximation due to Box (1949) can be used. The F approximation has been adopted by SPSS in the Option Statistic 7 of the Subprogram Discriminant (Chapter 23 Discriminant Analysis). (a) X2 approximation. The statistic X2=(-210g~k6)(l-D1) is approximately distributed as X2 with f l = p ( p + 1 ) ( q - 1 ) degrees of freedom, where Di=

_,,_-_,

j = l nj

(2p2 + 3 p - 1)(q+ 1) 6(p + 1)qn 0

if nj = n o ( j = 1..... q)o

(b) F approximation. In order to use this approximation, we compute

D2=(P-1)(p+2){

1 }

6-(q--i5

i=, 4

n2
if r9 = n o ( j = 1..... q).

= ( p - 1)(p+2)(q2+q+ 1) 6q2n ~

We then check the sign of D 2-- D~ where D l was given in (a). Case 1: D 2 - D1z >0. F=(--21ogX6)/b is distributed approximately as an F distribution with fl and f2 degrees of freedom, where f2 = (fl + 2)/ (D 2 - Di2) and b = f / ( 1 - D 1-(f~/f2))Case 2: D 2 - D~ < 0. The statistic F=(-2f21ogX6)/fl(b+21ogX6) is approximately distributed as the F distribution with fl and f2 degrees of freedom where f2 = (fl + 2)/(D~ - D2) and b = f z / ( 1 - D l + 2/fz).

528

P. R. Krishnaiahand Jack C. Lee

As an illustration we use Example 4.11 in Morrison* (1967, p.153), where p --2, q--2, n 1= n2=31,

9'881,
4.4

1 2.52

(,90 1009,

and -21og)~ 6 = 2.82. Since c 6 is about 8 which is m u c h larger than 2.82, the null hypothesis that the covariance matrices are homogeneous is accepted at the level of significance a = 0.05.

Test for the multiple homogeneity of covariance matrices

When the null hypothesis of the homogeneity of q covariance matrices is rejected, the multiple homogeneity of covariance matrices considered in this section should be the next hypothesis to test concerning the covariance matrices. This hypothesis is also of interest in studying certain linear structures on the covariance matrices (see Krishnaiah and Lee, 1976). Here we will use the notations of Section 4.3. The null hypothesis to be tested is
~ql ~ ~q~

~ql+l

=...

H7:
L y'q~ '+ I ..... --

Zq,

where qJ=O, (~* =E~=lqi and q~=q. If the null hypothesis is true, then among the covariance matrices of q independent populations, there are k groups of homogeneous covariance matrices, and qi denotes the n u m b e r of populations in/tit group. The modified likelihood ratio statistic is
q

H IAiilnil "~/z
3k 7 -i= 1 (4.14)
I"712

k j-1

~*

E
i~qff_l + l

Aii/n;
_ qY'

where ni and Aii were defined in Section 4.2, and n~--E;J=q;_,+ln i.

*From Multivariate Statistical Methods by D.F. Morrison. Copyright 1967 by McGrawHill, Inc. Used with permission of the publisher.

Likelihood ratio tests for mean vectors and covariance matrices

529

Lee et al. (1977) gave values of c 7 for n i = no; m = no-- p = 1(1)20(5)30; p - - 1,2,3,4; q = dk; k = 2 , 3 ; a n d a =0.05, where c 7 is defined by P [ - 2 Iog)k 7 ~<cT]H 7 ] = 1 - a. (4.15)

These values are given in Table 12 for a = 0 . 0 5 . We reject the null hypothesis at level a when the computed value of - 2 1 o g A 7 is larger than c 7. Because of the restriction q = dk, with k and q specified, d is automatically specified. Thus, in the table, for example, if k = 2 and q = 6, then d = 3. This means that there are two groups of covariance matrices each with three homogeneous covariance matrices. F o r the parameter sets not given in the table, the simple X2 approximation can be used. We approximate - 2 1 o g ) ~ as a X2 distribution with f = p ( p + l ) ( q - k ) degrees of freedom where 2~ is obtained from ~7 by replacing n i by N i and nj* by
N j - Xr=qT_ m i.

As an illustration we give an artificial example where p = 2, n o = n I = n z = n 3 = n a = 3 1 , k = 2 , d---2, q=4, Al, A 2 are as given in the previous section, and A 3 - 3 1(2.11 3.12 2.1118.88 ] and A4--31( 4"012.44 10.55) 2.44

Substituting into (4.14) we have - 2 log)~ 7 --3.724. Since M = 3 1 - 2 - - 2 9 we have c7= 13.05 for a = 0 . 0 5 . Thus, the null hypothesis that Z 1=Z2, Y~3=Y'4 is accepted.

5.

Tests on mean vectors and eovariance matrices simultaneously

In this section we shall cover tests concerning the m e a n vector and covariance matrix of a single population as well as those of q independent populations.
5.1. Test specifying mean vector and covariance matrix

Let x~ ..... xN be a r a n d o m sample from a p-variate normal distribution with mean vector/L and covariance matrix E. The null hypothesis to be tested is

H8:

{ X = X0

(5.1)

where ~ and Y~0 are known. When the null hypothesis is rejected, then either/~V~#o, or E ~ E o , or # ~ # o and Z # E o.

530

P. R. Krishnaiah and dack C. Lee

The null hypothesis is equivalent to the canonical form H~: ~ = 0, X = I as we can transform x 1..... xN to yl ..... YN by .~ = G ( x j - / % ) where G is a matrix such that GY~oG'=I. Thus, the null hypothesis (5.1) is equivalent to the hypothesis that we have a set of p independent r a n d o m variables each with zero mean and unit variance. We, therefore, have complete knowledge of the population under consideration: The likelihood ratio statistic for testing the null hypothesis H 8 is known (see Anderson (1958)) to be - 21ONX8= Nlog[Xot + NtrXo-1 [ S + ( 2 -/L0)(2f-/~0)' ] - Nlog[S[where
N N

Np

(5.2)

NS= E (xj-x.)(xj--x.)',
j=l

N2----E xj.
j=l

Nagarsenker and Pillai (1974) gave values of c 8 for p = 2(1)6, a = 0.01, 0.05 where c s is defined by

e [ - 21ogX8.<

= (1 - . ) .

(5.3)

These values are given in Table 13. Chang et al. (1977) approximated the distrfbution of X]/b with Pearson's Type I distribution where b is a properly chosen constant. T a b l e 6 gives an idea about the accuracy of the above approximation. The values under the column C K L are the values obtained by using the approximation of Chang, Krishnaiah and Lee, whereas the values under the column Exact are the values given by Nagarsenker and Pillai (1974). The following approximations were suggested by Korin and Stevens (1973).
Table 6 Comparison of CKL approximation with exact expression for a = 0.05 p=2 N- 1 6 7 10 11 13 21 CKL 13.69 13.26 12.55 12.40 12.19 11.75 Exact 13.69 13.27 12.55 12.40 12.19 11.75 N- 1 8 9 10 15 21 CKL 20.91 20.38 19.98 18.85 18.26 p=3 Exact 20.92 20.38 19.98 18.85 18.26 N- 1 12 13 14 15 21 25 CKL 51.70 50.48 49.50 48.69 45.79 44.75 p=6 Exact 51.70 50.48 49.50 48.69 45.79 44.75

Likelihood ratio tests for mean vectom and covariance matrices

531

(a) X2 approximation. The statistic X2= - 2 1 o g ~ 8 / ( 1 - - D I ) 2 is approximated as a X2 distribution with fl = P(P + 3) degrees of freedom, where

Dl=(2pZ+9p+ ll)/6N(p+ 3).

(b) F approximation. The statistic F = - 2 1 o g h s / b is approximated as an F distribution with fl and f2 degrees of freedom where

f2=(f,+2)/(D2--D?)~ D2=(p3+6p2+llp+4)/(6N2(p+3)) and b=fl/(1-D,-(f]/f2)).


In the approximation the null hypothesis is rejected if the computed value is too large, i.e., XZ> X~(a) or F >F/,,/2(o0. As an illustration, we give an artificial example where N =28, p = 2, ~ (0.5, 1.2)' and

s_-(15 05)
0.5 0.9 Suppose we want to test the null hypothesis

H= (/_t = 0,
Z=I at level of significance a =0.05. Substituting into (5.2) we obtain - 2 1 o g X 8 -- 55.85 which is m u c h larger than c 8 = 11.591 and hence the null hypothesis that/L = 0 and Z = I is accepted at level of significance a = 0.05.

5.2. Testfor the homogeneityofpopulatiom


In this section, we discuss the likelihood ratio test for H 9 where Hg: / ~l . . . . .
t

~gl . . . . .

Zq, /~q

(5.4)

and ~i and Xi are the m e a n vector and covariance matrix of ith population. In this section, we use the same notation as in Section 4.3. The likelihood ratio test statistic for testing H 9 was derived by Wilks (1932). The modified likelihood ratio test statistic (see Anderson, 1958) is given by
npn/2
X9 = - -

q II
i= 1

IAiil n'/2
(5.5)

i~=lnP'~/zi~=lAiiWi~=lNi(xi.- x..)(x,- x..)'ln/2


where N2.. = ~]Y~x 0.

532

P. R. Krishnaiah and Jack C. Lee

Chang et al. (1977) gave values of c9 for ni=n o, M = n o - P = !(1)20,25,30; p = 1(1)4; q=2(1)8 and a =0.05, where c9 is defined by P[ - 2 tog X9 <
c9[H9] =

1 - a.

(5.6)

We reject the null hypothesis at level a when tile computed value of -21og?t 9 is larger than c9. Table 14 gives the values of c 9 computed by Chang et al. (1977). For the parameter sets not given in the table, we can use Box's asymptotic expansion P [ - 2plogX 9 <z] = P[X~ < z ] +WE(P [ X~+4<z] -- P[ Xf ~ z ] ) + O(n-3) (5.7)

where X 2 denotes the X2 random variable with f degrees of freedom, l q f--- 5( - 1)p(p + 3),

g =1 ng

6(-~- 1)-(p+ 3)

n(p + 3 ) '

(5.8)
(5.9)

o2= 2 - ~ p 2 6

1 n2

n2

l t ( p 2 - 1)(p +2)
1.
1 2 ( q - 1) n2

n.
36(q-1)(p-q+2) 2 n2(p + 3)

(7 q-2q2 + 3pq--2p2-6p-4)

The null hypothesis is rejected at 100a% level if the computed value of -2plogX9 (used as z in the formula (5.7)) provides probability greater than 1 - a. Of course, a rougher approximation will be to approximate X2= -2plogX 9 as a Xa distribution with f degrees of freedom, with error of remaining terms of the order n -2. The null hypothesis is rejected if

L i k e l i h o o d ratio tests f o r m e a n vectors a n d covariance matrices

533

For illustration purpose we give an artificial example where p - - 2 , q = 2, N 1--- N 2-- 11, :~1.= (0.5, 1.2)', ~2. = (0.7, 1.4)' 0.8 (/~1--1) -1A 11=( 0:55 ~:~) and (N 2- 1)-1A22 ~--(0.81"71.5 )" Suppose we want to test the null hypothesis

H10: E l = Z 2 at level of significance a = 0.05. Substituting values of x~., x2. and Aii into (5.5) we obtain - 2 1 o g ~ = 0 . 8 7 1 1 which is much smaller than c9--12.01 and hence the null hypothesis t h a t / h =/L2 and Z l = Z a is accepted at level of significance a = 0.05.

~1 =/x2,

6.

Test for equality of means, variances and covariances

Let/~' = ( ~1..... ~v) and E = (o~j) respectively denote the mean vector and covariance matrix of p-variate normal population. Also, let

H~':

I'Ll .....

P~

H~': oll . . . . .
H~': %-=o12

opp

(i4=j).

Likelihood ratio test criteria for H~ f)H~ and HI~N H~ f)H~ were derived by Wilks (1946). The likelihood ratio test statistics for H ~ N H ~ and H~ A H~' fq H~ are given by Xl0 and Xll respectively where )~10= ISI (s2~(1 - r ~ - 1 ( 1 (6.1)
-t- ( p - -

l)r)'

(6.2)

(p -- 1)s2(1 - r) +

/=1

( x i . - x..) 2 J

NS = A, and A was defined earlier in Section 2.2. Also ps2= tr S, pY. = ]~,e-=l~i., x.=(:~ 1...... ~p.)' and p(p-1)rsZ=Ei4=:sij. Wilks (1946) obtained
the exact distribution of Xl0 and Xl~ for p = 2, 3. Varma (1951) obtained the exact distribution of Xl0 and Xl~ for some other values of p and also computed a few tables. Using the methods similar to those used by Box

534

P. R. Krishnaiah and.lack C. Lee

(1949) and Nair (1938), Nagarsenker (1975a, b) derived the exact distributions of Xl0 and X11- He has also constructed percentage points of )'10 and Air Tables 15 and 16 give the values of Clo and e~l respectively where

P[X,o>>.qolH~n H~] =(1 - a), P[ X11 ~Cl,IH~n H'~ n H~ ] = ( 1 - a ) . 7. Test for compound symmetry

Let x ' = ( x ' 1..... x+2) be distributed as a t-variate normal with mean vector/x' and covariance matrix Z. Also, let E(xi)=l~i, E ( ( x i -t~i)(xj-/~j)'} = Z o. In addition, let x i be of order 1 x 1 for i-<<b, and of orderPl X 1 and P2 X 1 respectively for i-- b + 1, b + 2. We assume that p,./> 2 for i-- 1,2. Let H12 denote the hypothesis that the variances are equal and the covariances are equal within any set of variables, and the covariances of any two variables belonging to any two distinct sets are equal. Votaw (1948) derived the likelihood ratio test for H12 and it is given by 12 where X,2=

I Vl/I ~'1N -/

(7.1)

In the above equation, V=0'/J )' ~'u=Z-=l(X/---x/-)(xJ~-~9.), where (x~ . . . . . . X'b+Z,,),~(a= 1..... N), are independent observations on x'. Also Nxi.=Y~,=lXi, ~, V= (tTu), ~%,= t%,, vsio=p-lYTop,j,
-N ~ ~

Pi i = Pa Z 1Jjaja' Ja

P~cda -- p,,(p,~ -- 1) Z
la .,Lr%

P+.m.'

/ " , j . . = - - - 2 u,omo., PaPa" t.,m.. for s,s'= 1..... b; a , a ' = 1,2, a:/:a'. Also,

ia, l.,j., m ~ = b + f i a + 1..... b+fia+l,fia=Pl + . . .


Now, let

+p._l,fil=0.

P[~kl2 > c121H12] =(1 -- 0/).

(7.2)

Lee et al. (1976) approximated the distribution "12)tl/bwith Pearson's type I distribution where b is a properly chosen constant. The accuracy of the above approximation is good. Using the above approximation, they computed the values of c12 for some values of the parameters and they are given at the end of this chapter in Table 17.

Likelihood ratia tests for mean vectors and covariance matrices

535

8.

Tests on linear structm'es of covariance matrices

In the preceding sections, we discussed likelihood ratio tests on some special structures of the covariance matrices. In this section, we discuss the tests of hypotheses on more general structures of the covariance matrices. Let x ' = ( x 1. . . . . xp) be distributed as a multivariate normal with m e a n vector p and covariance matrix Z. It is of interest to test the hypothesis H13 where
2 Hi3: ~ = o~G~ + o + aiGko

(8.1)

In Eq. (8.1), o12 ..... a~ are unknown and G l . . . . . G k are k n o w n matrices of order p p and are at least positive semidefinite. Beck and Bargrnann (1966) and Srivastava (1966) considered the problem of testing H when G l ..... Gk commute with each other. Anderson (1969) derived the likelihood ratio test for H13 when G~..... Gk do not necessarily commute with each other. We will now consider more general models. Let x'ffi (x'l ..... Xp) be distributed as a multivariate normal with m e a n vector p ' = (~'1..... pp) and covariance matrix E = (Y0)" Here x i is of order q 1, E(xg)=p~ and E [ ( x i - p g ) ( x j - pj)'] = E#. W e wish to test the hypothesis H14 where H14:Z=GIZI+""
+ G k Z k.

(8.2)

In the above equation denotes the Kronecker product, G 1..... Gk are known matrices of order p p , and ~ I . . . . . Z k are unknown matrices of order q q. We will now discuss the procedures discussed b y Krishnaiah and Lee (1976) for testing Hi4. We will first assume that G 1..... Gk c o m m u t e with each other. Then, there exists an orthogonal matrix F: p p such that F G I F ' = diag(~ 1. . . . . ~.p). Now, let y = ( F % ) x . Then y is distributed as multivariate normal with mean vector p * = ( F I q ) p and covariance matrix E* where E* -- ( F Iq)E(F' Iq), and

Z*=

(8.3)

When H14 is true, Z* =diag(Z~l . . . . . E~p) where X$ = h l j E 1+ - . .


k f f i p , the hypothesis H14 is equivalent to H15: X ~ = O

+ h k j Z k. If (i=j= t,2 ..... p).

This is equivalent to the problem of testing for multiple independence of

536

P. R. Krishnaiahand.lack C. Lee

several sets ot varia01es wlaen their joint distribution is multivariate normal. So, we can test the hypothesis H14 by using the test procedure in Section 3. Next, consider the problem of testing the hypothesis HI4 when k <p. Let

x~2zq x2j~

o~o

Xjq
(8.4)

~,1~
Then

~,iq

...

X~,lqj

=A

!
X~

(8.5)

ix;,

Let E: ( p - k)q p q be of rank ( p - k)q and EA--0. Then, the hypothesis Hla under the above assumption is equivalent to H~5 fq H16 where H15 was defined earlier and

H16: E

x:,j
In particular, if

1
m..

=0.

[
~i1

i,p + 1

" " "

~,pl+P2 ~

(8.6)

,ps_l

+ 1

~--" . . .

for i = 1,2,..., k, then

[
Hl6:J

x~,~

.....
~;[ +P2,Pl +P2'

:Pl+ 1,pl+ 1

"..
.....

[ ~;s_t+ l,p..t+ 1

Likelihood ratio tests for mean vectors and covariance matrices

537

The p r o b l e m of testing H16 is equivalent to testing for multiple h o m o g e n e ity of the covariance matrices of multivariate n o r m a l populations when the underlying populations are multivariate normal. So, the procedure discussed in Section 4.4 can be used to test this hypothesis. W e will n o w consider some special cases of the structure Hla. Let Z1 E2 Z2 Zl -.. . .. Z2 Z2

(8.7)

Z 2

E 2

" .

Z 1

If the covariance matrix is of the a b o v e structure, it is called block version of intraclass correlation model. T h e structure is seen to be a special case of (8.2) if we let Gt = I a n d G 2 = J - I where J is a matrix with all its elements equal to unity. W h e n q = 1, Z denotes the covariance m a t r i x with all its diagonal elements equal a n d all off diagonal elements equal. Next, let Z have the structure of the f o r m

El

Z2

..

Zp

z,
E2 Z3

..
. . Z1

(8.8)

where Z j = E p _ j + 2. Also, let W o = I p and

wj=

(0

/pO_j)

forj=l

.....

p--1.

The matrix (8.8) c a n be expressed as

= w0 , + ( w , +

(8.9)

Since the matrices W 0, ( W~ + W~) ..... c o m m u t e with each other, the structure given by (8.8) is of the same f o r m as (8.2). T h e structure (8.8) is k n o w n to be block version of circular symmetry. Olkin (1973) considered the p r o b l e m of testing the hypotheses that E is of the structure (8.7) a n d (8.8). Now, let (xlj . . . . . Xpj), ( j = 1. . . . . N), be a r a n d o m sample of size N d r a w n f r o m a multivariate n o r m a l p o p u l a t i o n with m e a n vector gt a n d covariance matrix N. We wish to test the hypothesis H14 where G 1. . . . . G k do not

538

P. R. KrishnaiahandJack C. Lee

necessarily commute with each other. The likelihood ratio test for given by

1-~14is

x=

ICl~/~

(8.1o)

i~l ai @~ilN/2
where the m a x i m u m likelihood estimates )2~ of ~E i are given by

tr[

]-1

i=1

=tr

i=l

GiE i

k G i ~ i 1-t [ @ @(I~ab]-1 E i=l


(8.11)

for a, b --- 1..... q, j = 1,2 ..... k and f~ab is a q X q matrix with 1 in (a, b)th position and zero elsewhere. Also, N C = Y.~v=l(xi - ~ . ) ( x i - . ) ' and N ~ = Y'/n-l:9. The statistic - 2 1 o g X is distributed asymptotically as a Xz with p degrees of freedom where v---pq(pq+ 1 ) - k q ( q + 1). We will n o w discuss the N e w t o n - R a p h s o n method of solving E~. (8.11) iteratively; Let S I..... S~ be the initial approximation to N1..... Z~ a n d let Y'i = Si + R~ for i-- I, 2 ..... k. Then

(Gi2,
l

= *+ E Gis,
i~I

Z c,
i=I

E c,
i~l

= I-

i=1

~, G i S i

i=1

GiR i +...

i=1

GiS i

(8.12)

Using Eq. (8.12) in Eq. (8.11) and expanding the left side of Eq. (8.11) into terms linear in R 1. . . . . R k, the following equations are obtained:

t~( ~l ~ ~i) ~ l I~~ ~ o~ ~ - ~~ Oa~+ ~ o ' ( ~ ~ o ~ 1~o ~


= tr ~:o~[ C~:o ' - I] ( g ~ % ) (8.13)
for j = l ..... k, where ~ ^0 = ~ i =k 1 Gi@Si. The above equation can be used to solve for R l ..... Rk. Let R u . . . . . Rkl be the values of R I , . . . , R ~ obtained by solving Eq. (8.13). Then Sil=Si-t-Ril is a second approximation to the m a x i m u m likelihood estimate of Z,. where 1.01=Ei=lGi@Sil is a second

L i k e l i h o o d ratio tests f o r m e a n vectors a n d covariance matrices

539

approximation to the maximum likelihood estimate of Z. L e t R~. 2 be the value of R i obtained by solving Eq. (8.13) after replacing ~20 with E01- Then Si2--S;+ Ri2 is the third approximation to E i. We can continue this procedure to the desired degree of accuracy. For initial approximations of X~..... Z k, we can use their unbiased estimates. The unbiased estimates can be obtained from
" N E{trC(Gg~ab)}=t r X N-1 i=1

)(G. @~ab) Gi@~-"i


(8.14)

g = 1..... k; a , b = 1..... q.

When q = 1, the above procedure of testing l-Ii4 when G 1..... Gk are not necessarily commutative is equivalent to the procedure of Anderson (1969). Next, consider the problem of testing the hypothesis Hi7 where
H17: s ~ = U l X 1 U ; + . . . + UkE~U~

(8.15)

and U~ (i= 1..... k) are known matrices of order p s i. Also, p ( p + 1) Y.k_1si(s~+ 1) and Z~ ..... Zk are unknown matrices. The likelihood ratio test statistic for H15 is known (see Krishnaiah and Lee (1976)) to be

2t*= I k
i=l

ICtN/2

u,Y,G

IN~2

(8.16)

where C was defined earlier and Zi are the maximum likelihood estimates of E i given by

E
i=l

iv;

E
i=l

Z v,2i
i=l

G,

g=l ..... k.

(8.17)

1 k The statistic - 2 1 o g ~ * is distributed as X2 with p(p + 1 ) - i Y ' i = lSi(Si + 1) degrees of freedom. When si = 1, the above test statistic 2~* was given in Anderson (1969). Consider the following multivariate components of variance model: y~-" UlCltl -{'- U2~flt2--[- - - -+- U k _ l O t k _ l ' + O l
k

(8.18)

where U~: p si ( i = 1..... k - 1 ) are known. Also, a l , . . . , % are distributed independently as multivariate normal with mean vector 0 and covariance matrices given by E(attt~)= E i. Then, the covariance matrix Y. of y is of the

540

P. R. Krishnaiah and Jack C. Lee

s t r u c t u r e g i v e n b y (8.15). T h e p r o b l e m o f t e s t i n g t h e h y p o t h e s i s Xt . . . . . Ek--1 = 0 is e q u i v a l e n t to t h e p r o b l e m o f t e s t i n g t h e h y p o t h e s i s t h a t E is of ' 1 u n d e r t h e m o d e l (8.18). R a o t h e s t r u c t u r e U1E1 U ; + + U t_ l e t _ 1 U ~(1971) a n d S i n h a a n d W i e a n d (1977) c o n s i d e r e d t h e p r o b l e m of e s t i m a t i n g Xg u s i n g M I N Q U E p r o c e d u r e . F o r v a r i o u s d e t a i l s of the M I N Q U E p r o c e d u r e , t h e r e a d e r is r e f e r r e d to t h e c h a p t e r b y R a o in this v o l u m e . Next consider the model

x = V 1 1 8 1 + . - . + V k _ ~ / 3 ~ _ l + Ip/3k where V 1. . . . , Vk_ 1 a r e k n o w n m a t r i c e s of o r d e r p p . Also, w e a s s u m e t h a t 181. . . . . 18k are d i s t r i b u t e d i n d e p e n d e n t l y as q - v a r i a t e n o r m a l w i t h m e a n Vector 0 a n d u n k n o w n c o v a r i a n c e m a t r i c e s E(18i18~)=f~ I ( i = 1 . . . . . k - 1 ) . T h e c o v a r i a n c e m a t r i x of x in this case is of the structure (8.2).

Table 7* Percentage points of the distribution of the determinant of the multivariate beta matrix pffi3, q = 4 0.050 1 2 3 4 5 6 7 8 9 10 12 14 16 18 20 24 30 40 60 120
oo

p=4, q=4 0.050 1.451 1.194 1.114 1.076 1.055 1.042 1.033 1.027 1.022 1.018 1.014 1.010 1.008 1.007 1.006 1.004 1.003 1.002 1.001 1.000
1.000

p=5, q=4 0.050 1.483 1.216 1.130 1.089 1.065 1.050 1.040 1.032 1.027 1,023 1.017 1.013 1.010 1.008 1.007 1.005 1.003 1.002 1.001 1.000
1.000

pffi6, q = 4 0.050 1.517 1.240 1.148 1.102 1.076 1.059 1.047 1.038 1.032 1.027 1.020 1.016 1.013 1.010 1.009 1.006 1.004 1.002 1.001 1.000
1.000

0.010 1.514 1.207 1.116 1.076 1.054 1.040 1.031 1.025 1.021 1.017 1.012 1.010 1.007 1.006 1.005 1.004 1.002 1.001 1.001 1.000
1.000

0.010 1.550 1.229 1.132 1.088 1.063 1.048 1.037 1.030 1.025 1.021 1.015 1.012 1.009 1.008 1.006 1.004 1.003 1.002 1.001 1.000
1.000

0.010 1.589 1.253 1.150 1.I01 1o074 1.056 1.044 1.036 1.030 1.025 1.019 1.014 1.012 1.009 1.008 1.006 1.004 1.002 1.001 1.000
1.000

0.010 1.628 1.279 1.168 1.115 1.085 1;066 1.052 1.043 1.036 1.030 1.023 1.018 1.014 1.012 1.010 1.007 1.005 1.003 1.001 1.000
1.000

1.422 1.174 1.099 1.065 1.046 1.035 1.027 1.022 1.018 1.015 1.011 1.008 1.007 1.005 1.004 1.003 1.002 1.001 1.001 1.000
1.000

,~q

21.0261 26.2170

26.2962

31.9999

31.4104

37.5662

36.4151 42.9798

Likelihood ratio tests for mean vectors and covariance matrices Table 7 (continued) p=7,q=4 p=8,q=4 p=9,q=4 p = lO, q = 4

541

o:o oolo
1 2 3 4 5 6 7 8 9 10 12 14 16 18 20 24 30 40 60 120 oo X~q 1.550 1.263 1.165 1.116 1.087 1.068 1.055 1.045 1.038 1.032 1,024 1.019 1.015 1.013 1.011 1.008 1.005 1.003 1.001 1.000 1.000 41.3372 1.667 1.305 1.188 1.130 1.097 1.076 1.061 1.050 1.042 1.036 1.027 1.021 1.017 1.014 1.012 1.008 1.006 1.003 1.002 1.000 1.000 48.2782

0.050

0.010

0.050

0.010

0.050

0.010

1.583 1.286 1.183 1.130 1.099 1.078 1.063 1.052 1.044 1.038 1.029 1.023 1.018 1.015 1.013 1.009 1.006 1.004 1.002 1.000 1.000 46.1943

1.704 1.330 1.207 1.146 1.109 1.086 1.070 1.058 1.048 1.041 1.031 1.025 1.020 1.016 1.014 1.010 1.007 1.004 1.002 1.001 1.000 53.4858

1.614 1.309 1.201 1.144 1.110 1.088 1.071 1.060 1.050 1.043 1.033 1.026 1.021 1.018 1.015 1.011 1.007 1.004 1.002 1.001 1.000 50.9985

1.740 1.355 1.226 1.161 1.122 1.096 1.078 1.065 1.055 1.047 1.036 1.029 1.023 1.019 1.016 1.012 1.008 1.005 1.002 1.001 1.000 58.6192

1.644 1.331 1.218 1.159 1.122 1.097 1.080 1.067 1.057 1.049 1.038 1.030 1.024 1.020 1.017 1.013 1.009 1.005 1.003 1.001 1.000 55.7585

1.774 1.379 1.244 1.176 1.134 1.107 1.088 1.073 1.062 1.054 1.041 1.033 1.026 1.022 1.019 1.014 1.009 1.006 1.003 1.001 1.000 63.6907

542 Table 7 (continued) p~3, q=6 0.050 0.010

P. R. Krishnaiah and.lack C. Lee

p=5, q=6 0.050 0.010

p=6, q=6
0.050 0.010

p = 7 , q-~6 0.050 0.010

1 2 3 4 5 6 7 8 9 10 12 14 16 18 20 24 30 40 60 120 oo X~q

1.535 1.241 1.145 1.099 1.072 1.056 1.044 1.036 1.030 1.025 1.019 1,014 1.012 1.009 1.008 1.006 1.004 1.002 1.001 1.000 1.000 28.8693

1.649 1.282 1.167 1.113 1.082 1.063 L050 1.041 1.034 1.028 1.021 1.016 1.013 1.011 1.009 1.006 1.004 1.002 1.001 1.000 1.000 34.8053

1.514 1.245 1.154 1.108 1.081 1,063 1.051 1.042 1.035 1.030 1.023 1.018 1.014 1.012 1.010 1.007 1.005 1.003 1.001 1.000 1.000 43.7730

1.625 1.284 1.175 1.121 1.090 1.070 1.056 1.046 1.039 1.033 1.025 1.019 1.016 1.013 1.011 1.008 1.005 1.003 1.001 1.000 1.000 50.8922

1.520 1.255 1.163 1.116 1.088 1.069 1.056 1.046 1.039 1.034 1.025 1.020 1.016 1.013 1.011 1.008 1.006 1.003 1.002 1.000 1.000 50.9985

Table 7 (continued). Columns for p = 8, 9, 10 with q = 6; p = 3, 5, 7, 8 with q = 8; p = 3, 5, 7 with q = 10; p = 9 with q = 11, 13, 15, 17; and p = 11 with q = 11, 13, each at α = 0.050 and α = 0.010, tabulated against M = 1(1)10, 12(2)20, 24, 30, 40, 60, 120, ∞, together with the corresponding values of χ²_pq(α). [Tabulated values not reproduced here.]

*The entries in this table are the values of c1 where P[C1 < c1] = 1 − α, M = n − p + 1, and C1 = −{n − (p − q + 1)} log B_{p,q,n}/χ²_pq(α). For an explanation of the notations, the reader is referred to Section 2.1. The entries when both p and q are less than 11 are reproduced from Schatzoff (1966) with the kind permission of the Biometrika Trustees, whereas the remaining entries are reproduced from Lee et al. (1977) with the permission of the editor of the South African Statistical Journal.
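As the footnote indicates, an entry c1 is the ratio of the exact percentage point to the asymptotic chi-square percentage point with pq degrees of freedom. A minimal sketch of this use, assuming SciPy and NumPy are available; every numerical input below is a hypothetical placeholder rather than a value taken from the table or the text:

```python
# Minimal sketch of using a Table 7 entry; all numbers here are hypothetical
# placeholders (assumed values, not taken from the table or the text).
import numpy as np
from scipy.stats import chi2

p, q, n = 3, 8, 25               # hypothetical dimensions and sample size
alpha = 0.05
c1 = 1.20                        # hypothetical table entry for M = n - p + 1

B_pqn = 0.15                     # hypothetical observed criterion B_{p,q,n}
statistic = -(n - (p - q + 1)) * np.log(B_pqn)   # the numerator of C1 in the footnote

# C1 divides the statistic by the chi-square percentage point with pq degrees
# of freedom, so the exact critical value is c1 times that point:
critical_value = c1 * chi2.ppf(1 - alpha, df=p * q)
print("reject H0" if statistic > critical_value else "do not reject H0")
```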

Table 8*. Percentage points of the likelihood ratio statistic for multiple independence, for the parameter combinations (1,3), (2,3), (3,3), (1,4), (2,4), (3,4), (1,5), (2,5), (3,5), at α = 0.05 and α = 0.01, tabulated against M = 1(1)20, 22(2)30. [Tabulated values not reproduced here.]

*The entries in this table are the values of c3 where P[−2 log λ3 < c3 | H3] = 1 − α and M = N − s − 4. For an explanation of the notations, the reader is referred to Section 3.1. The entries in this table are reproduced from Lee et al. (1977) with the kind permission of North-Holland Publishing Company.

Table 9*. Percentage points of the likelihood ratio test statistic for sphericity, for p = 4(1)10 at α = 0.05 and α = 0.01, tabulated against n = 5(1)20, 22(2)30, 34, 42, 50, 60, 80, 100, 140, 200, 300. [Tabulated values not reproduced here.]

*The entries in this table give the values of c4 where P[λ4 < c4 | H4] = 1 − α. For an explanation of the notations the reader is referred to Section 4.1. The entries in this table are reproduced from Nagarsenker and Pillai (1973a) with the kind permission of Academic Press, Inc.

Table 10*. Percentage points of the likelihood ratio test statistic for a hypothesis specifying the covariance matrix, tabulated against n for p = 4 through 10 at α = 0.05 and α = 0.01. [Tabulated values not reproduced here.]

*The entries in this table give the values of c5 where P[−2 log λ5 < c5 | H5] = 1 − α. For an explanation of the notations, the reader is referred to Section 4.2. The entries in this table are reproduced from Nagarsenker and Pillai (1973b) with the kind permission of the Biometrika Trustees.

Table 11*. Percentage points of the likelihood ratio test statistic for homogeneity of the covariance matrices, for p = 2, 3, 4, 5 and k = 2(1)10 at α = 0.05, tabulated against n0. [Tabulated values not reproduced here.]

*The entries in this table are the values of c6 where P[−2 log λ6 < c6 | H6] = 1 − α. For an explanation of the notations, the reader is referred to Section 4.2. The entries in this table are reproduced from Lee, Chang and Krishnaiah (1977) with the kind permission of North-Holland Publishing Company.


Table 13*. Percentage points of the likelihood ratio test statistic for Σ = Σ0 and μ = μ0, for p = 2(1)6 at α = 0.05 and α = 0.01, tabulated against M = 4(1)20, 22(2)40, 45(5)100. [Tabulated values not reproduced here.]

*The entries in this table are the values of c8 where P[−2 log λ8 < c8] = 1 − α. For an explanation of the notations, the reader is referred to Section 5.1. The entries in the table are reproduced from Nagarsenker and Pillai (1974) with the kind permission of Academic Press, Inc.

Table 14*. Percentage points of the likelihood ratio test statistic for the homogeneity of multivariate normal populations, for p = 1, 2, 3, 4 and k = 2, 3, 4, 5 at α = 0.05, tabulated against M = 1(1)20, 25, 30. [Tabulated values not reproduced here.]

*The entries in this table give the values of c9 where P[−2 log λ9 < c9 | H9] = 1 − α. For an explanation of the notations, the reader is referred to Section 5.2. The entries in this table are reproduced from Lee et al. (1977) with the kind permission of North-Holland Publishing Company.

Table 15*. Percentage points of the likelihood ratio test statistic for the equality of variances and covariances, for p = 4(1)10 at α = 0.05 and α = 0.01, tabulated against N = 5(1)20, 25(5)50, 60(10)100. [Tabulated values not reproduced here.]

*The entries in this table give the values of c10 where P[λ10 ≥ c10 | H2 ∩ H3] = 1 − α. For an explanation of the notations, the reader is referred to Section 6. The entries in this table are reproduced from Nagarsenker (1975a) with the kind permission of Marcel Dekker, Inc.

Table 16*. Percentage points of the likelihood ratio test statistic for the equality of means, variances and covariances, for p = 4(1)10 at α = 0.05 and α = 0.01, tabulated against N = 5(1)20, 25(5)50, 60(10)100. [Tabulated values not reproduced here.]

*The entries in this table give the values of c11 where P[λ11 > c11 | H1 ∩ H2 ∩ H3] = 1 − α. For an explanation of the notations, the reader is referred to Section 6. The entries in the table are reproduced from Nagarsenker (1975b) with the kind permission of Gordon and Breach Science Publishers, Ltd.

Table 17*. Percentage points of the distribution of the likelihood ratio test statistic for compound symmetry, for (p1, p2) = (2,2), (2,3), (2,4), (2,5), (3,3), (3,4), (4,4) and b = 0, 1, 2, 3, at α = 0.05, tabulated against M = 1(1)20, 22(2)30, 35(5)50, 60. [Tabulated values not reproduced here.]

*The entries in this table give the values c12 where P[λ12 > c12 | H12] = 1 − α. For an explanation of the notations, the reader is referred to Section 7. The entries in this table are reproduced from Lee, Krishnaiah and Chang (1976) with the kind permission of the editor of the South African Statistical Journal.

References

Abruzzi, A. (1950). Experimental procedures and criteria for estimating and evaluating industrial productivity. Ph.D. Thesis, Columbia University.
Anderson, T. W. (1958). An Introduction to Multivariate Statistical Analysis. Wiley, New York.
Anderson, T. W. (1969). Statistical inference for covariance matrices with linear structure. In: P. R. Krishnaiah, ed., Multivariate Analysis--II. Academic Press, New York, pp. 55-66.
Bock, R. D. and Bargmann, R. E. (1966). Analysis of covariance structures. Psychometrika 31, 507-534.
Bowker, A. H. (1961). A representation of Hotelling's T² and Anderson's classification statistic W in terms of simple statistics. In: H. Solomon, ed., Studies in Item Analysis and Prediction. pp. 285-292.
Box, G. E. P. (1949). A general distribution theory for a class of likelihood criteria. Biometrika 36, 317-346.
Chang, T. C., Krishnaiah, P. R. and Lee, J. C. (1977). Approximations to the distributions of the likelihood ratio statistics for testing the hypotheses on covariance matrices and mean vectors simultaneously. In: P. R. Krishnaiah, ed., Applications of Statistics. North-Holland Publishing Company, Amsterdam, pp. 97-103.
Consul, P. C. (1969). The exact distributions of likelihood criteria for different hypotheses. In: P. R. Krishnaiah, ed., Multivariate Analysis--II. Academic Press, New York, pp. 171-181.
Davis, A. W. (1971). Percentile approximations for a class of likelihood ratio criteria. Biometrika 58, 349-356.
Davis, A. W. and Field, J. B. F. (1971). Tables of some multivariate test criteria. Tech. Report No. 32, Division of Mathematical Statistics, CSIRO, Australia.
Dixon, W. J., ed. (1974). BMD Biomedical Computer Programs. University of California Press, Berkeley, CA.
Hotelling, H. (1931). The generalization of Student's ratio. Ann. Math. Statist. 2, 360-378.
Johnson, N. L., Nixon, E., Amos, D. E. and Pearson, E. S. (1963). Table of percentage points of Pearson curves, for given √β1 and β2, expressed in standard measure. Biometrika 50, 459-497.
Kendall, M. G. and Stuart, A. (1947). The Advanced Theory of Statistics (third edition). Hafner Publishing Company, New York.
Korin, B. P. (1968). On the distribution of a statistic used for testing a covariance matrix. Biometrika 55, 171-178.
Korin, B. P. (1969). On testing of equality of k covariance matrices. Biometrika 56, 216-217.
Korin, B. P. and Stevens, E. H. (1973). Some approximations for the distribution of a multivariate likelihood-ratio criterion. J. Roy. Statist. Soc. Ser. B 29, 24-27.
Krishnaiah, P. R. (1978). Some recent developments on real multivariate distributions. Developments in Statistics 1, 135-169. Academic Press, New York.
Krishnaiah, P. R. and Lee, J. C. (1976). On covariance structures. Sankhya, Ser. A 38, 357-371.
Lee, J. C., Chang, T. C. and Krishnaiah, P. R. (1977). Approximations to the distributions of the likelihood ratio statistics for testing certain structures on the covariance matrices of real multivariate normal populations. In: P. R. Krishnaiah, ed., Multivariate Analysis--IV. Academic Press, New York, pp. 105-118.
Lee, J. C., Krishnaiah, P. R. and Chang, T. C. (1976). On the distribution of the likelihood ratio test statistic for compound symmetry. S. African Statist. J. 10, 49-62.
Lee, J. C., Krishnaiah, P. R. and Chang, T. C. (1977). Approximations to the distributions of the determinants of real and complex multivariate beta matrices. S. African Statist. J. 11, 13-26.
Lee, Y. S. (1972). Some results on the distribution of Wilks' likelihood ratio criteria. Biometrika 59, 649-664.
Mathai, A. M. (1971). On the distribution of the likelihood ratio criterion for testing linear hypotheses on regression coefficients. Ann. Inst. Statist. Math. 23, 181-197.
Mathai, A. M. and Katiyar, R. S. (1977). The exact percentage points for the problem of testing independence. Unpublished.
Mathai, A. M. and Rathie, P. N. (1970). The exact distribution for the sphericity test. J. Statist. Res. (Dacca) 4, 140-159.
Mathai, A. M. and Rathie, P. N. (1971). The problem of testing independence. Statistica 31, 673-688.
Mathai, A. M. and Saxena, R. K. (1973). Generalized Hypergeometric Functions with Applications in Statistics and Physical Sciences. Springer-Verlag, Heidelberg and New York.
Mauchly, J. W. (1940). Significance test for sphericity of a normal n-variate distribution. Ann. Math. Statist. 11, 204-209.
Morrison, D. F. (1967). Multivariate Statistical Methods (second edition published in 1976). McGraw-Hill Book Company, New York.
Nagarsenker, B. N. (1975a). Percentage points of Wilks' L_vc criterion. Comm. Statist. 4, 629-641.
Nagarsenker, B. N. (1975b). The exact distribution of the likelihood ratio criterion for testing equality of means, variances and covariances. J. Statist. Comp. and Simulation 4, 225-233.
Nagarsenker, B. N. and Pillai, K. C. S. (1973a). The distribution of the sphericity test criterion. J. Multivariate Anal. 3, 226-235.
Nagarsenker, B. N. and Pillai, K. C. S. (1973b). Distribution of the likelihood ratio criterion for testing a hypothesis specifying a covariance matrix. Biometrika 60, 359-364.
Nagarsenker, B. N. and Pillai, K. C. S. (1974). Distribution of the likelihood ratio criterion for testing Σ = Σ0, μ = μ0. J. Multivariate Anal. 4, 114-122.
Nair, U. S. (1938). The application of the moment function in the study of distribution laws in statistics. Biometrika 30, 274-294.
Nie, N. H. et al. (1975). Statistical Package for the Social Sciences (second edition).
Olkin, I. (1973). Testing and estimation for structures which are circularly symmetric in blocks. In: D. G. Kabe and R. P. Gupta, eds., Multivariate Statistical Inference. North-Holland Publishing Company, Amsterdam, pp. 183-195.
Pillai, K. C. S. and Gupta, A. K. (1969). On the exact distribution of Wilks' criterion. Biometrika 56, 109-118.
Rao, C. R. (1948). Tests of significance in multivariate analysis. Biometrika 35, 58-79.
Rao, C. R. (1951). An asymptotic expansion of the distribution of Wilks' Λ criterion. Bull. Inst. Int. Statist. 33, Part II, 177-180.
Rao, C. R. (1971). Estimation of variance and covariance components--MINQUE theory. J. Multivariate Anal. 1, 257-275.
Roy, J. (1951). The distribution of certain likelihood criteria useful in multivariate analysis. Bull. Inst. Int. Statist. 33, 219-230.
Schatzoff, M. (1966). Exact distributions of Wilks's likelihood ratio criterion. Biometrika 53, 347-358.
Sinha, B. K. and Wieand, H. S. (1977). MINQUE's of variance and covariance components of certain covariance structures. Indian Statistical Institute, Tech. Rep. 28/77.
Srivastava, J. N. (1966). On testing hypotheses regarding a class of covariance structures. Psychometrika 31, 147-164.
Tukey, J. W. and Wilks, S. S. (1946). Approximation of the distribution of the product of beta variables by a single beta variable. Ann. Math. Statist. 17, 318-324.
Varma, K. B. (1951). On the exact distribution of L_mvc and L_vc criteria. Bull. Inst. Int. Statist. 33, 181-214.
Votaw Jr., D. F. (1948). Testing compound symmetry in a normal multivariate distribution. Ann. Math. Statist. 19, 447-473.
Wald, A. and Brookner, R. J. (1941). On the distribution of Wilks' statistic for testing independence of several groups of variates. Ann. Math. Statist. 12, 137-152.
Wijsman, R. A. (1957). Random orthogonal transformations and their use in some classical distribution problems in multivariate analysis. Ann. Math. Statist. 28, 415-423.
Wilks, S. S. (1932). Certain generalizations in the analysis of variance. Biometrika 24, 471-494.
Wilks, S. S. (1935). On the independence of k sets of normally distributed statistical variables. Econometrica 3, 309-325.
Wilks, S. S. (1946). Sample criteria for testing equality of means, equality of variances, and equality of covariances in a normal multivariate distribution. Ann. Math. Statist. 17, 257-281.

P. R. Krishnaiah, ed., Handbook of Statistics, Vol. 1. North-Holland Publishing Company (1980) 571-591

17

Assessing Dimensionality in Multivariate Regression


Alan Julian Izenman

1. Introduction

In this paper we describe a certain generalization of the multivariate linear regression model which also provides a unified approach to the classical multivariate techniques of principal component and canonical variate and correlation analysis. The regression model can be described as follows. Let

$$\begin{bmatrix} X \\ Y \end{bmatrix} \qquad (1.1)$$

be a collection of r + s variables partitioned into two disjoint sub-collections, where X = [X_1, ..., X_r]' has r components, Y = [Y_1, Y_2, ..., Y_s]' has s components, and X and Y are jointly distributed with mean vector and covariance matrix given by

$$E\begin{bmatrix} X \\ Y \end{bmatrix} = \begin{bmatrix} \mu_X \\ \mu_Y \end{bmatrix} \qquad (1.2)$$

$$\Sigma = \begin{bmatrix} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{YX} & \Sigma_{YY} \end{bmatrix} \qquad (1.3)$$

respectively, where Σ_XX and Σ_YY are both assumed nonsingular, A' denotes the transpose of the matrix A, and E is the expectation operator defined by the distribution associated with the variate (1.1). Assume further that the variates X and Y are linearly related, so that
$$\underset{s\times 1}{Y} = \underset{s\times 1}{\mu} + \underset{s\times r}{C}\,\underset{r\times 1}{X} + \underset{s\times 1}{\epsilon}, \qquad (1.4)$$


where μ and C are unknown parameters and ε is the corresponding error variate of the model, uncorrelated with X and having mean 0 and covariance matrix Σ_εε. Descriptions of this model in the literature assume implicitly that the regression coefficient matrix C has full rank, and then demonstrate that simultaneous (unrestricted) least-squares estimation applied to all s equations of (1.4) yields the same results as does equation-by-equation least squares. As a result, nothing is gained by estimating the equations jointly. A true multivariate feature enters the model when we know (or suspect) that the regression coefficient matrix C may not have full rank, that in fact,

$$\mathrm{rank}(C) = t \le \min(r, s) = s, \qquad (1.5)$$

say, so that a number of linear restrictions on the set of regression coefficients of the model may be present. The value of t in (1.5), and hence the number and nature of those restrictions, may or may not be known prior to analysis. Although X is presumed in (1.5) to be the larger of the two sets of variates, this reflects purely a mathematical convenience, and similar expressions as appear in this paper can also be obtained for the case in which r < s. The statistical literature contains some discussion of this type of multivariate regression model. Most of the papers on the subject assume that the value of t in (1.5) is either known a priori ([5], [6, p. 335], [13], [17, p. 505], [18], [19], [21]) or that a suitable hypothesized value can be stated for the value of t ([1], [2, Section 14.2], [21]). With this in mind, it was found convenient (in [13], [18], [19]) to distinguish the case t = s from the case 1 ≤ t < s by terming the former full-rank regression and the latter reduced-rank regression. Similarly, the regression coefficient matrix C is called 'full-rank' or 'reduced-rank' as appropriate, and to show dependence on its rank, the matrix C is also written C^(t). Several of the above-mentioned papers were specifically concerned with the relationships between multivariate regression analysis and the dimensionality-reduction techniques of principal component analysis ([16], [5, Ch. 9], [13]) and canonical variate and correlation analysis ([5, Ch. 10], [6], [13], [18], [19], [21], [24]). Bartlett [3], however, seems to have been the first to observe the important connections between these various methodologies. What is probably of greatest interest to the statistician is the case in which the rank of C cannot be so specified beforehand and has instead to


be determined from a given multivariate sample of n independent observations,

$$\begin{bmatrix} X_j \\ Y_j \end{bmatrix}, \quad j = 1, 2, \ldots, n, \qquad (1.6)$$

on the variate (1.1). Such data will introduce noise into the relationship between Y and X, and hence will tend to obscure the actual structure of the matrix C, so that rank determination for any particular problem will be made more difficult. There is, therefore, a need to make a distinction here between the "true" or "mathematical" rank of C, which will 'always' be full (since it will be based on a sample estimate of C), and the "practical" or "statistical" rank of C (the one of real interest), which will typically be unknown. The problem is, therefore, a selection problem: from the set of integers from 1 through s, we are to choose the smallest integer such that the reduced-rank regression of Y on X with that integer as rank will be close (in some sense) to the corresponding full-rank regression. The sense by which one multivariate regression can be 'close' to another multivariate regression forms the subject of this paper. Section 2 gives the main results concerning reduced-rank regression and its relationship to principal component and canonical variate analysis. Section 3 discusses the nature of the residuals from a specific (known rank) reduced-rank regression. Section 4 introduces the problem of assessing the rank of C and Section 5 illustrates some of these concepts through a simple but interesting real-data example. Sections 6 and 7 then consider new types of graphical displays by which that dimensionality may be determined.
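For concreteness, the reduced-rank structure (1.4)-(1.5) is easy to simulate. The following sketch (hypothetical dimensions and parameter values, assuming NumPy) builds a coefficient matrix C = AB whose rank t is strictly smaller than min(r, s):

```python
# Minimal sketch of simulating the reduced-rank model (1.4)-(1.5);
# all dimensions and parameter values are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
r, s, t, n = 6, 4, 2, 200           # t < min(r, s) = s

A = rng.normal(size=(s, t))          # (s x t) of rank t
B = rng.normal(size=(t, r))          # (t x r) of rank t
C = A @ B                            # (s x r) coefficient matrix of rank t
mu = rng.normal(size=s)

X = rng.normal(size=(n, r))                  # n observations on X
eps = 0.5 * rng.normal(size=(n, s))          # error variate, uncorrelated with X
Y = mu + X @ C.T + eps                       # the model (1.4), one row per observation

print(np.linalg.matrix_rank(C))              # 2: the "mathematical" rank is t
```

Note that the least-squares estimate of C computed from such data will itself have full rank s almost surely, which is precisely the distinction between the mathematical and the statistical rank drawn above.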

2. Reduced-rank regression: main results

The general objective of reduced-rank regression, therefore, is to approximate the s-vector variate Y by a set of s linear combinations, μ + CX, of the r-vector variate X in which the number of linearly independent combinations (i.e., the dimensionality of the regression) may be less than s. In the event that there exist t (< s) such combinations, the (regression coefficient) matrix C will have rank t so that there will exist two (non-unique) matrices, an (s × t)-matrix A and a (t × r)-matrix B, such that C = AB, where A and B are both of rank t. The problem, therefore, becomes one of finding an A and a B such that the variate μ + ABX is approximately equal to Y. We have the following general result.


THEOREM. Let (1.1) be an (r + s)-vector-valued variate having mean vector (1.2) and covariance matrix (1.3). Suppose that Σ_XX is non-singular and that Γ is positive-definite symmetric. Then, the (s × 1)-vector μ, the (s × t)-matrix A, and the (t × r)-matrix B, where 1 ≤ t ≤ s ≤ r, that minimize

$$E\{(Y - \mu - ABX)'\,\Gamma\,(Y - \mu - ABX)\} \qquad (2.1)$$

are given by

$$A = A^{(t)} = \Gamma^{-1/2}[V_1, \ldots, V_t], \qquad (2.2)$$

$$B = B^{(t)} = [V_1, \ldots, V_t]'\,\Gamma^{1/2}\Sigma_{YX}\Sigma_{XX}^{-1}, \qquad (2.3)$$

$$\mu = \mu^{(t)} = \mu_Y - A^{(t)}B^{(t)}\mu_X, \qquad (2.4)$$

where V_j is the latent vector corresponding to the jth largest latent root, λ_j, of the matrix

$$\Gamma^{1/2}\Sigma_{YX}\Sigma_{XX}^{-1}\Sigma_{XY}\Gamma^{1/2}, \quad j = 1, 2, \ldots, s. \qquad (2.5)$$

At the minimum, the criterion (2.1) has the value

$$W(t) = \mathrm{tr}\{\Sigma_{YY}\Gamma\} - \sum_{j=1}^{t} \lambda_j. \qquad (2.6)$$

PROOF. A straightforward application of the Eckart-Young Theorem [10].

Some remarks regarding this theorem are necessary.

1. The matrix B^(t) in (2.3) can be re-expressed as

$$B = B^{(t)} = [U_1, \ldots, U_t]'\,\Sigma_{XX}^{-1/2}, \qquad (2.7)$$

where U_j is the latent vector corresponding to the jth largest latent root, λ_j, of the matrix

$$\Sigma_{XX}^{-1/2}\Sigma_{XY}\Gamma\Sigma_{YX}\Sigma_{XX}^{-1/2}, \quad j = 1, 2, \ldots, r. \qquad (2.8)$$

Since it has been assumed here that s ≤ r, then λ_j = 0 for j = s + 1, ..., r.


2. The regression coefficient matrix C in (1.4) with rank t is, therefore, given by

$$C = C^{(t)} = \Gamma^{-1/2}\Big(\sum_{j=1}^{t} V_j V_j'\Big)\Gamma^{1/2}\Sigma_{YX}\Sigma_{XX}^{-1}, \qquad (2.9)$$

which, if t = s, reduces to the full-rank regression coefficient matrix, to be denoted henceforth by

$$\tilde{C} = \Sigma_{YX}\Sigma_{XX}^{-1}. \qquad (2.10)$$

3. A principal components analysis of the r-vector variate X corresponds to setting Y = X, s = r, and Γ = I in the above theorem. This gives the following versions of (2.2)-(2.6):

$$A^{(t)} = [V_1, \ldots, V_t] = B^{(t)'}, \qquad (2.11)$$

$$C^{(t)} = \sum_{j=1}^{t} V_j V_j', \qquad \mu^{(t)} = (I - C^{(t)})\mu_X, \qquad (2.12)$$

where V_j is the latent vector corresponding to the jth largest latent root, λ_j, of Σ_XX. The minimum value of the criterion (2.1) in this case is given by Σ_{j=t+1}^{r} λ_j, the sum of the residual r − t latent roots of Σ_XX. The first t principal components of X are given by the elements of the vector ξ^(t) = B^(t)X (i.e., ξ_j = V_j'X, j = 1, 2, ..., t), where var{ξ_j} = λ_j and cov{ξ_j, ξ_k} = 0 for j ≠ k.

4. A canonical variate and correlation analysis of the two sets of variates, X and Y, corresponds to setting Γ = Σ_YY^{-1} in the above theorem, so that the minimum of (2.1) over choice of A, B and μ is invariant under simultaneous linear transformations of the variates X and Y. The reduced-rank regression coefficient matrix (2.9), therefore, becomes

$$C = C^{(t)} = \Sigma_{YY}^{1/2}\Big(\sum_{j=1}^{t} V_j V_j'\Big)\Sigma_{YY}^{-1/2}\Sigma_{YX}\Sigma_{XX}^{-1}, \qquad (2.13)$$

where V_j is the latent vector corresponding to the jth largest latent root, λ_j, of the matrix

$$R = \Sigma_{YY}^{-1/2}\Sigma_{YX}\Sigma_{XX}^{-1}\Sigma_{XY}\Sigma_{YY}^{-1/2}. \qquad (2.14)$$

The matrix R is a multivariate generalization of the simple squared correlation coefficient between two variables (r = s = 1), and also of the


squared multiple correlation coefficient between a single variable and a number of other variables (s = 1, and any r). Set

$$A^{(t)-} = [V_1, \ldots, V_t]'\,\Sigma_{YY}^{-1/2}. \qquad (2.15)$$

The matrix A^(t)- is, for 1 ≤ t < s, a reflexive generalized-inverse of A^(t), and for t = s, A^(s)- is the unique inverse {A^(s)}^{-1} (see [20]). Note the symmetric relationship between B^(t) (as given in (2.7)) and A^(t)- (in (2.15)). The transformed variates

$$\xi^{(t)} = B^{(t)}X \quad \text{and} \quad \omega^{(t)} = A^{(t)-}Y \qquad (2.16)$$

have the correlation matrix

$$\begin{bmatrix} I_t & P_t \\ P_t & I_t \end{bmatrix},$$

where P_t = diag{ρ_1, ρ_2, ..., ρ_t}, ρ_j = λ_j^{1/2}, j = 1, 2, ..., t, and the jth components of both ξ^(t) and ω^(t), namely

$$\xi_j = V_j'\,\Sigma_{YY}^{-1/2}\Sigma_{YX}\Sigma_{XX}^{-1}X \quad \text{and} \quad \omega_j = V_j'\,\Sigma_{YY}^{-1/2}Y,$$

respectively, are the jth pair of canonical variates, and ρ_j, the correlation between them, is the jth canonical correlation coefficient (j = 1, 2, ..., t).
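All of the quantities in the theorem and its remarks can be computed directly from the covariance blocks. The sketch below is one possible implementation, not taken from the text: it assumes NumPy, forms the symmetric matrix square roots by an eigendecomposition, and returns A^(t), B^(t), C^(t) of (2.2), (2.3) and (2.9) together with the latent roots of (2.5) and the criterion value (2.6).

```python
# Minimal sketch of the theorem's solution; the helper names and the use of
# an eigendecomposition for the matrix square roots are assumptions.
import numpy as np

def sym_power(S, power):
    """Symmetric matrix power S**power via an eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * w**power) @ V.T

def reduced_rank_solution(Sxx, Sxy, Syy, Gamma, t):
    """A^(t), B^(t), C^(t) of (2.2)-(2.3)/(2.9) from given covariance blocks."""
    Syx = Sxy.T
    G_half, G_ihalf = sym_power(Gamma, 0.5), sym_power(Gamma, -0.5)
    core = G_half @ Syx @ np.linalg.solve(Sxx, Sxy) @ G_half   # the matrix (2.5)
    lam, V = np.linalg.eigh(core)
    order = np.argsort(lam)[::-1]          # latent roots in decreasing order
    lam, V = lam[order], V[:, order]
    Vt = V[:, :t]                          # V_1, ..., V_t
    A = G_ihalf @ Vt                                    # (2.2)
    B = Vt.T @ G_half @ Syx @ np.linalg.inv(Sxx)        # (2.3)
    W = np.trace(Syy @ Gamma) - lam[:t].sum()           # minimum value (2.6)
    return A, B, A @ B, lam, W
```

Passing Gamma = np.linalg.inv(Syy) reproduces the canonical-variate case of remark 4, in which the roots lam are the squared canonical correlations, while t = s recovers the full-rank matrix (2.10).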

3. Residuals from a reduced-rank regression

Estimation of the vector and matrix quantities in Section 2 above is carried out using the sample of values (1.6). First, (1.2) is estimated by μ̂_X = n^{-1}Σ_{j=1}^{n} X_j and μ̂_Y = n^{-1}Σ_{j=1}^{n} Y_j, and (1.3) by

$$(n-1)^{-1}\sum_{j=1}^{n}\begin{bmatrix} X_j - \hat\mu_X \\ Y_j - \hat\mu_Y \end{bmatrix}\begin{bmatrix} X_j - \hat\mu_X \\ Y_j - \hat\mu_Y \end{bmatrix}' =: \hat\Sigma. \qquad (3.1)$$

All estimates of unknowns are then based on the appropriate elements of (3.1), and denoted by placing a circumflex above the quantity to be estimated. In this way, we denote μ̂^(t) to be an estimate of (2.4) and Ĉ^(t) to be an estimate of (2.9).


The collection of n residual s-vectors from a rank-t reduced-rank regression of Y on X is given by the matrix

$$\hat{E}^{(t)} = [\hat{\epsilon}_1^{(t)}, \ldots, \hat{\epsilon}_n^{(t)}], \qquad (3.2)$$

where

$$\hat{\epsilon}_j^{(t)} = Y_j - \hat\mu^{(t)} - \hat{C}^{(t)}X_j = (Y_j - \hat\mu_Y) - \hat{C}^{(t)}(X_j - \hat\mu_X), \quad j = 1, 2, \ldots, n. \qquad (3.3)$$

The columns of the matrix Ê^(t) in (3.2) are each asymptotically (large n) s-variate normally distributed with mean zero and covariance matrix

$$\Sigma_{\epsilon\epsilon}^{(t)} = \Sigma_{YY} - \Gamma^{-1/2}\Big(\sum_{j=1}^{t} \lambda_j V_j V_j'\Big)\Gamma^{-1/2}, \qquad (3.4)$$

where λ_j and V_j are the jth latent root and vector respectively of (2.5), if indeed the variate (1.1) is (r + s)-variate normally distributed. Furthermore, the columns of (3.2) are asymptotically pairwise uncorrelated. (These results can be obtained using perturbation expansions as in [13].) For the full-rank case, the corresponding set of residual vectors are each asymptotically jointly normal with mean zero and covariance matrix Σ_εε = Σ_YY − Σ_YX Σ_XX^{-1} Σ_XY. In view of these remarks, we estimate the residual covariance matrix Σ_εε^(t) by

$$\hat\Sigma_{\epsilon\epsilon}^{(t)} = (n - 1 - r)^{-1}\sum_{j=1}^{n} \hat{\epsilon}_j^{(t)}\hat{\epsilon}_j^{(t)'}, \qquad (3.5)$$

and we write, for the full-rank case only (where t = s), Σ̂_εε = Σ̂_εε^(s).
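In code, the residuals (3.3) and the estimate (3.5) follow directly from the fitted pair (μ̂^(t), Ĉ^(t)); a minimal sketch, assuming NumPy, with X and Y stored as n × r and n × s data matrices:

```python
# Minimal sketch of the residual vectors (3.3) and the estimate (3.5).
import numpy as np

def residuals_and_covariance(X, Y, C_hat):
    """X: (n, r) data matrix, Y: (n, s) data matrix, C_hat: (s, r) rank-t fit."""
    n, r = X.shape
    mu_x, mu_y = X.mean(axis=0), Y.mean(axis=0)     # sample means entering (3.1)
    E = (Y - mu_y) - (X - mu_x) @ C_hat.T           # row j is the residual of (3.3)
    Sigma_ee_hat = E.T @ E / (n - 1 - r)            # the estimate (3.5)
    return E, Sigma_ee_hat
```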

4. The case of unknown rank

Discussion so far has centered around the case in which the regression coefficient matrix C has a specific rank, t say. The remainder of this paper is concerned with the case in which the value of t is unknown a priori and has to be assessed from the sample data (1.6). It was pointed out in Section 2 of this paper that a reduced-rank regression of rank t corresponds to a choice of either the first t principal components of X or of the first t pairs of canonical variates of X and Y. Recall that W(t) denotes the minimum value of (2.1) for a fixed value of t.


The reduction in W(t) obtained by increasing the rank from t = t_0 to t = t_1 (t_0 < t_1) is, therefore, given by

$$W(t_0) - W(t_1) = \sum_{j=t_0+1}^{t_1} \lambda_j, \qquad (4.1)$$

where λ_j is the jth largest latent root of (2.5). That is, one method for assessing the rank of C can be based either on the sequence of ordered latent roots, {λ_j, j = 1, 2, ..., s}, in which λ_j is compared with suitable reference values for each j, or on the sum of the (s − t_0) residual latent roots (see, e.g., Kshirsagar [14, Sections 8.7 and 11.7]). An obvious disadvantage of relying solely on such formal testing procedures is that any routine application of them might fail to take into account the possible need for a preliminary screening of the multidimensional data set in question. Robustness of sample estimates of the latent roots, and hence of the various tests, when outliers or distributional peculiarities are present in the data is a major statistical problem. In the context of this paper, disregard for such details could lead to incorrect inferences regarding the dimensionality of the regression.
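In practice the λ_j are replaced by the latent roots of the sample version of (2.5), and the incremental reductions (4.1) can be scanned directly; a minimal sketch, assuming NumPy, with hypothetical root values:

```python
# Minimal sketch of scanning the ordered sample latent roots; each root is
# the reduction (4.1) in W(t) gained by raising the rank from t-1 to t.
import numpy as np

def rank_profile(lam_hat):
    """lam_hat: sample latent roots of (2.5), in decreasing order."""
    total = lam_hat.sum()
    for t, root in enumerate(lam_hat, start=1):
        share = lam_hat[:t].sum() / total
        print(f"t = {t}: root = {root:.4f}, cumulative share = {share:.3f}")

# Hypothetical roots; the sharp drop after the second suggests rank t = 2.
rank_profile(np.array([0.70, 0.45, 0.06, 0.02]))
```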

5. A simple example
The data for this example was taken from Rao [15, p. 245], where it is attributed to Frets [11]. The same data appears in Anderson [2, p. 58], Dixon [7, p. 212], and various other places for illustrating certain statistical procedures. While the original investigation consisted of about 3600 measurements on 360 families, this particular subset consists of two measurements, head-length and head-breadth, on each of the first and second sons in a sample of 25 families. Thus, Xl(X2) is the head-length (head-breadth) of the first son, and Yl(Y2) is the head-length (headbreadth) of the second son. Estimates of the mean vector and covariance matrix of these four variables can be found on p. 303 of Anderson [2] and will not be repeated here. Two regressions were made on the data, a reduced-rank regression (t = 1) and a full-rank regression (t = 2), using the canonical variates set up. The results are as follows: ~(1)=[0.43 0.29 0.54] 0.36 ' (~(2,=[0.45 0.27 0.511, 0.38

L41.40 '

37.17 "

Assessing dimensionality in multivariate regression

579

A n initial i n s p e c t i o n of t h e s e r e s u l t s s h o w s t h a t t h e m a t r i c e s (~(1) a n d ~,(2) are not very different from each other. F u r t h e r d e t a i l e d s t u d y o f this d a t a i n d i c a t e d c e r t a i n u n e x p l a i n a b l e p e c u l i a r i t i e s . T h e 25 o b s e r v a t i o n s w e r e c h e c k e d a g a i n s t t h e c o m p l e t e c o l l e c t i o n in F r e t s [11]. It w a s f o u n d t h a t six of t h e s e 25 o b s e r v a t i o n s w e r e d i f f e r e n t f r o m t h o s e in t h e o r i g i n a l s o u r c e data. T h e i n c o r r e c t o b s e r v a t i o n n u m b e r s a r e 11, 12, 13, 14, 15, a n d 25. F r o m a c l o s e e x a m i n a t i o n of the o r i g i n a l s o u r c e d a t a t a b l e s it a p p e a r s t h a t the v a l u e s of t h e first five of t h e s e e r r o r s w e r e t a k e n f r o m t h e w r o n g c o l u m n s o f t h e o r i g i n a l d a t a ; the sixth a p p e a r s to b e a n i n d e p e n d e n t error. B o t h sets of d a t a a r e g i v e n h e r e in T a b l e 5.1. Table 5.1 Corrected data for example 1; r = s = 2, n = 25 Head length, first son Head breadth, first son Head length, second son H e a d breadth, second son

Xl
1

X2
155 149 148 153 144 157 150 159 152 150 161 (158") 147 (147") 153 (150") 160 (159") 154 (151") 137 155 153 145 140 154 143 139 167 153 (163") 179 201 185 188 171 192 190 189 197 187

Y1
145 152 149 149 142 152 149 152 159 151

Y2

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

19t 195 181 183 176 208 189 197 188 192 186 (179") 179 (183") 195 (174") 202 (190") 194(188") 163 195 186 181 175 192 174 176 197 190

179 (186") 183 (174") 174 (185") 190 (195") 188 (187") 161 183 173 182 165 185 178 176 200 187

158 (148") 147 (147") 150(152") 159 (157") 151 (158") 130 158 148 146 137 152 147 143 158 150

*Asterisks denote incorrect observations used by Rao, Anderson and Dixon.

580

Alan Jufian Izenman

The 'corrected' sample was analyzed in a similar manner as was tile previous set; the results are summarized as follows:

~X

151.12
102.83 (0.82) (0.70) (0.76)

187.40 ]

/2~,= i 183"32 l, 149.36 59.62 51.86 (0.62) (0.82) 70.32 44.25 97.98 (0.77) 52.68 -] 40.21 51.71 46.24

~=

(values in parentheses are correlations between appropriate variables)

0~ 0~, ~,~__I4218], 29.99

o.62] 2, i057 0201 0~9 0~6,


~2~__ 146.61 L29621

This time the estimates of C for t = 1 and t = 2 look very different from each other.

6. The rank trace


We now propose a more elaborate method for assessing the dimensionality of a multivariate regression. It is described in terms of the following steps:

Step 1. Carry out a sequence of reduced-rank regressions for specific values of t. Step 2. From each of the regressions of step (1), compute ~'~0 and ~ o . Step 3. Make a scatterplot of the s + 1 points
(A~'),A~')), where t = 0 , 1,2 ..... s, (6.1)

I1~11
AI~t) = I[~t~ ) ^ -- ~l[^ ,

(6.2)
(6.3)

Assessing dimensionality in multivariate regression

581

and join up successive points in the plot. This is the rank trace for the regression of Y on X. Step 4. Assess the rank of C as the smallest rank for which both (6.2) and (6.3) are approximately zero. In small problems (where the value of s is at most 10), all values of t should be examined. For larger problems (s > 10), the costs of computation become critical and it is, therefore, recommended to be more selective in choices of t; one possible way is to car[y out regressions for a few small values of t, say t = 1,2 ..... to + 1, Where to might be an initial estimate of t (perhaps based on the sequence of sample latent roots) plus the usual full-rank regression model (in which t = s) for purposes of comparison For the examples in this paper, the classical Euclidean norm

HAll=(IrAA$)I/2--(~i Xa2)I/2
J

is used ill computing (6.2) and (6.3). For the case when t = 0, define ~(0) to be the null matrix with all entries equal to zero, and l~ ) to be l~ry. Thus, the first point (corresponding to t = 0 ) is always plotted at (1, 1) and the last point (corresponding to t = s) is always plotted at (0, 0). The horizontal coordinate (6.2) gives a quantitative representation of the dif ference between a reduced-rank regression coefficient matrix and the full-rank regression coefficient matrix, while the vertical coordinate (6.3) shows the proportionate reduction in the residual variance matrix in using a simple full-rank model rather than the computationally more elaborate reduced-rank model. The reason for including a special point for t - - 0 is that without such a point, it would be impossible to deduce in many applications that the statistical rank of C should be t = 1. In this formulation, t = 0 corresponds to the completely random model, Y=l~+e. Assessing the dimensionality of the regression by using step 4 above involves a certain amount of subjective judgment, but from experience with many of these types of plots, the choice should not be too difficult. Due to the nature of ~(t), the sequence of values for the horizontal coordinate (6.2) is not guaranteed to decrease monotonically from 1 to 0. It does appear, however, that in many of the applications of this method (in particular, for the canonical variates case), the plotted points appear within the unit square, but below the (1, 1)-(0,0) diagonal-line, indicating that the residual variance matrices typically stabilize faster than do the regression coefficient matrices.

582

Alan Julian Izenman

For the principal components case, the expressions (6.2) and (6.3) reduce to the following simple forms:

E
j~t+l 1

(6.5)

It is clear from (6.4) and (6.5) that: (a) we are really looking at the residual latent roots again (although this time they are each squared); (b) all the information regarding dimensionality of the regression is contained in the residual covariance matrices and not in the regression coefficients; and (c) the r + 1 plotted points do indeed decrease monotonically from (I, 1) to (0, 0) in this special case. (Unfortunately, no similar reduction of (6.2) and (6.3) can be obtained for the canonical variates case.) In view of (6.4), a different criterion of assessing dimensionality from the rank trace plot in the principal components case needs to be applied. A natural rule (which has also been proposed for obtaining multidimensional scaling solutions: see, e.g., Gnanadesikan [12, p. 46]) is that of assessing the rank of C by the smallest integer value between 1 and r at which an "elbow" can be detected in the PC rank trace plot.

Example 1 (continued). The CV rank trace of the data from Rao [15] is plotted in Fig. 6.1(a). From the plot it appears that the rank of C is best estimated by t - ! , which also seems reasonable on the basis of the canonical correlations, 0.7885 and 0.0537. The 'corrected' data, however, yield a very different result with the CV rank trace plotted in Fig. 6.1 (b). The plots suggest that the estimated rank of C should be t = 2 (the canonical correlations are 0.8386 and 0.3256). A third analysis (not shown here) was made on the complete data in Frets [11] on the same four variables. (The set of extensive data tables was screened very carefully since cross-referencing of observations there often proved inconsistent: this meant that the data were boiled down to 247 points.) This larger set (which also contained the 'corrected' 25 values) gave a similar plot of the CV rank trace to the 'corrected' data, again suggesting that the rank estimate should be t--2. The sample canonical correlations, 0.6588 and 0.5077, appear to reflect the same information.

Assessing dimens'ionality in multivariate regression

583

I.O

- f / ~

0.8

0.6

0.4

o2 i

02L~//
0

0.2

0.4

0.6

0.8.

1.0 ~ . (.t)

~ Lx

1.0

0.8

0.6

0.4

0.2

O.2

0.4
(b)

0.6

0.8

i.O &~. (t)

Fig. 6.1. Plot of CV rank trace for example 1, (a) Rao's data, and (b) corrected data, on heredity of headform in man (r = s = 2, n = 25)

584

Alan Julian Izenman

Example 2. U.S. and European temperature records These data, which were made available by J. M. Craddock (Meterological Office, Bracknell, Herts., England), the World Weather Records, Smithsonian Miscellaneous Collections, and the U.S. Weather Bureau, consist of 516 mean monthly temperatures (1918-1960) for five U.S. cities (New Haven, Cape Hatteras, Cincinnati, Nashville, and St. Louis) and for five European cities (Copenhagen, de Bilt, Paris, Odessa, and Valentia). Before analysis, the series for each city was seasonally adjusted by subtracting out the mean for each of the 12 months of the year. The European cities were treated as the X variables and the U.S. cities as the Y variables, so that r = s = 5 and n = 516. As in the previous example, all points of the CV rank trace plot (see Fig. 6.2) are interior to the unit square, and the rank of C is assessed at t---3. It is worth noting that a formal X2 test for the significance of the residual canonical correlations gives only the first two canonical correlations as being non zero. The five correlations are 0.3836, 0.2746, 0.1340, 0.0507, and 0.0090, and the CV rank trace plot has, therefore, yielded additional information for this example.

I.C

0.8

0.6

0.4

O.2
2

;5

f
0.2 0.4 0.6 0.8 1.0 AS(t)

O0

Fig. 6.2. Plot of CV rank trace for example 2 on U.S. and European temperature records (r=s~5,n=516)

Assessingdimensionalityin multivariateregression

585

Example 3. L.A. heart study data These data were taken from Dixon and Massey [9, pp. 14-17], and consist of measurements on 200 men who were survivors of a group having had an initial examination in 1952 and who were re-examined in 1962. For this example, the variables in [9] were divided into two groups: a set of r = 6 (1952) X variables O.e., A, C, D, G, I, and J ) and a set of s = 4 (1962) Y variables (i.e., E, F, H, and K). Only those cases with L = 0 were used here, where L is coded 1 (or 0) if a coronary incident occurred (or, did not occur) between 1952 and 1962; this reduced the size of the sample to n = 174. The plot of the CV rank trace (see Fig. 6.3) yields an exterior value for the reduced-rank regression of rank one; all other points are interior to the unit square. The rank of C is assessed at [ = 3, which agrees with the appropriate X2 test for significance of the canonical correlations (namely, 0.6704, 0.5443, 0.4790, and 0.0932). Example 4. Fisher's iris data This is a classical data set of n----50 measurements on the r - - 4 variables, sepal length, sepal width, petal length, and petal width, of the species lris

A(%
tD o

0.8

0.6

0.4

0,2

0,2

0.4

0.6

0.8

fD

A~ (~)

Fig. 6.3. Plot of CV rank trace for example 3 on L.A. heart study data (r=6,s~4,n= 174)

586

Alan Julian Izenman

Am 2~E~

(-t) _5

1.0

0.8

0.6

0.4

0.2

o o ~--

, 0.2

--

, o.4

0.'6

0.'8

hO A G

(~)

Fig. 6.4. Plot of PC rank trace for example 4 on Fisher's iris versicolor data ( r = 4 , n = 5 0 )

A}

(t) l.c

0.8

0.6

0.4

0.2
5 4

O0

0.4

o:s

I A e ( n

>-

Fig. 6.5. Plot of PC rank trace for example 5 on Jarvik's smoking questionnaire data (r= 12,n= 110)

Assessing dimensionality

in m u l t i v a r i a t e regression

587

Versicolor. See, e.g., Anderson [2, Section 11.5]. The PC rank trace plot is given in Fig. 6,4, and the rank is assessed as t~= 1. The corresponding latent roots are 0.4879, 0.0724, 0.0548, and 0.0098, Example 5. Jarvik smoking questionnaire data These data were taken from Dixon and Brown [8, p. 624]. They refer to r = 12 answers to a smoking questionnaire administered to n = 110 subjects. Each question was coded 1 to 5 such that a high score represents a desire to smoke. The PC rank trace plot (see Fig. 6.5) shows an "elbow" at [ = 3. The latent roots for this example (which were calculated from the (12 12)-correlation matrix of the data, are given by 5.426, 2.997, 1.361, 0.560, 0.363, 0.302, 0.241, 0.200, 0.158, 0.146, 0.137, and 0.110.

7o Comparing gamma plots of multivariate residuals


The methodology proposed in Sections 4 and 6 for assessing the rank of the regression coefficient matrix used various summary measures resulting from each regression, namely the set of latent roots, the sequence of regression coefficient matrices themselves, and their corresponding residual variance matrices. The purpose of this section is to describe an additional method using the set of multivariate residuals from a series of reducedrank regressions. The residuals are the n s-dimensional v e c t o r s 1,2 ..... n, obtained from a reduced-rank regression of rank t. See Section 3 above. One method of comparing these vectors simultaneously is to construct a quadratic form for each vector in which the choice of 'compounding' matrix is positive-definite, but otherwise arbitrary. The n derived quadratic forms (for a given compounding matrix) may then be compared with each other (for example, through a linear ordering of their values). If M is some positive-definite matrix, the quadratic form

~t)j=

\ J

".1

'

(7.1)

converges in distribution to the random variable


~1 = )~(t)~Me(t) \ e j ( ]t -j fi
j 9

(7.2)

with distribution (7.3)


k=l

588

Alan Julian Izenman

where (#~0) are the latent roots of the matrix ~ M and X2 denotes an independent chi-squared variate having one degree of freedom [4, Theorem 2.1]. In the special case when M = (Z~0}- 1, (7.1) converges in distribution to a central chi-squared variate with s degrees of freedom. The distribution (7.3) is approximated here by a gamma distribution with density

g(x;

(7.4)

where A = ~(0 is the unknown scale parameter and ~/= 71(0 is the unknown shape parameter, both depending on the value of t. Estimation of ~(0 and ~/(0 is carried out by the method of maximum likelihood using the first K order-statistics of the n values,

f(")(et),f(")(e~ 0 ) ..... f (')(e(~).

('7.5)

The details may be found in [23]. Following the estimation of h(t) and 7/(t) in (7.4), gamma probability plots are prepared in the manner of Wilk et al. [22] using gamma quantiles computed from the estimates ~(0 and ~(0. Such a plot should resemble a straight-line configuration whenever the values (7.5) are from the estimated gamma distribution. If several of the largest values of (7.5) appear too large, or if a certain degree of 'curvature' is visible in the plot, then the assumption that all the values in (7.5) are gamma distributed may be invalid. For purposes of comparing several reduced-rank regressions (each having a different rank), it is important that the same number K, of smallest order-statistics be nominated for each value of t. Revised gamma plots omitting any 'overly-large' values might be made to check better agreement of the model to the remaining data. As long as the statistical rank of the regression coefficient matrix is larger than those values of t being considered, the corresponding gamma plots should differ markedly for different values of t. When t reaches this rank, the plots should cease to change significantly and should settle down. The characteristics of these gamma plots that yield information regarding degree of stability are: (1) the sequence of estimates of (~,,/), namely, (~(l),~(l)) ..... (~(S) ~(S)); and (2) the general 'shape' of the plots. For a given value of t, the gamma plot indicates the presence of outliers and any distributional peculiarities that may exist in the data. On the other hand, a comparative analysis of gamma plots for different values of t will

Assessing dimensionafity in multivariate regression

589

help to assess the value of t. For the latter type of analysis, the shape of each plot is, therefore, important only in so far as it allows two or more plots to be compared with each other. Choices of M include the (s s)identity matrix I s, and ( ~ ) } - l , for t = l , 2 ..... s. Results so far indicate that the estimates (2t(,~ (t)) are much smaller when I~ is used as a compounding matrix than when { l ~ ) } :-1 is used.

Example 3 (continued).
The gamma plots for the multivariate residuals are shown in Figs. 7.1 (M=I4) and 7.2 ( M = f k~ ( t h3 -1 ~ t=1,2,3,4); for these plots, the smallest ~ee
"

K = 150 order-statistics of (7.5) were used to estimate the gamma quantiles. Internal features of these plots show a lack of near-zero values and a possible outlier, as well as evidence of non-normality in the residuals. However, comparisons of the plots over the four values of t reveal that the configurations of the quadratic forms in the residuals change much less markedlY following t = 3 than for any previous value of t, again suggesting that t = 3.
26.0 (a) 20.6 15.6
oo

21.7 t=l
M=I s

17.4 13.0

(b)

t=2
M= Is

oo

10.4 5.21
03 W

/
I

~o

8.69 4155
I .... I __

D -J
Q

0.01~ 0.013

5.21

10.4

156

20.6

260

0.011 0.011

I__. 4135

L 8169

J 1310

17.4

21.7

13.7
w
D Or"

15.6
(c)

t=3 M=Is oo 10.9 8A9 5.46 2.74

(d)

t =4 M=Is
oO g oo o

II.0 8.22

~o o o

5.49
2.75 0.015 0.015

275

5'.49

8'.22

I1.0

.....

00]~ 13.7 5.46 0.013 2.74 GAMMA QUANTILES (x!~ 3)

8.19

109.

156

Fig. 7.1. Gamma probability plots of observed residuals for Example 3

590 16.0
(a)

Alan Ju~anlzenman

--

t : l

~8.0 (b)

t=2
0 O0 0

15.0 98

. . - [L~] r.,~ H)L-p ivl


8

14.olIIC 72 3.7 /

J
6,6 ~

~
-~
la.i G: i.iJ

33
O.ll _ __

,",

0.11

3.3

6.6

98

130

012 _ _ 012

i __ 3.7 Z2

II1.0

14.0

18.0

18-

(c)

t:3
^ (3) -I

15o
I1.0 7.4 57 011 0.11 3.7

,5o
o oo*o o

M:I .t

" (4) -j

o o

,,o I
7 5.7 3 ~

fo

Z~4

I1.0

15.0

18.(3 011 5.7 GAMMA QUANTILES

74

II.O

t5.0

18.0

Fig. 7.2. Gamma probability plots of observed residuals for Example 3

References
[1] Anderson, T. W. (1951). Estimating linear restrictions on regression coefficients for multivariate normal distributions. Ann. Math. Statist. 22, 327-351. [2] Anderson, T. W, (1958). Introduction to Multivariate Statistical Analysis. Wiley, New York. [3] Bartlett, M. S. (1947). Multivariate analysis. J. Roy. Statist. Soc. (Supplement) 9, 176-197. [4] Box, G. E. P. (1954). Some theorems on quadratic forms applied in the study of analysis of variance problems, I. Effect of inequality of variance in the one-way classification. Ann. Math. Statist. 25, 290-302. [5] Brillinger, D. R. (1975). Time Series: Data Analysis and Theory. Holt, Rinehart and Winston, New York. [6] Dempster, A. P. (1971). An overview of multivariate data analysis. J. Multivariate Anal. 1, 316-346. [7] Dixon, W. J. (1968) (ed.). BMD Biomedical Computer Programs. University of California Press. [8] Dixon, W. ,l. and Brown, M. B. (1977) (eds.). BMDP Biomedical Computer Programs, P-Series. University of California Press.

Assessing dimensionatity in multivariate regression

591

[9] Dixon, W. J. and Massey, F. J. (1969). Introduction to Statistical Analysis (third edition). McGraw-Hill, New York. [10] Eekart, C. and Young, G. (1936). The approximation of one matrix by another of lower rank. Psyehometrika 1, 211-218. [11] Frets, G. P. (1921). Heredity of headform in man. Genetics 3, t93-400. [12] Gnanadesikan, R. (1977). Methods for Statistical Data Analysis of Multivariate Observa.tions. Wiley, New York. [13] Izenman, A. J. (1975). Reduced-rank regression for the multivariate linear model, d. Multivariate Anal. S, 248-264. [i4] Kshirsagar, A. M. (1972). Multivariate Analysis. Marcel Dekker, New York. [15] Rao, C. R. (1952). Advanced Statistical Methods in Biometrie Research. Wiley, New York. [16] Rao, C. R. (1965a). The use and interpretation of principal component analysis in applied research. Sankhy~ Ser. A 26, 329-358. [17] Rao, C. R. (1965b). Linear Statistical Inference and its Applications. Wiley, New York. [18] Rao, C. Ro (1978a). Matrix approximations and reduction of dimensionality in multi variate statistical analysis. To appear in: P. R. Krishnaiah, ed., Multivariate Anab, sis V. North-Holland, Amsterdam. [19] Rao, C. R. (1978b). Separation theorems for singular values of matrices and their appfications in multivariate analysis. Technical Report No. 78-01, Department of Mathematics and Statistics, University of Pittsburgh. [20] Rao, C. R. and Mitts, S. K. (1971). Generalized Inverse of Matrices and its Application~. Wiley, New York. [21] Robinson, P. M. (1973). Generalized canonical analysis for time series. J. Multivariate Anal. 3, 140-160. [22] Wilk, M. B., Gnanadesikan, R. and Huyett, M. J. (1962a). Probability plots for the gamma distribution. Technometrics 4, 1-20. [23] Wilk, M. B., Gnanadesikan, R. and Huyett, M. J. (1962b). Estimation of parameters of the gamli~a distribution using order statistics. Biometrika 49, 525-545. [24] Williams, E. J. (1967). The analysis of association among many variates. J. Roy. Statist~ Soc. Ser. B 29, 199-242.

P. R. Krishnaiah, ed., Handbook of Statistics, Vol. 1 North-Holland Publishing Company 0980) 593-615

| Q
1

Parameter Estimation in Nonlinear Regression Models


H. Bunke

1.

Introduction

Whereas the statistical theory of parameter estimation in linear models is almost completely developed, many problems are unsolved in the nonlinear case. Of course, a similar complete theory for finite sample size can be hardly expected. The geometrical concepts and the theory of exponential families, useful in the linear model case, break down because of the structure of the parameter range. In applications it should be always investigated, if a linear model is sufficient as approximation, because then the statistical treatment is much more simple. But in many cases just the nonlinearity contMns essential statements for the scientist. For instance in a linear approximation the clear interpretation of parameters may be lost. The model choice has to take into consideration the complete scientific knowledge about the system and consistency with the data. Previous data transformations are often useful, but then the plausibility of error assumptions, which are often violated by the transformation, must be carefully checked. This needs the experience of the statistician or must be considered with proper caution in the development of statistical computing systems, because there the influence of the experienced statistician is diminished. As for linear models the least squares method occurs as the most important estimation method, because many favourable properties known from the linear theory can be at least asymptotically assured. The solution of the corresponding nonlinear optimization problem requires in general iterative methods. In concrete case studies the selection of an appropriate procedure from the large set of iterative methods proposed in the literature, the choice of starting guesses, incremental change and step size requires a lot of experience. We consider random variables Yt, t = 1,2 ..... n, which are considered as observable and satisfy the equations
y,=f(xt)+e,, t = l , 2 . . . . . n.

(1)

593

594

H. Bunke

et, t = 1,2 ..... n are independent r a n d o m variables with Ee, = 0 and


Vare, = ot2, (2)

and the regression function f : 9()-~R 1 is a real function. The regressor values or design points x t vary in a subset % of an Euclidian space, f is unknown but a parametric set

9= (AIO Go)

(3)

is given wiflif=fo for a certain 0oG. H e r e f o is for fixed v ~ G a given real function defined on % . - O c R l is the parameter set, a n d v~ o the true parameter. If fo is nonlinear in 8, then (1) is called: nonlinear regression model. We use the notations ~: = ( x 1..... x,), f~:=(fo(xl) . . . . . f o ( x n ) ) T and ~ : = {f~[~ GO}. If 2~ is a linear subspace of R ' , (1) describes a usual linear model, and if ~ is a convex subset of R n, (1) is called a convex model (compare [1]). Surveys on statistical methods in nonlinear regression models and related problems are given in [2, 3, 29].

2. Examples
2.1. Empirical growth curves
Empirical growth curves arise in m a n y chemical or biological problems where a size u is observed during the time t. The growth rate d u / d t is supposed to be proportional to the size already achieved and the difference between u and an assumed achievable size b : d u / d t = au(b-u). Integration, taking logarithms: y = l n u and reparametrization lead to the response model with:

fo(x)=a-ln{l+e-X+l~x},

v~= (a,X,~).

This model was investigated b y e . g . J . A . Netder [4, 5].

2.2. Exponential models


For m a n y purposes a mixture of exponential curves is quite appropriate to fit the data:
P

fo(x)=~o + E aie~'x, O = ( a o , a, ..... ap,fll ..... tip).


i=1

Parameter estimation in nonfinear regression models

595

For the simple case p = 1 special parameter estimation procedures have been developed (look e.g. [6, 7]).
2.3+ C o o b - D o u g l a s models

In various economic contexts, in study of production or demand func.+ tions, occurs the response function i,, ( x ) = o+Xl ,X .... x

+,.+=(,,+,< ..... g ) .

In this approach the error is sometimes also specified to be multiplicative. For detailed discussion look e.g. [8, 9].
3+ I~ast squares estimation

The least squares estimation is the main estimation method in nonlinear regression. If f~ is to be estimated, it seems reasonable to use as estimators projections f~ of y=(y~ ..... Yn)' on ~+ with respect to a norm
] 1/2

with weights w t > O, t = 1. . . . . n. This means, it holds

'11~n(Y) : W l y DEFINITION 1.

= mi

,Wly-

(4)

An estimator "In for a parametric function y.(#) is called

weighted least squares estimator (WLSE), if ~n= "~n(y)='/(#n(y)), where ~n(Y) is a solution of (4). If w, = 1 ( t = 1..... n), ( w = w0), ,~n is called ordinary least squares estimator (OLSE). If % = a , -2, ( t = 1..... n), ( w = % ) "~n is called generalized least squares estimator (GLSE).

If e t ~ ( N ( O , at2), t = 1 . . . . . n, then the GLSE is a maximum likelihood estimator. For the computation of WLSE iterative procedures are needed which mostly solve the normal equations w,(y,-A(xt))U=o,
t=l

596

H. Bunke

where

Generally, WLSE is biased. M. J. Box [10] has given a rough approximao tion of the bias under

et~N(O,o 2)
(l~n E --~o )

( t = l . . . . . n)~ "~ 1 T

w =w0: F/Ttr i=1 ~ 1 T '

"~' -- "2-

(5)
where

Because unbiasedness is not generally a property of the nonlinear estima~ tions the concept of estimability used in linear models (e.g. [34]) is no longer helpful. For the asymptotic properties discussed later we need an asymptotic identifiability (compare A6 and A 7 in [6]). For the convergence of iterative procedures (compare [11] or [12]) the uniqueness of the solution of the normal equations is important. A technique is presented by Bird and Milliken [13] to determine parameter functions 7 which are invariant to the solution obtained from the normal equations. If the structure of the regression function is too complicated we may look for an approximation in a given class of eventually more simple functions { g#,~ EO}. (Here we use the same notation for the parameter, as no confusion is possible.) Then in (4) we replace f~ by g~ and obtain: #.: Wly - gd.(y)[. = min ~]y -g~l.
~EO

(6)

DEFINITION 2. g- : = g~.(r) is called weighted inadequate least squares approximation of f (WILSA), if ~.(y) is a solution of (6). The corresponding parameter estimators ~. are called WILSE, GILSE and OILSE. The formulation as an inadequate least squares problem may be needed, because of the fact, that the numerical treatment may be much more simple or, looking on statistical efficiency, it could seem to be reasonable

Parameter estimation in nonlinear regression models

597

to diminish the dimension of parameters. However, the investigation of WILSA is of basic importance taking into consideration that the model choice problem in the nonlinear case is difficult, such that in applied situations the assumption " f E 65" is mostly valid only as an approximation. If in g+ there are linear parameters present, that is

g+(x) =,~The(x> (O=(,~,Z)e#)=av~,xe%),

(7)

with ha: % ~ R p, the dimension of the least squares optimization problem can be diminished (see e.g. [14, 15]). With P~ =H~Hff, Hff =

(H~DwH~)-'H~D +,
t~ 1,,..,n

HB=((hB(o(xt)))i=l

..... p'

D+=diag[w l.... ,wn],

(6) can be written as

"'IY -&[. = min "iY - P~Y[. = ' i Y -

P(y>Yl,,

+o=(<,A),

(8)

However, it may happen, that in spite of the lower dimension the problem becomes more complicated for the numerical treatment, because of the complicated dependency of Pa on 13. In such cases it is easier to include the optimization on a in the iteration. EXAMPLE. We illustrate this by the simple example

fa(x) = ao+ a,e ~+`+.


Criterion (8) takes the form: min E [ Yi - g+o(fl ) - al( fl )e- flxi] 2 B where

gto(fl)= ~[ ~ y i - 8 , ( f l )
and

~ e ~+']

598

H. Bunke

4.

Linearization

The problem of approximating a nonlinear response function f in a given linear set of functions qb may e.g. arise by expanding the real function f in a truncated Taylor series about x0:

fo(x) ~So (Xo) + [(x -- Xo)'r grad~ =~o] So(x)


+'" +~. [ ( X - X o )
k
1 T

grad,=/o] f~(x)

=ao...o+ ~ aio...o(Xi-Xio)
i=1 k

+ ~ a,jo o(Xi-- x,0)(xj--Xjo)


i,j= 1 i<j k "+" E i=1 k

aiiO,'.o(Xi--Xio) 2"i . . . .

-t- E i=l

ai.,.i(Xi--Xio) r"

(9)

By interpretation of ai,,.2 .i, as new parameters in the model and extending their range {aili=...ir(~)iv~ ~O} to R 1, we get a linear model. is generated by the polynomials in (x i - Xio ), i = 1.... , k up to the order r. There are disadvantages because the physical interpretation of the parameters is lost due to the extension of the parameter range in (9). Very often in regression practice the regression function can be transformed into a function, which is linear in the parameters after some reparametrization (see e.g. [16]). If we consider as example an exponential growth curve f ( x ; a , b ) = a eax, then we obtain the linear function l n f = a + bx with a = l n a . For the C o b b - D o u g l a s function f ( x ; a , b 1.... ,bk)= ax~ . . . . x~ k we obtain the function l n f = a + ~ = , b i l n x i which is linear in the parameters. In some physical problems the function a/(1 + bx) occurs which can be transformed by 1 / f to a linear one. We assume, that there is a real function T with continuous derivative of second order, such that Tfo(x)=[a(o)]Tg(x), where g is an m-dimensional vector function on % and the parameter ~ is identifiable from the vector a(,}). This means, that there is a function cp with ,}= cp[a(~)]. It is a widespread use in applications to transform the model formally into

Tyt= Tfo(xt)+rl ,,

t=l,...,n,

(10)

without taking care of possible violations of the error assumptions. In general there is then no consistency or other good statistical properties of

Parameter estimation in nonlinear regression models

599

the least squares parameter estimation in this "wrong" linear model (comp. e.g. [9]). In [17] and [18] the construction of consistent estimation using no other nonlinear technique as the transformation T is discussed. F o r this method an appropriate choice of the design is needed. The idea is to make the error terms in (10) ,m, = +,,]
-

as small as possible. This is reached by replicated observations in x t and using the transformation T on the vector (Yl . . . . . fn) of averages )7t of the observations at x t (t = 1. . . . . n).

5.

Polynomial

approximation

The approximation of f by interpolation polynomial was treated in [19] and [20]. To describe the quality of the approximation by the polynomial depending on the vector of observations y we introduce the risk

R(~) : = E IIf-Wl[ 2,
where

IIhll2 : = (h,h)
and

(h, k):

--

h(x)

(x)p(x)dx,

with a nonnegative function p defined on ~ a n d with


gcp(x) d x = 1.

We assume % = [a,b] to be a b o u n d e d closed interval in R l, and that there are a q > 0 and a nonnegative n u m b e r p, such that

c ~ = { f ~ c"(%)llf<")(x)-f<")(y)l <q[x - y l ; x,y ~ ~A}.


Further we assume that nj = n~j observations with same expectations f ( x j ) and variances o2"rj ( j = O ..... s,,, Y.~."__0~j = 1) are made. Therefore there is a

600
partition
Sn

H. Bunke

j=O and a set of different design points (xlj = 0 ..... s~) (spectrum of ~) with x t = x j, a 2=tr2~) for tETy.

We look for an approximation polynomial f~,, of order G which fulfills tile unbiasedness condition

E~dxO=f(xO, U = o
Let be
Z. m. (Z(O) . . . . . Z(s~)) T,

. . . . . s.).

(11)

z(j) = nZ ~ ~

}ET.,

Yl.

Then we should restrict our attention on polynomials fs. with coefficients depending on y. The class of such polynomials with (11) is denoted by @. and, if the dependency on y is linear, by @~. In the space of polynomials of order s. we consider an orthonormal basis % ..... %. with respect to the scalar product ( . , . ) and we introduce the following denotations:

"~0,...,S

.: /~. : = ~212",

......Z,

THEOREM 1. ([20]). The best approximation o f f in @1 is given by f. =/3~ep: R(f.) = min R(f).

In the case of normality @~ can be replaced by @..


An important problem for the approximation of f is the choice of the order of the polynomials s. for given sample size n. The aim is to make the

Parameter estimation in nonlinear regression models

601

risk as small as possible. But at present it is only possible to calculate an upper bound for R(f,,). For this we need the following assumptions: (I) Let be x s, as from the Gaussian formula
Sn

f f(x)

(x)dx=
j=O

J(xJ),

and let the numbers n) be given by


Sn

nj= nafrj/'r,

"r= E ajzj. j=0

(12)

A n d more specially (II) % = [ - 1, 1], p(x)=[rrV(1 - x2)] -1, a2= a2 ( t = 1..... n), xS=cos[(2j+ 1)~r/(2s,, + 2) ] (roots of the Tschebyscheff-polynomials), aj= - j = 0 . . . . . sn. sn+l ' THEO~M 2. ([20]). Under the introduced assumptions and (I) it holds: (13)
1

R . ( ) < R. = W(epq)2s; zp-1 + (s. + 1)o2~'n - ' , where w = 3 + 2k/2 and Cp: Jacksons constant

cp=

6p +'pp(p + 1)(b - ay +' p!

Minimization of R~ with respect to s, gives


re_in Rn = dp(wCpq) l/O +P)[ "/'02/n
Sn

](2p+l)/(2p + 2) .~=.ro2/n

= O ( n --(2p+ l)/(2p + 2))

and the optimal sn is s* = (2p + 1)( [3 + 2 # 2 ] Cpq)'/(2P+2~[n/'ro2] 1/(2p+2)


= O ( n 1/(2p + 2)).

602

14. Bunke

Under (II) (13) holds with


Rn := k(cpqlSff) 2"- (2Sn + 1 ) 0 2 / n " An optimality property for the numbers nj from (12) is proven in [19] under (II). If

lirnoo I f , ( x ) - f ( x ) [ = O, x ~ %,
where f~=qo'fl is the interpolation polynomial with f ~ ( x J ) = f ( M ) 0 ..... s), and
s,*

(j=

lira --1 Z ~j2(x) =0,


n---~oo n j = O

x~')C,

then lirno~ E [ f , ( x ) - f ( x ) l 2-- 0, is proven in [20].

6.

Consistency of least squares estimators

From the linear theory it is already well known, that for the consistency of least squares estimators some conditions on the sequence of the design points are needed. By E. Malinvaud [22] simple examples of inconsistent nonlinear least squares are given. The first papers on asymptotic properties of nonlinear least squares were the papers of R. I. Jennrich [21] and E. Malinvaud [22]. The multivariate case brings no further, essential difficulties into the asymptotic considerations and was discussed e.g. by E. Malinvaud [22], W. A. Barnett [23] and V. V. Fedorov [24]. W. A. Barnett [23] discussed the conditions under which the iterated GLSE locates a consistent local maximum of the likelihood function. K. C. Chanda [25] investigated the efficiency of least squares estimators relative to maximum likelihood estimators. H. Bunke and W. H. Schmidt [26] extended the results of [21] to the case of not necessarily identically distributed errors, and robustness against model errors and two step estimation is treated. The case where the sequence of errors is generated by a stationary time series was investigated by E. J. Hannan [37].

Parameter estimation in nonfinear regression models

603

In this chapter we follow the presentation of [26], including results of S. Zwanzig [27] on the consistency of ~, in inadequate models. We consider again the model
yt=f(xt)+et

(14)

with an unknown regression function f : %--->R 1. We investigate the consistency of W I L S A gn and W I L S E ~ , from definition 2. At first we introduce some notations. F o r l , k : %--~R ~ we define
(l,k). ="(l,k)n := n
t=l

and

Wlll%=W(l,O.,
where w: = ( w ~ n ) [ t = 1,2 ..... n; n = 1,2 .... ) is a double sequence of positive r a n d o m variables with s, : = max [w~n) - utt-->O a.s.,
l~t~n

where u = (u t, t = 1,2 .... ) is a sequence of positive constants with 0 < ~ < u t < p < o0, t = l , 2 . . . . . F o r y E R " : Wly __ 112: = n - 1 f ~ w~n)(y t __ l( xt) ) 2.
t=l

For l : % - - > R p,
(n)

- [ ( w ( l k ~ ~'~J=l ..... q k:~---~Rq:W(l,k)n: -~ ~ i, J : n H i = I ..... p"

The W I L S A g~. is solution of the optimization problem:


rain Qff(v ~)

with
:= lytn"

N o w we allow that g~ depend on n to include approximation procedures depending on the sample size and the weights m a y be random, such that two step procedures with estimated ~r72 as weights are possible. We need the following assumptions: A.1. e t, t = 1,2 .... are independent r a n d o m variables with E e t = 0 , EeZt = :o~ and it holds either (a) or (b)

604

H. Bunke

(a) et, t = 1,2 .... are identically distributed with % = o. (b) The e t fulfill a modified Lindeberg condition: o t >17 > 0 for all t and
sup f x 2 d t ' ; ( x ) ----> 0, t ]x]>c c--,~

where F t denotes the distribution function of e r Further: a t < o < o e ,


t= 1,2,...,

n--~oo t = l

lim ~

t-2Eea<oe

and

n -l ~ u, o2,,~oorw>O.
t=l

A.2. o T = ( a T , flT) E O = R p ~ , ~

compact subset of Rm, g(")(x,v~): =

ctXh(")(x,/3)(h(~): % ~3 -->RP), where it is required that

sup "1 h}~") - hal..~---~oo 0 BEg for a function h : % 63~--~R p, ha :h(.,/3), which is continuous in /3 for fixed x ~ %. Then go : = aThp A.3. If ~C denotes the set of functions {f, ha(0li= 1..... p ; B ~
}
(hfl(i)" ith component of ha)

then for all h , k E ~ there exist real numbers " ( h , k ) with sup l U ( h , k ) , - " ( h , k ) l - - , O .
h,k~%

For all fl ~ ~ "(hB, h~) is a nonsingular matrix. A.4. There is an unique solution ~ f E O of

min " l f - go[2="lf - gojlz= : @


A.5. f = goy for v ~y E O, where go = aTh~ A.6. For all v~,v~'~O it holds U l g o - g o , [ = 0 iff v~=~ '. A.7. Let 7 be a continuous function: O---~F. For all 0 , # ' E O U [ g o - g o , [ = 0 iff y(v~)=7(~').

it holds

Parameter estimation in nonlinear regression models

605

The following theorem gives us the consistency of WILSA, WILSE and WLSE. THEOREM 3. We assume that A. 1, A.2 and A.3 are fulfilled. 7hen: (1) [26] "" n m , ~ w j~ - g d .(") = l i m , ~ o u [ f - g @ = A y a.s. (consistency of WILSA).
(consistency of WILSA).

(consistency of WILSE). (3) [26] Under A.5 it holds: Ay=0


and l i m w "

(4) [26] Under A,5 and A.6 it holds': O n l yaz a. s.


Under A.5 and A.7 it holds:
%~v(Os) a. s.

(consistency of WLSE). The first both statements of Theorem 3 establishes a stability of nonlin.o ear regression against model errors. Asymptotically the best approximation function under all admitted functions will be reached. Statement 3 gives the possibility, to construct for special models consistent estimators for variance parameters using appropriate weighting sequences w (comp. [26]). Statement 4 gives the consistency of weighted least squares and includes the consistency result of R. I. Jennrich [21].

7,

Asymptotic distribution of least-squares estimators

For results concerning the asymptotic distribution of v~ n further conditions must be fulfilled.

606

H. Bunke

A.8. h has derivatives of 1st and 2nd order with respect to fl which are continuous in f l a n d A.3 is true for the extension ~1 of ~ which includes the components of these derivatives, h (') has derivatives of 1st and 2nd order with respect to fl which are continuous in fl and it holds:

n 1/2 sup
nllZ sup
n ~~

"lh#(o. Oh(n) tl(o

)1--+0 ,

Ohm( --)0,

a3t

03~

i = 1..... p; l= 1..... m.

A.9. For all i , j = 1..... p + m and k=Ogo/OOIo=ar

co.(u ) = n -lim n - l Z a~u~Zki(xt)kj(xt) -) ~


exists and C(u)= ((cij(u))) is nonsingular. It holds

Lo,
t~l

#y is an interior point of . A. 10. The matrix


~2 .xi~l,

,m+p

\\

i j

I#=O$/]j=I

..... m+p

is nonsingular and sup, (nl/Z)W(k,f - go), < ~ a.s. The following theorem establishes the asymptotic distribution of the parameter estimations and approximations. THEOREM 4. We assume that A. 1, A.2, A.3 are fulfilled. (1) [26] Under A.5, A.6, A.8 and A.9 it holds with B - l ( u ) = ~ ( k , k ) (B is

assumed to be nonsingular), M ( u ) = B(u)C(u)B'(u): { n'/2(~,-Of) } ~ N[ O,M(u) ].

(2) [27] UnderA.4, A.8, A.9 and A.10 it holds

--~ N [ 0 4 [ G ( , u ) ] - l C ( u ) [ G ( u ) J - ' ] .

Parameter estimation in nonlinear regression models


REMARK.

60'7

Because of
G(u) = 2[U(k,k) .... ( f -- go,,g~,) ]

under A.5 and A.6 the statement (1) of Theorem 3 is a consequence of (2). The statement (1) is a generalization of a result of R. I. Jennrich [21]. These results can be used to construct asymptotic tests for model adeo quacy, model choice procedures and confidence bands for the regression function f (see [31]).

8,

Asymptotic optimality of GLSE without normality

As in linear model case different optimality properties of GLSE ~n can be proven assuming normality for the error variables e t or not. Of course the characterization of estimators b y linearity is no longer reasonable and will be replaced by the property to be a solution of a weighted least squares problem. We state the optimality of GLSE in a class of WLSE's in the sense of semi ordered asymptotic covariance matrices. We denote by W the class of all weighted sequences w which fulfill the assumptions for (1) in Theorem 4. THEOREM 5 [26].
I f wo: = ( W t = % - 2 , t = l , 2 .... ) E IV, then

M(wo) <M(u(w)),

w e W.

Usually the optimal weighting sequence is unknown, but can be estimated under some regularity conditions by the aid of least squares under several different weighting sequences. This is discussed in [26], specially the case of grouped observations is considered. Let 6t(n) be an estimator of ot. Then with ~ : ={W}~)=(~}n)) -2) we define the "two stage WLSE" #n minimizing 6 [ y - g~(")[n" COROLLARY. I f Wo E W and ~t n) is consistent in the sense
sup
t

- - : - o,-:1-*o a. s.,

and

t=l

then it holds M ( u f v ) = M(wo)

608

H. Bunke

In the case that the error variables e t are normally distributed the G L S E is m a x i m u m likelihood estimator and stronger optimality properties m a y be proven.

9o Maximum likelihood estimation A comparison of least squares estimation and m a x i m u m likelihood estimation is done by K. C. C h a n d a [25]. The efficiency of m a x i m u m likelihood estimator is proved applying the results of R. R. Bahadur [28]. We describe now this result and some results from H. Bunke a n d W. H. Schmidt [26]. We consider the model y, = foo(Xt) +e,, (15)

with f O o E ( f o [ O ~ O } , where is a /-dimensional open interval, e,, t = 1,2 .... are independent r a n d o m variables with
E~. t = 0

and

Vare t = atz

We denote the density of e t by p t ( e ) and lt(e ) : = log p,(e),

1}i)(e) : = d'It(e) de i

Let be k~(t)= Ofo(xt)lOv~. We assume that the following regularity conditions hold. B. 1. For each t, fo(xt) i s partially and continuously differentiable at least three times with respect to ~ and these derivatives are bounded, uniformly in ~ b y a constant 0t with lim n 1

n - q , OO

i
t=l

lot ( M

for some r > 6.

B.2. For each t = 1,2 . . . . . Pt is independent of 0 and is continuously differentiable at least three times almost everywhere in ( - c e , ~ ) . Further for all ~ E (9 and t = 1, 2 . . . . . for all y and i = 1,2

[p~i)(y _ fo(xt))[ < Q , ( y )


where Qt is independent of ~ and Q t ( y ) , y Q t ( y ) are Lebesgue integrable over ( - ce, m).

P a r a m e t e r estimation in nonlinear regression models

609

B.3. There exists a sequence { / ~ t , t = l , 2 , . . . ) of positive n u m b e r s with limn__,~n - l]~,t. 1/L~< M such that for all t = 1,2 .... a n d for some 8 > 0, not depending on t

<,,.
B.4. Let ( dt, t = 1,2... } be a n y sequence of real functions o n 0 such that for every t,d t is b o u n d e d uniformly in a9 b y a constant ct with l i m n ~ n - 11~7=1c[ < M for r > 6. Moreover, let there exist measurable funco tions R t > 0 i n d e p e n d e n t of ~ a n d constants vt such that

E((R,(e,)/+"} --.,
for some 8 > 0 , with llmn~oon ' -1 ~ t = n l P t 2 <M, t -- 1,2 . . . . . for almost all e a n d 1 < i < 3 a n d for all ~ @ , every

dt)l
and for w" = {'r), t =

B.5. Let "c?=E((lt(1)(et))2 }. T h e n for all # E O , 1, 2 .... } the matrix

Wo - ~ -

(ke, ko)= nlim n -~ ----> oo

%Zko(xt)k~(xt)
t=l

exists and is nonsingular. The likelihood e q u a t i o n is given b y

t=l

~, O l t ( Y t - f ( x t ) )

~i

=0,

i=1 ..... L

(16)

THEOREM 6 [25]. Under B . I - B . 5 there exist unique consistent solutions ~n(Y) to the equation (16):

On---~agoa. s. Further:
(nl/2(~n--V~o) )

(consistency of M L E ) .

n__>----) N [ 0 , Woo].

The next t h e o r e m establishes the efficiency of least squares estimators and m a x i m u m likelihood estimators.

610

H. Bunke

THEOREM 7 [25]. For any estimator ~* with ~{ Vn( 0 . - ~o)} = ~ N[0,A],

where A is positive definite and continuous in ~o, it holds Woo <<. A under
B.1-B.5.

Theorem 7 means that the maximum likelihood estimator 4, is a BAN (best asymptotically normal) estimator of ~0. Obviously under the conditions of Theorem 6 and et~N[O,o 2] the GLSE ~q, is BAN. Consequently the same is true for the two stage estimator ~, if additionally the conditions of the corollary to Theorem 5 are fulfilled. Let the conditions of Theorems 4 and 6 be fulfilled and et~N[0,ot]. It is interesting to look for conditions under which the OLSE is BAN (M(wo)= W,%,wo = { 1, 1.... }). It holds from Theorem 4 with

kT(xO

Ofo

x.= kT(x.)

O=Vao

and Y~.= Diag[alz..... o.21:

M(wo) = W(k, k)-~ C(wo)WO(k,k)-'


= (limlxTx.)-I l i m n X . X . X , ( l i m n X . X. )
1 T 1 T -1

This matrix is equal to Wo0=(limn- 1xTz n IX,)-~ if for all n:

( XJX.) -1XJ~nXn(X2~Xn) -1= ( X : ~ ; 1Xn)- 1.


(17) is fulfilled iff:

(17)

x.

(18)

where ~ ( X . ) denotes the linear space generated by the columns of X. (see [30]). Consequently, if (18) is fulfilled, the OLSE is a BAN estimator.

Parameter estimation in nonlinearregressionmodels

611

EXAMPLE. Let us assume an exponential regression function and replicated observations at different points xo) and x(2):

yt=aeBX(')+et,
rl=(l
n n--,~

e , ~ N [ O , a2(i)], t ~ T i ,
r 2 = ( n m + 1..... h i + n 2 ) , i=1,2, x0)#x(2),

i=1,2,

..... hi),

/-z i - - --~ v i > 0 ,

from a bounded closed interval. All previous assumptions including (17) are fulfilled.

10,

Robust nonlinear regression

Because of the fact that least squares estimates are highly sensitive to gross errors in the observations W. G r o s s m a n n [33] following P. J. Huber's [32] concept of M-estimates proposed estimators which are obtained by minimizing the following expression:

v~K: ~eemint=,= PK(Yt--f~(xt)), where OK is a less rapidly increasing function:

(19)

OK(z) = - ~ K]z]

for z < K ,

K2 - ~

for Izl > K .

(20)

We consider the model

y,=foo(Xt)+et,

OoC,

where O is a compact subset of R l and et are i. i. d. variables with

Ee t = 0

and

Vare t =

0 2.

We need the following assumptions: C. 1. The distribution function F of e t is continuous and symmetric, and there exists a constant K 0 > 0 with F(Ko) - F ( - Ko) > O.

612

H. Bunke

C.2. fo(x) is continuous and monotonic in # for fixed x and all partial derivatives with respect to # to first and second order of fo exist and are continuous and the limits

exist. The matrix Wo(k,k) = B - 1 with k = 0f//0~ THEOREM 8 [33].

I~=Oois positive definite.

Assume that C.1 and C.2 hold. Let be K >~ K o and {K~) a sequence converging to K. If ~ffo is for each n a solution of (18) with K= K n, then

~ { n 1/2(~/~ -- l~0)) n~ N[O,A(K,F)B],


where A (K, F ) = [/ff'~(z) dF]/[fql)(z) d F ] 2 and
%A:) =

Let us establish further assumptions and notations. C.3. There exist a sequence {O,} of estimates of o, which is consistent, shift and scale invariant. K is allowed to vary only in some interval [Kw, K2o ]. With the OLSE O, there is an estimate for A(K,F) given b y

Let be K n and K* the solutions of


Kn : min(

An( K,.F)lK e [ K16,,,KzO. ] }

and K*: min{ A( K,F)[K E [ Kw, K2o ] }.

THEOREM 9 [33].

Assume that C.1-C.3 hold. Then Kn--->K* and

(~{ Vn('l~nKn--'l~o)} n___,---~ N[O,A(K*,F)B].

Parameter estimation in nonlinear regression models


11. Confidence regions

613

We consider again the model


Yt=fOo(Xt)+et,

f~o~ { folv~ ~ 0 ) ,

where e t are i. i. d. variables with c R l and


E e t = 0,

Var e t = 0 2.

We take tile OLSE ~ (w = Wo= (1, 1.... }). Under the assumptions of the Theorem 4 we m a y construct the asymptotic a-confidence region for 9 o
,,(,1~n ---,oo,)Tan l(l~n)(~n __ 0,) '~ 02X2;a,, '

(21)

where

e#
and

L,

~rn --

QB

Qn(ag) = n.

wo

lY - f ~ 1,,.

~2

Under the additional assumption e t ~ N [ O , 0 2] several approximative confidence regions has been proposed using linearization techniques. If fo is linear in ,9, then
Qn(Oo)-Qn(~n)

l<
corresponds to the likelihood ratio statistic and follows an F-distributiono In the general nonlinear case "Q,(v~) ^ - z .... -- Q n ( ~ n ) < lanl~l,n--l;a (22)

m a y be considered as approximative a-confidence region. E. M. L. Beale [35] used this region corrected by a certain nonlinearity correction term. Using a truncated second order Taylor expansion of the left hand side of (22) to get a more convenient shape of the region it follows:
agn

--I.~ ) I~n(~n)(~n "--~ ) <2l<I4l, n _,;a

,,

(23)

614 with 02

H. Bunke

w h i c h w a s p r o p o s e d b y G . E. P. B o x a n d G . A. C o u t i e [36]. L i n e a f i z a t i o n of t h e m o d e l t a k i n g a first o r d e r t r u n c a t e d T a y l o r e x p a n s i o n of f l e a d s to the a p p r o x i m a t i v e a - c o n f i d e n c e r e g i o n

"(O n^-t}) TBn_ l (l}n)( ^ On " __ t~ ) <

l(fr:l~l,n _l;a,, '

(24)

w h i c h was p r o p o s e d b y S. M. G o l d f e l d a n d R. E. Q u a n d t [8].

References [1] Humak, K. M. S. (1977). Statistische Methoden der Modellbildung, Band I. AkademieVerlag, Berlin. [2] Cox, D. R. (1977). Nonlinear models, residuals and transfo1~nations. Math. Operationsforsch. Statist., Ser. Statist. 8, 3-22. [3] Bunke, H., Henschke, K., Striiby, R. and Wisotzki, C. (1977). Parameter estimation in nonlinear regression models. Math. Operationsforsch. Statist., Ser. Statist. 8, 23-40. [4] Nelder, J. A. (1961). The fitting of a generalization of the logistic curve. Biometrics 17, 89-110. [5] Netder, J. A. (1962). An alternative form of a generalized logistic function. Biometrics 18, 614-616. [6] Rasch, D. (1967). Sch/itzprobleme bei eigentlich nichtlinearen Regressionsfunktionen. Abh. Deutsch. Akad. Wiss. 121-128. [7] Agha, M. (1971). A direct method for fitting linear combinations of exponentials. Biometrics 27, 399-413. [8] Goldfeld, S. M. and R. E. Quandt (t972). Nonlinear Methods in Econometrics. NorthHolland, Amsterdam, London. [9] Goldberger, A. S. (1968). The interpretation and estimation of Cobb-Douglas functions. Econometrica 36, 464-472. [10] Box, M. J. (1971). Bias in nonlinear estimation. J. Roy. Statist. Ser. S,c. B 33, 171-201. [11] Hartley, H. O. (1961). The modified Gauss-Newton method for fitting of nonlinear regression functions by least squares. Technometrics 3, 269-280. [12] Marquardt, D. W. (1963). An algorithm for least squares estimation of nonlinear parameters. S l A M J. Appl. Math. 11, 431-441. [13] Bird, H. A. and Milliken, G. A. (1976). Estimable functions in the nonlinear models. Comm. Statist. A 5, 999-1012. [14] Barham, R. H. and W. Drane (1972). An algorithm for least squares estimation of nonlinear parameters when some of the parameters are linear. Technometrics 14, 757-766. [15] Lawton, H. and Sylvestre, E. A. (1971). Elimination of linear parameters in nonlinear regression. Technometrics 13, 461-467. [16] Draper, N. R. and Smith, H. (1966). Applied Regression Ana~sis. Wiley, New York.

Parameter estimation in nonlinear regression models

6 ~5

[17] Bunke, H. (1976). Simple consistent estimation in nonlinear regression by data transforo mations and design of experiments. Math. Operationsforsch. Statist, 7, 715-719. [18] Bunke, H. (1977). Linear parameter estimation in nonlinear regression models by previous data transformations, Biom. J. 19, 253-256. [19] Petersen, I. (1969). Comparison of the method of reproducing kernels with the method of least squares (in russian). Izv. Akad. Nauk. Est. S S R Fiz. Mat. 18, 403. [20] Wisotzki, C. (1977). Polynomial approximation of nonlinear regression tractions. Math Operationsforseh. Statist. Ser. Statist. 8, 313-321. [21] Jennrich, R. I. (1969). Asymptotic properties of nonlinear least squares estimators. Anm Math. Statist. 40, 633-643. [22] Malinvaud, E. (1970). The consistency of nonlinear regressions. Ann. Math. Statist. 41, 956-969. [23] Barnett, W. A. (1976). Maximum likelihood and iterated Aitken estimation of nonlinear systems of equations. J. Amer. Statist. Assoc. 71, 354-360. [24] Fedorov, V. V. (1977). Estimation of regression parameters in the case of vector valued observations. In: V. V. Nalimov, ed., Regression Experiments (in Russian), Izd. Mosk. Univ., Moscov. [25] Chanda, K. C. (1976). Efficiency and robustness of least squares estimators. Sankhyd Ser. B 38, 153-163. [26] Bunke, H. and Schmidt, W. H. (1980). Asymptotic results on nonlinear approximation of regression functions and weighted least squares. Math. Operationsforsch. Statist. Set. Statist. 11, Heft 1. [27] Zwanzig, S. (1980). Inadequate least squares. Math. Operationfforsch. Statist. Set. Statist. 11, Heft 1. [28] Bahadur, R. R. (1964). On Fisher's bound for asymptotic variances. Ann. Math. Statist. 35, 1545-1552. [29] Bard, J. (1974). Nonlinear Parameter Estimation. Academic Press, New York. [30] Kruskal, W. (1968). When are Gauss-Markov and least squares estimators identical? A coordinate-free approach. Ann. Math. Statist. 39, 70-75. [31] Bunke, O. and Grabowski, B. (1979), Model choice in nonlinear regression. Math. Operationsforseh. Statist. Ser. Statist. 10, to appear. [32] Huber~ P. J. (1964). Robust regression of a location parameter. Ann. Math. Statist. 35, 73-101. [33] Grossmann, W. (1976). Robust nonlinear regression. In: Comstat 1976, Physica Verlag, Wilrzburg, pp. 146-152. [34] Bunke, H. and Bunke, O. (1974). Identifiability and estimability. Math. Operatiom .... forsch. Statist. 5, 223-233. [35] Beale, E. M. L. (1960). Confidence regions in nonlinear estimation. J. Roy. Statist. Soe. Ser. B 22, 41-71. [36] Box, G. E. P. and Coutie, G~ A. (1956). Application of digital computers in the exploration of functional relationship. Proc. I E E E B 103, suppl. 1, 100-107. [37] Hannan, E. J. (1977). Nonlinear time series regression. J. Appl. Prob. 8, 767-780.

P. R. Krishnaiah, ed., Handbook of Statistics; Vol. I North-Holland Publishing Company (1980) 617-622

] 0

At. J

Early History of Multiple Comparison Tests


H. L e o n H a t t e r

The problem of multiple comparisons is that of comparing statistical measures (means, proportions, etc.) of the properties or effects of pairs of levels of a factor (varieties, treatments, locations, etc.). If there are only two levels of the factor, a comparison of the values of the measure of interest for the two levels is quite simple and straightforward, and statistical tests of the significance of the difference between two means, two proportions, two variances, etc. are well known. Suppose, however, that one wishes to test the significance of the difference between two particular levels of a factor which are among several included in an experiment or a set of observations. Then the question arises as to how one should take account of the number of levels and the rank in the group of the two levels singled out for attention with regard to the measure of interest. The relevance of this ranking in testing the significance of the difference between two levels, or between one level and the average for all of the levels, has long been recognized. Cournot (1843) distinguished between the probability that the proportion of male births in one of the 86 departments of France chosen at random will differ by more than 8 from the proportion for France as a whole and the corresponding probability for the department chosen because its proportion deviates most from that of France as a whole. The same distinction, he said, does not apply if the statistician chooses in advance, before seeing the data, a particular department (say, the department of the Seine) because he has reason to believe that it is subject to exceptional conditions which can have a very palpable influence on the chance of male births. But, he inquired, does the same principle apply if one chooses the department of the Corse, or the department of the North, or any one of several others which seem a priori to be subject to exceptional conditions? Evidently, he said, there enters into this estimation an element which is variable and resistant tO mathematical determination. Fisher (1924) derived the distribution of the z statistic needed to perform tests of significance in connection with the analysis of variance, a procedure which he had devised earlier. The analysis of variance and the
617

618

14. Leon Harter

associated z test (or the equivalent F test, where F,--e 2~) provide a definite answer to the question of overall significance of the effect of a factor. Often, however, what the experimenter really wants to know is which levels differ significantly from which others, and that is precisely the problem of multiple comparisons. Fisher (1935) himself proposed one of the earliest multiple comparison tests. If the overall z test (or F test) of the effect of a factor shows significance, he proposed what has come to be known as the "protected" LSD (least significant difference) test; otherwise, he proposed a much more conservative test based on the Bonferroni inequality. In either case, the error mean square is used to estimate the variance. If tile Type I error rate is a for each comparison in the unprotected LSD (multiple t) test, it is less than a, and decreases as n (the n u m b e r of means) increases, in the protected LSD test, since pairs of means are c o m p a r e d only if the overall F test shows that the n means differ significantly. If the F test does not show significance, the error rate per comparison is taken to

be a/(2),where(2)lsthenumberofcomblnatlonsofnthlngstaken2at
a time. N e w m a n (1939), following a suggestion of "Student" (W. S. Gosset) [see Pearson (I939)], developed the first multiple comparison test based on the studentized range q = w/s, where w is the range of a sample of n observations from a normal population with standard deviation o and s 2 is an independent and unbiased estimate of 02 based on f degrees of freedom, obtained from a sum of squares in the usual manner. Suppose that, as a result of an experiment, n treatment means x~, x2,..., ft, are available along with an independent estimate, s 2, of their sampling variance. Student's idea was that one could test whether any treatment differences exist by comparo ing the difference between the highest and lowest treatment means, say w - Yn - :71, with s. If this difference, with due regard to the values of n and f, is found to be clearly significant, then one can set aside the m o r e divergent of the extremes, say 2~, and compare 2n-:72 with s, using n - 1 and f, and so on. This is a limiting case (as f--~oo) of the procedure suggested by Student (1927), based on w/o instead of w/s. In order to facilitate the use of tests based on the studentized range, N e w m a n tabulated one-decimal-place (2DP for n = 2 a n d / o r f = oo) 5% and 1% points of q=w/s for n=2(1)12, 20 and f--5(1)20,24,30,40,60, ce. H e worked out three numerical examples to illustrate the use of the tables. Better values of the percentage points of the studentized range were tabulated a few years later by Pearson and Hartley (1943). Keuls (1952) proposed a slight modification of N e w m a n ' s test. A synthesized version, called the Newm a n - K e u l s test, has enjoyed considerable popularity and is still widely used today. In this test, as in other multiple range tests, the means are arranged in order from smallest to largest and all groups of p consecutive ordered means (p = n,n- 1..... 2), except those which are subgroups of

Early history of multiple comparison testa'

619

groups already found not to differ significantly, are tested by comparison with the critical value of the studentized range of p means with f degrees of freedom. The ten years immediately following the close of World War li saw a great increase ill interest in multiple comparisons, especially in the United States. During that period, noteworthy contributions were made by David B. Duncan, Charles Wo Dunnett, H. O. Hartley, Henry Scbeff6 and John W. Tukey in the univariate case. S. N. Roy and R. C. Bose (1953) contributed important results in the multivariate case, which will not be discussed in detail here. Tukey (1949) proposed a so-called "gap-straggler" test based on gap, extreme deviate, and variance tests. Perhaps because of the rapidity of developments during the next few years, this test never came into widespread use, though Tukey himself still uses it on occasion in conjunc~ tion with other tests. Tukey (1952) explored allowances for various types of error rates and proposed his studentized range test, which, in contrast with the Newman-Keuls test, employs the critical value (at the desired significance level a) of the studentized range of n means in testing the signifi-cance of subgroups of p out of the n means (p = 2 , 3 , . . . , n ) . The unprotected LSD (multiple t) test has a fixed Type ! error rate per comparison, but its Type I error rate per experiment (or experimentwise) increases quite rapidly as the number of means n being compared increases above 2. Tukey's studentized range test and Fisher's test based on the Bonferroni inequality, on the other hand, have respectively a fixed Type I error rate experimentwise (the probability that at least one pair of means will differ significantly if all came from the same population) and a fixed Type I error rate per experiment. Their Type I error rates per comparison decrease quite rapidly as n increases, and this decrease is accompanied by an increase in the corresponding Type II error rate (a decrease in the power of the test) Tukey (1953a), in a m e m o r a n d u m which unfortunately was never published but nevertheless achieved wide circulation, gave an extended account of error rates and multiple comparison tests, including Tukey's X procedure. The latter is a compromise between Tukey's studentized range test and the N e w m a n - K e u l s test, its critical value being midway between those for the other two tests (the critical values of the studentized range of n means and of p means, respectively). Tukey (1953b) gave some related results on allowances and critical values for Tukey's studentized range test. In his doctoral dissertation, Duncan (1947) sought to devise a multiple comparison test for testing the n ( n - 1)/2 differences among n means that would have, as closely as possible, the same properties with respect to each difference as an a-level t test has to the single difference in the simple case n--2. Toward this end, he proposed his ranked difference test, which was never widely used. In subsequent papers, Duncan (1951, 1952, 1955)

620

1-1. Leon ttarter

pursued the same objective, and was led to the use of special protection levels based upon degrees of freedom. Let 72,~ = 1 - a be the protection level for testing the significance of difference between two means, that is, the probability that a significant difference will not be found if the population means are equal. Duncan reasoned that one has p - - 1 degrees of freedom for testing p means, and hence one may make p - 1 independent tests, each with protection level 72,~- Hence the joint protection level is y p . ~ = ( 1 - a)P-l; that is, the probability (when the population means are equal) that one finds no significant differences between sample means in making p - 1 independent tests, each at protection level 72,~, is 3,~,~1o Duncan devised both a multiple F test and a new multiple range test based upon protection level 7p,~ for tests on p means. Although multiple F tests are conceptually better than multiple range tests, their critical values are almost prohibitively difficult to calculate, so they have never been widely used. Duncan's new multiple range test, on the other hand, has been the most widely used of all multiple comparison tests, according to S c i e n c e
Citation I n d e x .

Scheff6 (1953) presented a method for judging all contrasts in the analysis of variance. Suppose the usual F test (with k - 1 and p degrees of freedom) of the hypothesis H : / z 1=/z 2 . . . . . /~ at the a level of significance rejects H. For any c I . . . . , c k with E k c i = O , write 0 for the contrast ~ c i l z i, and write 0 and 62 for the usual estimates of 0 and the variance of 0. Then the estimated contrast is said to be significantly different from zero if [0[ >SSg, where S 2 is ( k - 1) times the upper a point of the F distribution with k - 1 and p degrees of freedom. Dunnett (1955) gave a multiple comparison procedure for comparing several treatments with a control. The numerical results of an experiment performed for that p_urpose can be summarized_ in the form of a set of numbers Xo, X 1. . . . . Xp and s, where the X's are means of p + 1 sets of observations assumed to be independently and normally distributed, )~0 referring to the control and X i to the i-th treatment (i-- 1..... p), and s is an independent estimate of the c o m m o n standard deviation of t h e p + 1 sets of observations. The procedure given by Durmett for making confidence statements about the true (or expected) values of the p differences X i - X o has the property that the probability of all p statements being simultaneously correct is equal to a specified value P (in Tukey's terminology, the experimentwise error rate is 1 - P). The reader will note that most of the foregoing results have been stated in terms of significance, but he can easily change from significance statements to confidence statements (or vice versa) if he so desires. Hartley (1955) gave a sequential F test for multiple comparisons of 2 mean squares. Let s2; s 2 1..... si be independent mean squares, obtained in

Early history of multiple comparison tests

621

an analysis of variance, and due to error and various series of treatments, respectively. If the latter mean squares all have the same number of degrees of freedom, the procedure consists in comparing the ratios s 2 / s 2, taken in descending order of magnitude, with appropriate percentage points, until a non-significant ratio is reached. Hartley derived inequalities for the probability of wrongly returning at least one s~ as significant, and for the power with respect to a single effect. A m o n g the c o m m o n l y used multiple comparison tests proposed up through 1955, Scheffr's test for all contrasts and Fisher's test based on Bonferroni's inequality are very conservative. The user of Scheffr's test pays a penalty of added conservatism for its versatility; this test should never be used if one is interested only in comparing pairs of means. Tukey's studentized range test is somewhat conservative. Progressively less conservative tests are Tukey's X procedure, the N e w m a n - K e u l s test, and Duncan's new multiple range test. Least conservative of all is the unprotected LSD (multiple t) test. Non-conservative tests provide poor control over the error rate per experiment (or experimentwise). Conservative tests, on the other hand, m a y limit the error rate per comparison to unnecessarily low values, and tend to have low power unless the sample size is large. The relation between error rates and sample sizes for range tests in multiple comparisons was studied by Harter (1957). During the course of this study, it was discovered that tables of percentage points of the studentized range c o m p u t e d b y Pearson and Hartley (1943) and by May (1952) were slightly in error and that tables of the special percentage points for protection levels based on degrees of freedom given by D u n c a n (1955) were subject to more serious error. More accurate (and more extensive) tables were computed by Harter, Clemm and Guthrie (1959) and abridged versions of these tables were published by Harter (1960a, b). Harter (1961) gave corrected error rates for Duncan's new multiple range test based on the new tables of critical values. All of these results and m a n y others on the range and studentized range of samples from a normal population are included in a book by Harter (1970).

References
Cournot, A. A. (1843). Exposition de la Theorie des Chances et des Probabilit~s. Librairie de L. Hachette, Paris. Duncan, David B. (1947). Significance tests for differences between ranked variates drawn from normal populations. Ph.D. thesis, Iowa State College, Ames, Iowa. Duncan, D. B. (1951). A significance test for differences between ranked treatments in an analysis of variance. Virginia J. Sci. (N. S.) 2, 172-189. Duncan, D. B. (1952). On the properties of the multiple comparisons test. Virginia J. Sci. (N. S.) 3, 49-67.

622

H. Leon 14arter

Duncan, D. B. (1955). Multiple range and multiple F tests. Biometrics 11, 1-42. Dunnett, C. W. (1955). A multiple comparison procedure for comparing several treatments with a control. J. Amer. Statist. Assoc. 50, 1096-1121. Fisher, R. A. (1924). On a distribution yielding the error functions of several well-known statistics. Proceedings of the International Congress of Mathematicians, Toronto, 805-813. Fisher, R. A. (1935). The Design of Experiments. Oliver and Boyd, Edinburgh-London. Fourth edition, 1947. Harter, H, Leon (1957). Error rates and sample sizes for range tests in multiple comparisons. Biometrics 13, 511-536. Harter, H. Leon (1960a). Critical values for Dunean*s new multiple range test. Biometrics 16, 671-685. Harter, H, Leon (1960b). Tables of range and studentized range. Ann. Math. Statist. 3L 1122-1147. Hatter, H. Leon (1961). Corrected error rates for Duncan's new multiple range test. Biometrics 17, 321-324. Harter, H. Leon (1970). Order Statistics and their Use in Testing and Estimation, Volume 1: Tests Based on Range and Studentized Range of Samples from a Normal Population. U. S, Government Printing Office, Washington, AD-A058262. Harter, H. Leon, Clemm, D. S. and Guthrie, E. H. (1959). 'fhe probability integrals of the range and of the studentized range. Probability integral and percentage points of the studentized range; critical values for Duncan's new multiple range test. WADC TR 58-484, Volume II. Wright Air Development Center, Wright-Patterson Air Force Base, Ohio. AD 237733. Hartley, H. O. (1955). Some recent developments in analysis of variance. Comm. Pure AppL Math. 8, 47-72. Keuls, M. (1952). The use of the "studentized range" in connection with an analysis of variance. Euphytica 1, 112-122. May, J. M. (1952), Extended and corrected tables of the upper percentage points of the "studentized" range. Biometrika 39, 192-193. Newman, D. (1939). The distribution of range in samples from a normal population, expressed in terms of an independent estimate of the standard deviation. Biometrika 31, 20-30. Pearson, E. S. (1939). "Student" as statistician. Biometrika 30, 210-250. Pearson, E. S. and Hartley, H. O. (1943). Tables of the probability integral of the studentized range. Biometrika 33, 89-99. Roy, S. N. and Bose, R. C. (1953). Simultaneous confidence interval estimation. Ann. Math. Statist. 24, 513-536. Scheff+, Henry (1953)o A method for judging all contrasts in the analysis of variance. Biometrika 40, 87-104. "Student" [W. S. Gosset] (1927). Errors of routine analysis. Biometrika 19, 151-164. Tukey, J. W. (1949). Comparing individual means in the analysis of variance. Biometrics 5, 99-114. Tukey, J. W, (1952). Allowances for various types of error rates. Unpublished invited address presented at Blacksburg meeting of Institute of Mathematical Statistics and Biometric Society. Tukey, J. W. (1953a). The problem of multiple comparisons. Unpublished memorandum in private circulation. Tukey, J. W. (1953b). Some selected quick and easy methods of statastical analysis. Tram'. New York Acad. Sci. 11 16, 88-97.

P. R. Krishnaiah,ed., Handbook of Statistics, Iiol. 1 North-Holland PublishingCompany (1980) 623-629

'~1"~

Representations of Simultaneous Pairwise Comparisons


A l l a n R. S a m p s o n

1.

Introduction

Simultaneous statistical inference is of crucial importance when making statistical inferences concerning multiple comparisons. There are numerous statistical methods for simultaneous inference; good references for this material include Miller (1966, 1977), O'Neill and Wetherill (1971), and the chapters in this volume by Harter and by Krishnaiah et al. Generally speaking, for a prescribed set of null hypotheses, a simultaneous inference procedure results in the acceptance or rejection of each hypothesis. The results of these tests are often described in terms of simultaneous confidence intervals. A fairly standard area of application of simultaneous inference is for making comparisons of certain combinations of parameters in the analysis of variance. Within this setting, frequently encountered is the problem of making multiple comparisons among many treatments allowing for the possibility of taking into account other variables (e.g., covariates or blocking factors). The treatments may be, for example, different fertilizers, new industrial processes, experimental human drugs, different teaching methods, potentially toxic substances, or different parole monitoring methods. Usually there is at least one control or standard treatment. Sometimes there are concepts of ordering among treatments (e.g., increasing amounts of fertilizer per acre) or groupings among treatments (e.g., different classes of experimental drugs). While linear combinations of the treatment effects may be of interest, usually pair-wise comparisons among the treatments are of more interest. If there are n treatments, there are n ( n - 1) pair-wise comparisons. The standard simultaneous procedures yield which of these pair-wise treatment differences or comparisons are significant. When there are numerous experiments or analyses under consideration, the interpretation of these 623

624

Allan R. Sampson

many sets of simultaneous results becomes increasingly difficult. In these situations, the presentation and organization of the simultaneous compario son results across the experiments are of vital importance. Considered in Sections 2, 3 and 4 are methods to present and organize large numbers of pair-wise simultaneous results. The methods range from standard ones to newer tabular and graphical methods. The following example describes an experiment where multiple comparisons over time are important; a simple example has been chosen for expository ease. EXAMPLE 1.1. Chun et al. (1977) present the results from a three treat ment, three period complete crossover study used to determine the effects of antacid on the absorption over time of clorazepate dipotassium. Comparisons of blood serum levels among the three treatments denoted by A , B , C were made at 0.5, 1, 2, 3, 4, 6, 8, 12, 24, 48, 72 and 96 hours after dosing, In this case, A is the control, and B and C are two different experimental conditions. Based upon the standard fixed-effects crossover model, simultaneous statistical pair-wise comparisons of treatments A , B and C were done at each time point. These results are summarized in Table 1.1.
Table 1.1 Time (hours) 0.5 1 2 3 4 6 8,12, 24, 48, 72, 96 Significant comparisons A greater than B, A greater than C A greater than B No comparisons significant No comparisons significant A greater than C A greater than C No comparisons significant

2.

Techniques for small and moderate numbers of comparisons

If the number of treatments being compared and the number of experi~ merits being analyzed are quite small, it usually suffices to give the results in paragraph form. For example, suppose that there are three treatments, namely, a control treatment and two experimental treatments, with corresponding treatment parameters t 0, t~ and t 2. If a simultaneous pair-wise comparison procedure rejected both Ho: t o = t~ and K0: to= t 2, but did not reject L0: t I = t 2, these results could be summarized by: "Both experimental treatments are significantly different from the control, but there is no significant difference between the experimental treatments".

Representations of simultaneouspairwise comparisons

625

In the situation where the number of treatments and the number of experiments are moderately small, the underlining of nonsignificant groups of parameters is often used. See Miller (1966, p. 84), or Steel and Torrie (1960, pp. 106-112). To employ this display approach, first rank the treatments from smallest to largest based on the estimated treatment parameters. If there were five treatments, To, T1, T2, T 3 and T4, then the display would initially look like

r/1

7~2 r~

i"~, ~,

where ~, .... , are the ordered estimated treatment parameters. If r/. and T~ were not significantly different, a line would be drawn underneath , and t/. If in addition 7],. was not significantly different from T/, and Ti~, the line would be extended under ti~; otherwise, that line would stop. The procedure would then be repeated starting with Ti~ and a line would be drawn at a slightly lower level. Because this procedure is in limited general scientific use, it is recommended that when reporting data in this manner to include below the display a sentence similar to "If treatment parameter estimates are underlined by a common line, they are not statistically significantly different; otherwise, they are significantly different". EXAMPLE 2.1. Steel and Torrie (1960, pp. 101-109) report the results of a completely randomized design experiment examining the effects upon nitrogen content (mg) in red clover plants when inoculated with six different red clover--alfalfa strain mixtures which are denoted by T~..... T6. The mean nitrogen contents and results of a comparison of treatments can be summarized as follows:

r3
13.3 14.6

r6
18.7

~
19.9

~
24.0

Tl
28.8

Thus, for instance, treatments T2 and Ta are not significantly different, but T~ and T4 are.

3, Tabular displays for large numbers of comparisons


For the case of multiple experiments, interes~t is often focused on detecting changes over the course of the experiments. To aid in evaluating results over experiments tabular displays with proper organization of the

626

Allan R. Sampson

columns and rows can be useful. The basic concept is to display all possible treatment comparisons as columns and order the experiments along the rows according to some natural ordering, e.g., time or distance. In each row-column combination, record whether or not the treatment comparison for that experiment is significant. Other information can also be included. The proper organization of the columns is important and depends on the experimental setting. Given below for different experimental situations are recommended column groupings. Semicolons are used to separate different column groupings. For ease of illustration, it is assumed that there are four treatments, T 0, T 1, T 2, and 73. (a) Treatment versus control. The treatment To is the control and Tl, T2, and T 3 are experimental. The suggested column headings are: T Ovs T1, r 0 VS 7"2,T o vs T3; T l vs T2, T l vs T 3, T 2 vs T 3. (b) Dose response. The treatments To, Tl, T2, T 3 are ordered by dose. "File suggested column headings are: T Ovs T 1, T 1 vs T2, T 2 vs T3; T Ovs T 2, T 1 vs /'3; T0VS T3. (C) Between groups and within groups. The treatments T o and T 1 are in one group and T 2 and T 3 are in another group. The suggested column headings are, To vs T2, T O vs T3, T 1 vs T2, T 1 vs /'3; T o vs T1, T 2 vs T 3. Generally the ordering of the rows is fairly automatic. If the experiments are over time, then time should obviously be used to order the rows. If tile experiments are at different locations, a geographical ordering (e.g., by distance from a central point or by region) would be appropriate. Each entry in the table would then provide the results of the particular column treatment comparison for the given row experiment. There are a number of different formats for the entries. Some suggested formats are: (i) record S to indicate that the treatment comparison for that experiment was significant, and leave a blank, otherwise (if there are a preponderance of significant results, record then only the NS's), (ii) for treatment comparison Ti vs Ty, record " > " if the comparison is significant and the estimated^parameters satisfy q > ~ and " < " if the comparison is significant and ti <tj; otherwise, leave the entry blank, (iii) for treatment comparison T~. vs ~ , follow (ii) and for data summarization record directly below " > ", etc., the pair of values (~, ~), where the values are not numerically ordered but are presented in the given column treatment ordering. There is a possible embellishment to the tabular display suggested by Wieand (1978). Within a column, record at each row the proportion of significant results up to that row relative to the cumulative n u m b e r of experiments (rows). An alternative to this would be to compute the

Representations of Mmultaneouspairwise comparisons

627

proportion of significant results to the remaining n u m b e r of experiments. The purpose of these proportions is to try to indicate at which experiments the column comparison becomes significant (or vice-versa). EXAMPLE 3.1. Example 1.1. Table 3.1 gives a tabular representation for the results of

Table 3.1 Hour 0.5


1

2 3 4 6 8, 12,24,48,72,96
4. C h o r d s on a c i r c l e

A vs B > > NS NS NS NS NS

A vs C > NS NS NS > > NS

B vs C NS NS NS NS NS NS NS

When faced with interpreting simultaneous comparsions from multiple experiments, often the pattern over experiments is more important than the specific individual comparisons. For multiple experiments, the underlining method noted in Section 2, is less than adequate for pattern recognition. Specifically, the ordering of the treatments usually changes over experiments due to changes in the ordering of the estimated treatment parameters. And as such, little pattern recognition is possible by scanning under. ling patterns over experiments. The tabular methods of Section 3 allow for some scanning over experiments, in that treatment comparison positions, i.e., columns, are standardized. This problem of pattern recognition is shared with other areas of statistics, in particular, multidimensional representations. In these other areas graphical methods have been found to be quite illuminating. For example, Chernoff (1973) uses computer drawn representations of faces to depict points in a k-dimensional space. The representation of simultaneous comparisons as chords on a circle is a graphical method that is amenable to pattern recognition. In this approach, the results from each experiment are depicted on a circle. The p treatments are represented by p points spaced equidistantly on the circumference of the circle. The correspondence of treatments to points is the same for each experiment, e.g., T o m a y be at the "12 o'clock" point and T 1..... Tp_ 1 correspond in a clockwise fashion to the remaining points on the circle. (As in any graphical representation method, some care should

628

Allan R. Sampson

be exercised to preclude labeling points in a manner that psychologically produces tile desired patterns.) Chords are then drawn between the points to indicate results of the comparisons. Possible techniques are: 0) Chords are drawn if the comparisons are significant (or if there is a preponderance of significant results, chords correspond to nonsignificant comparisons), (ii) if the comparisons are significant, chords are drawn with arrows indicating the direction of the difference of ~ with respect to ~, (iii) in addition to (ii), for purposes of data summarization, the values, , are recorded on the circumference of the circle. The advantage of this graphical method is that certain important experimental patterns have a corresponding simple graphical pattern. Thus, if these graphical patterns were observed for a set of data, this would indicate certain underlying experimental results that should be investigated further. Some possible graphical patterns with corresponding experimental conditions are illustrated in (a), (b) and (c) that follow. (a) The treatment TO is a control and T l, T2, T 3 are, respectively, low, middle and high levels of the experimental treatment. The experimental condition of a control versus treatment difference, but no level effect corresponds graphically to Fig. 4.1. Y

T3~ T2

T1

Chords indicate significantcomparisons Fig. 4.1. (I0) The treatment 70 is a control and T 1, Tz, 73 are, respectively low, middle and high levels of the experimental treatment. The experiment is repeated over time. The experimental pattern over time consists of T 3

To T @ T2 Early

T O T1 T 3 O T 1 7 2 Middle
Fig. 4.2.

TO T3CT1 T~ Late

Chords indicate nonsignificant comparisons

Representations of simultaneouspairwise comparisons

629

b e i n g significantly d i f f e r e n t f r o m the others e a r l y in time; 23, 7~ being significantly d i f f e r e n t f r o m t h e others in the m i d d l e ; a n d T 3, T 2, T 1 being significantly d i f f e r e n t f r o m c o n t r o l late in the s e q u e n c e of experiments. T h e c o r r e s p o n d i n g g r a p h i c a l p a t t e r n s over time a r e given in Fig. 4.2. (c) T h e t r e a t m e n t s T o a n d T 1 a r e in one g r o u p a n d the t r e a t m e n t s T~ a n d T 3 a r e in a n o t h e r group. T h e e x p e r i m e n t a l c o n d i t i o n is t h a t there is a significant difference b e t w e e n g r o u p s b u t n o n e w i t h i n g r o u p s ; the graphio cal r e p r e s e n t a t i o n is Fig. 4.3.

Chords indicate significant comparisons Fig. 4.3. EXAMPLE 4.1. pie 1.1.
A

Fig. 4.4 gives c h o r d s o n the circle r e p r e s e n t a t i o n for E x a m -

Hour 0.5

Hour I

Hour 2

Hour 3

Hour 4

Hour 6

Chords indicate significant comparisons Fig. 4.4.

References
Chernoff, H. (1973). The use of faces to represent points in k-dimensional space graphically. d. Am. Statist. Assoc. 68, 361-368. Chun, A. H. C., Carrigan, P. J., Hoffman, Do J., Kershner, R. P. and Stuart, J. D. (1977). Effects of antacids on absorption of elorazepate. Clinical Pharmacology and Therapeutics 22, 329-335. Miller, R. G. (1966). Simultaneous Statistical Inference. McGraw-Hill, New York. Miller, R. G. (1977). Developments in multiple comparisons 1966-1976. J. Am. Statist. Assoc. 72, 779-788. O'Neill, R. T. and Wetherill, B. G. (1971). The present state of multiple comparison methods. J. Roy. Statist. Soc. Ser. B 33, 218-241. Steel, R. G. D. and Torrie, J. H. (1960). Principles and Procedures of Statistics. McGraw-Hill, New York. Wieand, S. (1978). Private commtmication to the author.

P. R. Krishnaiah, ed., Handbook of Statistics, VoL 1 North-Holland Publishing Company (1980) 631-671

z., l

e) ]

Simultaneous Test Procedures for Mean Vectors and Covariance Matrices


P. R. Krishnaiah*, G. S. Mudholkar** and P. Subbaiah

1.

Introduction

Simultaneous test procedures play an important role in the area of statistical inference from the data since in many situations the experimenter is interested in testing several hypotheses simultaneously. For example, the experimenter m a y be interested in testing simultaneously the hypotheses that certain contrasts on means or variances of the normal populations are zero. For reviews of some early developments in the area of multiple comparison tests, the reader is referred to Miller (1966), Scheff6 (1959), and the chapter by Harter in this volume. For reviews of the literature on some later developments, the reader is referred to Krishnaiah (1969, 1979). Various nonparametric simultaneous test procedures are reviewed in the chapter by Sen in this volume. In this chapter, we discuss some simultaneous test procedures which are useful in multiple compario sons of mean vectors and covariance matrices. Throughout this chapter, we assume that the underlying distribution is univariate or multivariate normal. This chapter is written with the users of the multiple comparison tests in mind as potential readers. In Section 2, we discuss the multiple comparison tests for means proposed by Scheffr, Tukey, Dunnett and Krishnaiah. The tests proposed by Krishnaiah are known as the finite intersection tests (FIT) and the test of Dunnett, and Tukey are special cases of these tests. In Section 3, we discuss S. N. Roy's largest root test and T2~x test for multiple comparisons of mean vectors whereas Section 4 is devoted to Krishnaiah's finite intersection tests and J. Roy's step-down procedure. In Section 5, we discuss
*The work of this author is sponsored by the Air Force Fright Dynamics Laboratory, Air Force Systems Command under Grant AFOSR 77-3239. **The work of this author is sponsored by the Air Force Office of Scientific Research, Air Force Systems Command under Grant AFOSR 77-3360. Reproduction in whole or in part of this paper is permitted for any purpose of the United States Government. 631

632

P, R, Krishnaiah, G. S. Mudholkar and P. Subbaiah

multiple comparisons of mean vectors associated with the tests based upon traces of certain matrices. The material in this section is based upon the results of Mudholkar, Davidson and Subbaiah (1974b). Computer programs useful in the implementation of the tests discussed above for multiple comparisons of mean vectors are given in Section 6. Using the data collected on mentally retarded children at the Eastern Pennsylvania Psychiatric Institute, the procedures discussed in Sections 3 - 5 are illustrated in Section 7. In Section 8, we discuss simultaneous tests for the equality of the variances against different alternatives. Section 9 is devoted to a discussion of a procedure for testing the hypothesis that the covario ance matrices of various populations are equal to a specified matrix. This procedure is along the lines of Roy's test for testing the hypothesis that the covariance matrix of a single population is equal to a specified matrix. In Section 10, we discuss Roy's tests for the equality of two covariance matrices and certain extensions of these tests for several sample cases Also, we discuss the tests proposed by Krishnaiah (1968) for testing the equality of the covariance matrices; these tests are based upon conditional distributions.

2.

Multiple comparisons of means

In this section, we review certain tests for multiple comparisons of means. We will first define multivariate t, multivariate chi-square and multivariate F distributions since they are needed in the sequel. Let Y'---(Yl ..... yp) be distributed as a multivariate normal with mean vector /*'=(/~l ..... /5,) and covariance matrix o2f~ where ~2=(pu) is the correlation matrix. Also, let zi=y~2/o 2, ti=yiV~n/s and F,.=t 2 for i = 1,2 ..... p, where s2/o 2 is distributed independent of y as chi-square with n degrees of freedom and E(s 2) --- no 2. Then, the joint distribution of z~. . . . . zp is a central (noncentral) multivariate chi-square distribution with one degree of freedom and with ~ as the correlation matrix of tile accompany ing multivariate normal when t.t=0 ~4=0). Also, the joint distribution of t~.... ,tp is a central (noncentral) multivariate t distribution (see Dunnett and Sobel (1954)) with (1,n) degrees of freedom and with ~2 as the correlation matrix of the accompanying multivariate normal when # = 0 (#4:0). Also, the joint distribution of F l ..... F? is known (see Krishnaiah (1965)) to be a central (noncentral) multivariate F distribution with (1,n) degrees of freedom and with f~ as the correlation matrix of the accompanying multivariate normal when # = 0 (#=~0). Various details of these distributions are discussed in another chapter by Krishnaiah.

Test procedures for mean vectors amt covariance matrices

633

We will now discuss Tukey's test, Dunnett's test, Scheff6's test and Krishnaiah's finite intersection tests for multiple comparisons of means. For i = 1,2 ..... k, let Xil,...,xi,,, be distributed independently and normally with unknown mean/~i and unknown variance o 2. Also, let
~. "-= CilXl" "1" . . . -~ CikJ(k. ,

ni
nixi.-~- E yij, j=l n = nl + o.. + n k , p ~ rl-- k~

H,.: ki=0, A*: Xi>0,

Ai: X;v~0, A**: ~ki.< 0 ,

Ho: /~i- ~tj=0,

Au: t ~ - & - ~ O .

Tukey (1953) considered the problem of testing the hypotheses H u simultaneously against A u when the sample sizes are equal to m. If we use this procedure, we accept or reject H u against A u according as

Ito.INt~l,
where

(2.1)

P[ltul<t,1; i < j = l ..... k l H ] = ( l - a ) ,


k ni

(2.2)

.2= E E
i=tj=l

(2.3) and H: /z1. . . . =/zg. The distribution of 2 maxi<j(F0) is the same as the distribution of the square of the studentized range statistic, where F u = t2. Values of "v/2 t~l for different values of a, k and 1, were tabulated by Harter (1960). Tukey (1953) also proposed a procedure for testing the hypotheses H u (i < j = 1..... k) and /~i = 0 (i = 1,2 ..... k) simultaneously against A u (i <j) and / ~ 0 ( i = 1,2 ..... k) when the sample sizes are equal to m. If this procedure is used, we accept or reject H o. according as V~ Itul X t~2. Also, the hypothesis/~i--0 is accepted or rejected according as (2.4)

It,*lXt 2,

(2.5)

634

P. R. Krishnaiah, G. S. Mudholkar and P. Subbaiah

where t* =

2i.v~mp/s 2. T h e
P [ Vr~

critical value t~2 is chosen such that 1. . . . . k; It* l < 1~2, i =

It,jl <t~2,i<j=

l,...,klH ] = ( 1 - - ~ ) .
(2.6)

Now, let R o = max( U, R), where R = 2 max/</I to I) a n d U = max,( It* I)- T h e distribution of R 0 is k n o w n to be the studentized a u g m e n t e d range distribution. Percentage points of the distribution of the statistic R 0 were given in Stoline (1978). Starting from (2.2), the following confidence intervals were o b t a i n e d (see Scheff6 (1959)) on )~g when )~g's are contrasts:

i=l

(2.7)

<)tg+(t~s/~

X
i=l

Ic~,l

T h e hypothesis Hg is accepted or rejected according as the confidence interval on Xg covers or does not cover zero. T h e a b o v e procedure is k n o w n in the literature as the extended T u k e y ' s test. Next, consider the p r o b l e m of testing the hypotheses Hi,, simultaneously against A~k when n I = ... = n k = m. In this case, D u n n e t t (1955) p r o p o s e d the following procedure. W e accept or reject H~k according as
!tikl ~ t~3,

where t.3 is chosen such that

P[It, kl ~<ta3; i =

1. . . . . k - - I [ H ]

= ( 1 - 00.

(2.8)

T h e joint distribution of i l k . . . . . lk-l.k, when H is true, is the multivariate t distribution with (1, v) degrees of f r e e d o m a n d with f~ = (0u) as the correlation matrix of the a c c o m p a n y i n g multivariate n o r m a l where Pig= 1 a n d p = ! 2 (i4=J). St -ring from eq. (2.8), Shaffer (1977) o b t a i n e d the following confidence J when cg,' s are subject to the constraints E ~ = lc~, = 0:
k-!

'.g- st~3(2/mv) '/2 ~,


i=1

I%1 <X~

<X~+ st~3(2/mv) 1/2 ~, I%1i=1

k-1

(2.9)

The hypothesis Hg is a c c e p t e d or rejected according as the confidence interval on )~ covers or does not cover zero. T h e a b o v e p r o c e d u r e is k n o w n (see Shaffer (1977)) as extended D u n n e t t ' s test.

Test procedures for mean vectors and covariance matrices

635

H l .... ,Hq simultaneously against the alternatives A 1. . . . .

We now discuss alternative procedures for testing the hypotheses Aq when ?tg's are contrasts. According to the classical overall F test, we accept or reject H according as

Fo~G,
where P [ F o < (k - 1)F~,IH ] = (1 - ~), (2.10)

nY~..= E E xo,

k F 0 = E ni(xi.- ff..)2v/s2. i=1

When H is true, Fo/(k - 1) is distributed as the central F distribution with ( k - 1 , v ) degrees of freedom. Scheff6 (1953) gave the following ( l - a ) % simultaneous confidence intervals associated with the above procedure:

a"~-- ( a' a ( k - l)]~;s2/p}l/2 <a'v<a'~-t - {a' a ( k - 1)r~s2/p} 1/2


(2.11) for all non-null vectors a ' = ( a I..... ak) where

(V

.....
(Xk---X--)},

"y'= { ' ~ H 1 (.~1.--.~..) . . . . . ~ k i=1 i j

Here, we note that H = C/,v~oH. where H . : a'3,= 0. Any subhypothesis c'3,=0 (for given c) is accepted or rejected according as the confidence interval on c'7 given by (2.11) covers or does not cover zero. This is equivalent to acceptance or rejection of the hypotheses c'3,= 0 according as

(c"y)2v X(k- 1)F~.


C'CS 2

(2.12)
be on to we

The above method of testing the hypotheses c ' 7 = 0 is known to Scheff6's test. Here we note that a'~,= b 1/-t1+ . . . d-bkl~k is a contrast /~l. .. . /~k since X ki=lbi =0" Also, if we choose a' to be equal k - 0 in (2.11), (%l/V~n~ ..... Cgk/V~nng ) subject to the restrictions ~i=lCgiobtain the confidence interval on ?~g when Xg is a contrast.

636

P. R. Krishnaiah,G. S. Mudholkarand P. Subbaiah

We now discuss the implications of the overall F test for testing subhypotheses of the form H(i) where H(i): /xI = . . . = # i =/~. The subhypothesis H(i) can be expressed as H ( i ) = ('~ Ha,

Ha=d'y(i)=O,

d' is of order l i , y ' ( i ) = { V ~ n l (/~l-/~) ..... V~ni (/~i-/~)} and k/~=N~= l /zi. Confidence intervals on d ' ~ 0 can be obtained (for all non-null d) from (2.11) by choosing a' to be equal to (d',0'). The hypothesis H(i) is accepted if the confidence intervals on d'3'(0 for all non-null d, cover zero and it is rejected otherwise. This is equivalent to acceptance or rejection of H(i) according as F(i) N (k - 1)F, where
i

F(i) = 1, E nj(Yj.- ~..)2/s2.


j~l

(2.13)

The above implication of the overall F-test in connection with testing subhypotheses of the form H(i) is obvious and has been known to m a n y workers in the area for a long time. We now review Krishnaiah's finite intersection tests for multiple comparisons of means. If we use the finite intersection tests, the hypothesis Hg when tested against A s is accepted or rejected according as

Fg<>F,,,
where
2 1/2 k ~,,

(2.14)

r e - t ;,

d=

/n;

j=l

and F~l is chosen 1 such that

P[ Fg<<'F"I;g= I'2 ..... ql A Hi]

(2.15)

The joint distribution of F 1..... Fq is the central (noncentral) q-variate F distribution with (1, v) degrees of freedom and with ~2= (Osh) as the correlation matrix of the accompanying multivariate normal, when fq qi=1Hi is true (false), where k

Ogh=(1/dgdh) l/z E CgrChr/nr,


r=l

(2.16)

1The critical values for testing H l.... ,Hq may be chosen to be unequal. For practical
purposes, they are chosen to be equal.

Test procedures for mean vectors' and covariance matrices

637

and 4 -- ~,= k lci,/nt 2 The simultaneous confidence intervals associated with the above test are given by

for g = 1..... q. When hg's are contrasts, the lengths of the confidence intervals on )~g associated with the finite intersection tests are shorter than the lengths of the corresponding confidence intervals (2.11) associated with Scheff~'s method. Next, let H*: /x~= ... =/*k = 0. If we apply the overall F-test, we accept or reject H* according as

F~ > kF~,
where/~* is chosen such that

P[ F~) <kF*]H*J=(1-a)
and

(2.18)

(2.19) When H* is true F ~ / k is distributed as the central F distribution, with (k, v) degrees of freedom. The simultaneous confidence intervals associated with the above procedure are known (see Scheff6 (1959) to be

a~9*-~a'akF*~sZ/v < a ' 7 * <<.a'5,*+~/a'akF%sz/v

(2.20)

f o r all n o n - n u l l a , "y*' = ( V T n l / . L 1. . . . . V ~ k - ~ k ) a n d "~*'= (-VFn[ Yl ...... ~ Yk.)- If H* is rejected, we accept or reject H*(i) according as F*(i)% kF* where

F*(i) = (nlY 2.+... + ni~2 )v/s 2

(2.21)

and H*(i): /~ = ... =t4.=0. Similarly, the hypothesis e'y* (for given c) is accepted or rejected according as

- -

CtCS 2

~ kF*~.

(2.22)

Here, we note that the finite intersection tests of Krishnaiah yield shorter

638

P. R. Krishnaiah, G. S. l~ludholkar and P. Subbaiah

confidence intervals on ?tg than Scheff6's method when Xg's are arbitrary linear combinations of /~,...,/xk. The lengths of the confidence intervals associated with the finite intersection tests become shorter as the number of hypotheses (that is, q) becomes smaller. When the sample sizes are equal, the finite intersection test of Krishnaiah is equivalent to Tukey's test if one is interested in testing simultaneously only the hypotheses p~.=~ ( i < j = 1..... k) and it is equivalent to Dunnett's test if he is interested in testing simultaneously only the hypotheses bti=lx k ( i = 1. . . . . k - 1). If we use the finite intersection tests to test H~ . . . . . Itq simultaneously against A ]~ ..... Aq, we accept or reject H i according as
t i ~ lal ,

(2.23)

where

Similarly, when we test H i ( i = 1..... q) against A** simultaneously, we accept or reject H i according as
ti ~ t,2,

(2.25)

where P[ t i ) t , 2 ; i = l ..... ql(~i=l Hi] ==(1 - O/)" (2.26)

The joint distribution of t l , . . . , t q is the central (noncentral) multivariate t distribution with (1, u) degrees of freedom and with ~--(O0) as the correlation matrix of the accompanying multivariate normal when H i is true (not true) and Pu is given by (2.16). For detailed comparisons of the finite intersection tests of Krishnaiah with some alternative procedures for multiple comparisons of means the reader is referred to Cox, et al. (1980).

3. Roy's largest root test and T~a x test for multiple comparisons of mean vectors

For i = 1..... k, let xil ..... xin' be distributed independently as p-variate normal with unknown mean vector ~[L i and unknown covariance matrix N=(o0. ). Then, it is of interest to test H 1. . . . . Hq and H simultaneously

Test proceduresfor mean vectors arwlcovariance matrices

639

where

Hi:c~tz=O,

H:Ot = .-.~k~

/~' = (/~1..... /~k)

and ci~s are known subject to the restrictions c / l = 0 . According to RoSs largest root test, we accept or reject H according as cL(GE - 1) X c*, (3.1)

where CL(A) denotes tile largest characteristic root of A, c* = ( k - - l ) c J ( n k) and


-

P[(n-k)cL(GE-1)<(k-1)c,~=(k - 1)c,~]H]=(l--a)o
In the above equation,
k

(3.2)

G = E n,(~,.- -~..)(-~i.- Y..)',


i=1 k
ni

(3.3) (3.4)

E= ~ ~ (xo.-~.)(x ~- ~.)',
i=lj=l ni
k ni n = F l l ' ~ "'" "~nk ni~"= E gij' j=l HX..= E E Xij' i=lj=l

When H is true, G is distributed as the central Wishart matrix with ( k - 1) degrees of freedom and E ( G ) = ( k - 1 ) Z where ~ denotes the expected value. Also, E is distributed independently of G as a central Wishart matrix with ( n - k) degrees of freedom and E ( E ) = ( n - k)E. Values of c~ for different values of a, k, n and p are given in another chapter. The 100(1- a)% simultaneous confidence intervals associated with the above procedure are given by

a ' r b - Vc*a'Ea <~aTb <a'I'b-k- Vc*a'Ea

(3.5)

for all non-null a and b where b's are subject to the restriction b'b = 1. In (3.5), F=(V~nl (/~,--/~) ..... ~
]~=(~ ('~1. -- X--). . . . . ~

(/~-~)) (-rk- -r..))

(3.6) (3.7)

where k/~=t~l+-.-+/~k. The confidence intervals (3.5) were derived by Roy and Bose (1953). If we choose b' to be equal to

640

P. R. Krishnaiah, G. S. M u d h o l k a r a n d P. S u b b a i a h

(% , / ~

.... , cgk / ~

),

then the confidence intervals (3.5) become (3.8)

.--7--a ,~ )kg-V~*~a Eadg <a')ks <a'g + V~:a'Ea~---dg


fo~ ali non-null .. But H~= n.~oH.o where H.. : . % = 0 .

So, H. is

accepted if (3.8) covers zero for all non-null a and it is rejected otherwise. This is equivalent to accepting or rejecting Hg according as

TgZ%(k- 1)c~
where
]kg ~--- Cg I ~1 "~ " " "~ C g k ~ k ,

(3.9)

Xg = Cgl,,~l. "~ . . .
k

" C g k X k . ,

rS={x;(e/,-~) %/d~}. as= E %/n,.


i~l

Similarly, the confidence intervals (3.5) imply the following procedure for testing H(i) where H(i):#I = . . . =~i =ft. We accept or reject H(i) according as CL(G(i)E where
i

- 1) >~c,~ ~< *,

(3.10)

G(i)= ~]
t=l

n,(Kt.-..)(t-..)'.

(3.11)

We now discuss TmZaxtest to test Hl,...,Hq simultaneously. When TmZ,xtest is used we accept or reject Hg ( g = 1..... q) according as T~X T~ where P[ T~x < T~2IH] = ( 1 - a), Tm2ax = max(T~ ..... Tq2). (3.12)

The 1 0 0 ( I - a ) % simultaneous confidence intervals associated with the above procedure are given by

a'X~--{ (dJn--k)T2,a'Ea} l/2<a'2~ <


(3.13)

,'X~+ ( % / , -

~)r~,'e,) '/2 .

for all non-null a. The Tm2axtest was formulated by Roy and Bose (1953) for pairwise comparisons of mean vectors. For a discussion in the general case, the reader is referred to Krishnaiah (1969). A discussion of computing approximate values of T~ is given in another chapter of this volume.

Test procedures for mean vectors and covariance matrices


.

641

Step-down procedure and finite intersection tests The model in Section 3 can be written as

E(X)=A,
where
Xt=(X|I'"'XInI'X21 . . . . . Xkl . . . . . "]Cknk),

(4.1)

(~' ~-"(J[~1..... ~[lk)

a n d the elements of the design matrix are equal to 1 or 0. Also, A ' A = diag(n 1.... ,n~). Now, let

X = ( X I ..... Xp),
and

[ ~ - ( 0 1 .... ,Op),

X j = ( x i ..... xj),

Oj = (0, . . . . . Oj)

f o r j = 1. . . . . p,

where x i and 0 i respectively denote the ith columns of X a n d O. Also, let where Xj denotes the top j j left-hand corner of E. W h e n x~ ..... xj are held fixed, the conditional distribution of the elements of ;1~+1 are n o r m a l with variance of+ l and means given by

@,=l~j+ll/IXjr

E ( x j . + l l X j ) = ( A Xj)[

.[~j+l)
l~j

f o r j = O , 1. . . . . p - 1

(4.2)

with the understanding that n o variable is held fixed when j = O . In Eq. (4.2),
8o=0, .....

The hypothesis H : g 1 = ... = #k can be expressed as

t4:Bo=o,
where
1 0 ... 0 -1

(4.3)

0 B= 0

-1

...

-1

642

P. R. Krishnaiah, G. S. Mudholkar and P. Subbaiah

But H = f)f= lHoi where Hoi:B, b = 0 and so H can be tested Hm,...,Hop simultaneously or in a sequential m a n n e r under

by testing the model (4.2). According to the step-down procedure proposed by J. R o y (1958), we accept H if ~ <f) for j = 1. . . . . p (4.4)

and reject it otherwise, where P [ Fj < f f i j = I ..... p and (n - k - j + F)--l)(B~j)' Cj-I(B~j) ( k - 1)sf

lH] = (1 -

a),

(4.5)

(4.6)

In (4.6), ~) denotes the least square estimate of ~), sf denotes the error sum of squares and C2o22 denotes covariance matrix of B~ij under the model (4.2). Here we note that

'b+,]

A'

-'[A' t
^ t

(4.7)

s2+,=(xj+,- A~lj+,- Xj[~j) (xj+,-A~lj+l- Xj~j)


for j = 1..... p - 1, and
~ I - ~ ( A ' A ) - I A ' xl, s2=(x1- A~II)'(x1-A~I

(4.8)

).

It is known that F l ..... Fp are distributed independently, and f j ( j = 1. . . . . p) is distributed as central F with ( k - l , n - k - j + l ) degrees of f r e e d o m when H0j is true. The 1 0 0 ( 1 - a ) % simultaneous confidence intervals associated with the above procedure are given by

a'BOy

{ ( k - 1)ffifia'Cja/(n - k - j +

1)),/2 < a ' B ~ j <.


(4.10)

atB~j -{- { (k - 1)fjs)a' Cja/( n - k - j -1- 1)} 1/2


for all non-null a. Let c~=(% 1. . . . . %k) where e ~ l = 0 as in Section 3. If we choose a in (4.10) such that c~=a'B, we obtain the following confidence

Test procedures for mean vectors and covariance matrices

643

intervals:

eg~j-{(k-l)Jj-sfd~/(n-k-j+l)}
t ^ .

1/2

<cg~qj
t

(4.11)

< c;,b +
t ^ __ 2

1)) '/:
t,

where var(%~j)-deuo). So, the subhypothesis % ~ j = 0 is accepted or rejected according as (4.11) covers or does not cover zero. This is equivalent to acceptance or rejection of c ~ j = 0 according as

Fgd%(k- 1)fj
where

(4.12)

F~=(n-k-j+

,~ 2 g2 1)%n)/d

(4.13)

We will now discuss Krishnaiah's finite intersection tests for multiple comparisons of mean vectors. Let H ~ : c g ~ j = 0 for g = 1..... q, and let H= ffl qg=lHg where Hg is as defined in Section 3. Since H g = ("l J= 1H~, the hypothesis Hg, for any given g, can be tested by testing Hg~..... Ha, simultaneously in a sequential way. According to the finite intersection tests of Krishnaiah, we accept Hg ( g = 1..... q) if

F~<hj

f o r j = l , 2 ..... p

(4.14)

and it is rejected otherwise where

P[Fgj < kg;g= 1..... q,j= 1,...,p IH] = ( 1 - a).


But, the left-side of (4.15) is equal to II;= 1Pj, where

(4.15)

Pj=P[ f ~ <<.hj;g= l ..... qlH].

(4.16)

When H is true, the joint distribution of Fl9 ..... Fq/ is the multivariate F distribution with ( 1 , n - k - j + 1) degrees of freedom and with ~j = (Ptud) as the correlation matrix of the accompanying multivariate normal; here Ptuj is the correlation between ( ~ j / V d t j ) and ~.{C'u~i/Vdui . - ). For practical purposes, we can choose Pfs to be equal to ( l - a ) l/p. The 100 ( 1 - c 0 % simultaneous confidence intervals associated with the finite intersection tests are given by

eJlj-- ( hjs~dgi/(n- k - j + 1) ) '/2<c;~lj <<.e'g~j+ { hfifd~/(n- k - j + 1) )l/2.

(4.17)

644

P. R. Krishnaiah, G. S. Mudholkar and P. Subbaiah

The lengths of the simultaneous confidence intervals associated with the finite intersection tests are shorter than the lengths of the corresponding confidence intervals associated with the step-down procedure. The simultaneous confidence intervals on e ~ j discussed above are useful since it is of interest to compare the means of various populations on certain variables after eliminating the effect of some variableS. Sometimes, it is of interest to compare the unconditional means of various populations on any given variable. Mudholkar and Subbaiah (1975, 1979) obtained simultaneous confidence intervals on linear functions of the elements of B@ by starting with the step-down procedure and finite intersection tests. The simultaneous confidence intervals associated with Roy's step-down procedure are
P

a'BOd~a'B(3d+-(a'B(A'A)-~B'a)

~/2
j=l

Ihjlc)*~/2

(4.18)

for all non-null a where


h ' = ( h l..... h p ) = a ' L , LL'=E, ci=tf./(n-r-i+l),

f being the upper 0 i percentage point of F distribution with degrees of freedom (t, n - r - i + 1), and
c ~ = c l, c*=c i 1+ ~
j=l

i=2,...,p.

The bounds associated with Krishnaiah's finite intersection tests are

i=l

E Ih,lo

(4.19)

where
h' = (h 1. . . . . hp) = d ' L , L L ' = E, c i = f l / ( n - r - i + 1),

f/ being as given in (3.14), c ~ = c l , c f = c i ( l +Y,)211@*). The bounds on the means of the original variables given above were derived by using certain inequalities and so they are not exact, whereas we can obtain the corresponding exact confidence intervals associated with the Roy's largest root test. Also, the exact percentage points of Tm2axtest are not available. So, it is quite difficult to make comparisons of the finite intersection tests and step-down procedure with the largest root test and TZ,x test on the basis of the lengths of the confidence intervals on the parametric functions bjOd.

T estprocedures f o r m e a n vectors and covariance matrices

645

50

Tests based on traces

The responsewise infinite decomposition of M A N O V A hypothesis H ' A = B @ = 0 of (4.3) may be represented as {~/: A = 0 } = A
a~a

A
d~R p

{ H ( a , d ) : a'Z~a=0),

(5.1)

where a is a finite set in case of tests such as Tm2axand infinite in case of the largest root test. Now a'Ad=tr(da'A)=tr(MA), M" being the rank l matrix codimensional with A. This suggests a more general decomposition of H0, namely

1t=

(~
M ~ 6Yrc

(H(M):tr(MA)=O},

(5.2)

where 63fFCis some set of p t matrices. In particular, when 62)1Lis the set of all p t matrices the decomposition is termed the matrix decomposition of MANOVA. Mudholkar, Davidson, and Subbaiah (1974b) use this decomposition to establish the union-intersection nature of Hotelling's trace and give the simultaneous confidence bounds associated with it. They also show that this decomposition combined with different choices of the test statistics for the components Ho(M ) yield a class of invariant MANOVA tests containing both the largest root and Hotelling's trace. Specifically they show that if the test statistic

t(M) =

tr(M/~)

(5.3)

trl/Z( MWM'E)
is used for Ho(M ), then the resulting union-intersection criterion is tr(GE-1). This yields the following simultaneous confidence intervals for tr(Mdx), tr(MA) ~ tr(M~) _ + _ k trl/Z(MWM'E), (5.4)

for all p t matrices.M, k being the critical constant obtained from the null distribution of t r ( G E - 1). Notice that (tr(MA)) is a much richer class of parametric functions than the subclass of bilinear functions. It is related to a class of hypotheses termed as the extended linear hypotheses discussed by Mudholkar, Davidson, and Subbaiah (1974a). An extended multivariate linear hypothesis is of the form

H=

('~ {H(M):tr(MA)=O),
M E ~IL

646

P. R. Krishnaiah, G. S. Mudholkar and P. Subbaiah

where 𝔐 is the span of a finite set of p × t matrices. Mudholkar, Davidson and Subbaiah (1974a) discuss how such hypotheses arise in statistical practice and study some simultaneous test procedures in this context. Clearly this notational device may be used to define even more general hypotheses. Now we describe the class of MANOVA union-intersection procedures constructed by Mudholkar, Davidson and Subbaiah (1974b), which is in terms of the symmetric gauge functions of the square roots of the maximal invariants ch_i(GE⁻¹), i = 1, 2, …, p.

DEFINITION 5.1. A function φ: R^p → R⁺ is known as a symmetric gauge function (sgf) if φ is a homogeneous norm which is also symmetric in the extended sense, namely

φ(ε₁ a_{σ(1)}, …, ε_p a_{σ(p)}) = φ(a₁, …, a_p),

where σ is a permutation of (1, …, p) and ε_i = ±1. Some examples of sgf's are: (i) φ_r(a) = {Σ_{i=1}^p |a_i|^r}^{1/r}, r ≥ 1; (ii) φ(a) = Σ_{i=1}^k |a|_{(i)}, where |a|_{(i)} is the ith largest of |a_j|, j = 1, 2, …, p.

DEFINITION 5.2. The conjugate ψ of an sgf φ is defined by

ψ(a) = sup[Σ_{i=1}^p a_i b_i / φ(b)],

where the supremum is over either of the sets (i) b ≠ 0, (ii) Σ_{i=1}^p |b_i| = 1, or (iii) φ(b) = 1. It can be shown that ψ is also an sgf. The conjugate of φ_r(a) is φ_s(a), where 1/r + 1/s = 1. In particular, φ₂(a) is self-conjugate, and φ₁(a) = Σ_{i=1}^p |a_i| is the conjugate of φ_∞(a) = |a|_{(1)}.

DEFINITION 5.3. Given any matrix A (p × n), p ≤ n, and an sgf φ, define ‖A‖_φ = φ(λ₁^{1/2}, …, λ_p^{1/2}), where λ₁, …, λ_p are the eigenvalues of AA'. It is proved by von Neumann (1937) that ‖A‖_φ is a matrix norm and that all unitarily invariant matrix norms can be generated from sgf's in this fashion.
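As a numerical illustration of Definition 5.3, the following sketch (in Python with numpy, standing in for the FORTRAN of the Appendix; the function names are ours, not the chapter's) applies φ_r to the square roots of the eigenvalues of AA', which recovers the familiar unitarily invariant norms: r = 2 gives the Frobenius norm and r = ∞ the spectral norm.

    import numpy as np

    def sgf_r(a, r):
        # phi_r(a) = (sum |a_i|^r)^(1/r); r = np.inf gives max_i |a_i|
        a = np.abs(np.asarray(a, dtype=float))
        return a.max() if np.isinf(r) else (a ** r).sum() ** (1.0 / r)

    def ui_norm(A, r):
        # ||A||_phi of Definition 5.3: phi applied to the square roots
        # of the eigenvalues of AA' (i.e., to the singular values of A)
        lam = np.linalg.eigvalsh(A @ A.T)
        return sgf_r(np.sqrt(np.clip(lam, 0.0, None)), r)

    A = np.array([[3.0, 0.0, 1.0],
                  [0.0, 2.0, 0.0]])
    print(ui_norm(A, 2))        # Frobenius norm of A
    print(ui_norm(A, np.inf))   # spectral (largest singular value) norm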

LEMMA 5.4. For any real matrix A (p × n) and any sgf φ with conjugate ψ,

‖A‖_ψ = sup_{‖N‖_φ = 1} tr(AN'),

where the supremum is taken over all N (p × n) such that ‖N‖_φ = 1.

PROPOSITION 5.5. If the component hypothesis H₀(M): tr(MΔ) = 0 is accepted for small values of

|t_ψ(M)| = |tr(MΔ̂)| / ‖W^{1/2}M'E^{1/2}‖_ψ,

then the acceptance region of the union-intersection procedure is of the form

‖W^{-1/2}Δ̂E^{-1/2}‖_φ = φ(c₁^{1/2}, …, c_p^{1/2}) ≤ const.,   (5.5)

where c₁, c₂, …, c_p are the eigenvalues of Δ̂'W^{-1}Δ̂E^{-1} = GE^{-1}. The associated simultaneous confidence bounds on tr(MΔ), for all M (p × t), are

tr(MΔ) ∈ tr(MΔ̂) ± k ψ((MWM'E)^{1/2}),   (5.6)

k being the critical constant, and ψ((MWM'E)^{1/2}) = ψ(g₁^{1/2}, …, g_p^{1/2}), where the g_i's are the eigenvalues of MWM'E. Notice that when t(M) = t_{φ₁}(M), the union-intersection test statistic is φ_∞(c₁^{1/2}, …, c_p^{1/2}) = {c_L(GE^{-1})}^{1/2}, and the associated confidence bounds are

tr(MΔ) ∈ tr(MΔ̂) ± k φ₁((MWM'E)^{1/2}).   (5.7)

When M is of rank 1, it may be expressed as M = da', and (5.7) reduces to

a'Δd ∈ a'Δ̂d ± k {(a'Wa)(d'Ed)}^{1/2}.

These bounds are the same as those obtained originally by Roy. Notice the analogy between this situation and the univariate case of the simultaneous bounds associated with the studentized range discussed in Section 2.
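To make the class concrete, the sketch below (Python/numpy; our own naming, not the chapter's) evaluates φ(c₁^{1/2}, …, c_p^{1/2}) from the roots c_i of GE⁻¹ for different gauges; squaring the φ_∞ and φ₂ members recovers the largest-root criterion c_L(GE⁻¹) and Hotelling's trace tr(GE⁻¹), the two extremes mentioned above.

    import numpy as np

    def ui_criterion(G, E, r):
        # phi_r applied to the square roots of the eigenvalues c_i of G E^{-1}
        c = np.linalg.eigvals(G @ np.linalg.inv(E)).real
        s = np.sqrt(np.clip(c, 0.0, None))
        return s.max() if np.isinf(r) else (s ** r).sum() ** (1.0 / r)

    rng = np.random.default_rng(1)
    X = rng.standard_normal((40, 3)); Y = rng.standard_normal((10, 3))
    E, G = X.T @ X, Y.T @ Y   # stand-ins for the error and hypothesis SP matrices
    print(ui_criterion(G, E, np.inf) ** 2)  # largest root c_L(GE^{-1})
    print(ui_criterion(G, E, 2) ** 2)       # Hotelling's trace tr(GE^{-1})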

6. Computer programs for tests on multiple comparisons of mean vectors

Two FORTRAN programs were prepared to compute the confidence bounds for the one-way MANOVA model. The program roots provides the bounds associated with the largest root, trace and T²_max tests, and the program fit provides the bounds associated with the step-down and finite intersection tests. Both programs were run on the Honeywell Multics System at Oakland University. Roots uses the IMSL routines vmulff, vmulfp, mdfi and eigrf; fit uses the IMSL routines vmulff, vmulfp, vcvtfs, ludecp, vcvtsf, linv2f, vmulfm and mdfi. The programs and a sample output are given in the Appendix.
Roots program

The input for this program consists of (i) Θ̂, the matrix of means, which is of order k × p, k being the number of groups and p the number of variables, (ii) the error SP matrix E of order p × p, (iii) n₁, n₂, …, n_k, the sample sizes, and (iv) the matrix B of order t × k specifying the t contrasts of interest among the group means. Suppose M is any matrix of order p × t with columns m₁, m₂, …, m_t, and Δ = BΘ. Then the confidence bounds on tr(MΔ) associated with the T²_max, largest root, and trace tests are given by

tr(MΔ) ∈ tr(MΔ̂) ± Σ_{i=1}^t {c_α m_i'E m_i w_ii/(n − k)}^{1/2},

tr(MΔ) ∈ tr(MΔ̂) ± k₁^{1/2} Σ_{i=1}^t ch_i^{1/2}(MWM'E),   (6.1)

tr(MΔ) ∈ tr(MΔ̂) ± {k₂ tr(MWM'E)}^{1/2},

where w_ii is the ith diagonal element of W = B(A'A)⁻¹B', A'A = diag(n₁, n₂, …, n_k), and c_α, k₁, k₂ are the critical constants corresponding to the T²_max, largest root and trace tests respectively; ch_i(MWM'E), i = 1, 2, …, t, are the characteristic roots of MWM'E. If we are primarily interested in the t contrasts for each variable, then pt matrices M can be constructed by taking all elements of M to be zero except the (i, j)th element, which is set to unity, for specified i and j. Here i can take the values 1, 2, …, p and j the values 1, 2, …, t. For any other contrast of interest, M can be selected accordingly. The values of c_α, k₁ and k₂ are also part of the input for the program. The bounds on the pt contrasts are computed automatically in the program. For any other extended linear functions of interest, the corresponding M matrices are to be specified in the input.
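As a check on (6.1), the following sketch (Python/numpy rather than the Appendix FORTRAN; the variable names are ours) reproduces the half-widths of Table 2 for the contrast μ₁ − μ₂ on the first variable, using the critical constants printed in the Appendix output:

    import numpy as np

    E = np.array([[30.146, -0.2537, 8.4570],
                  [-0.2537, 30.617, 3.8379],
                  [8.4570, 3.8379, 30.735]])
    sizes = np.array([62.0, 54.0, 33.0, 15.0]); n, k = sizes.sum(), 4
    B = np.array([[1, -1, 0, 0], [1, 0, -1, 0], [1, 0, 0, -1],
                  [0, 1, -1, 0], [0, 1, 0, -1], [0, 0, 1, -1]], dtype=float)
    W = B @ np.diag(1.0 / sizes) @ B.T           # W = B (A'A)^{-1} B'
    M = np.zeros((3, 6)); M[0, 0] = 1.0          # picks out the (1,1) element of Delta
    c_alpha, k1, k2 = 12.300033, 0.086956, 0.110280
    t = M.shape[1]
    h_tmax = sum(np.sqrt(c_alpha * (M[:, i] @ E @ M[:, i]) * W[i, i] / (n - k))
                 for i in range(t))
    ch = np.linalg.eigvals(M @ W @ M.T @ E).real
    h_root = np.sqrt(k1) * np.sqrt(np.clip(ch, 0.0, None)).sum()
    h_trace = np.sqrt(k2 * np.trace(M @ W @ M.T @ E))
    print(h_tmax, h_root, h_trace)   # approx. 0.2834, 0.3014, 0.3394, as in Table 2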


Fit program

This program uses the same input as the roots program, except for the critical constants involved. The computational procedure followed in this program for evaluating the test statistics and the confidence bounds associated with the finite intersection tests and the step-down procedure is the same as that described in Mudholkar and Subbaiah (1979). After reading the values of Θ̂, E, the sample sizes n₁, n₂, …, n_k and the contrast matrix B, the computations are conducted as follows:

Step 1. Construct BΘ̂, the t × p matrix consisting of the estimates of b_jθ_i for j = 1, 2, …, t and i = 1, 2, …, p.

Step 2. Compute the lower triangular matrix L such that E = LL' (with the IMSL routines vcvtfs, ludecp, vcvtsf), and W = B(A'A)⁻¹B'.

Step 3. Obtain Z = L⁻¹(BΘ̂)', the matrix of order p × t.

Step 4. The finite intersection test statistics F_ij of (4.13) are given by

F_ij = z_ij²(n − k − i + 1) / (w_jj + Σ_{m=1}^{i−1} z_mj²)   (6.2)

for i = 1, 2, …, p; j = 1, 2, …, t.

Step 5. Suppose G is the hypothesis SP matrix as in (3.3), and T = G + E. Let V be the lower triangular matrix such that T = VV'. Then the step-down statistics F_i of (4.6) can be computed as

F_i = (v_ii²/l_ii² − 1)(n − k − i + 1)/t,   (6.3)

l_ii and v_ii being the ith diagonal elements of L and V respectively, for i = 1, 2, …, p. The program computes the confidence bounds on the contrasts of the original means b_jθ_i as well as on the contrasts of the conditional parameters b_jξ_i of the model (4.2).

Step 6. The confidence bounds on b_jθ_i are computed using the following formulas:

(a) Suppose c_i = t f_i/(n − k − i + 1), f_i being the constant such that

P[F_ij ≤ f_i; j = 1, 2, …, t | H₀] = (1 − α_i),

and 1 − α = Π_{i=1}^p (1 − α_i). Let c₁* = c₁, and

c_i* = c_i (1 + Σ_{j=1}^{i−1} c_j*),   i = 2, …, p.


Then the simultaneous confidence intervals associated with the finite intersection tests are

b_jθ_i ∈ b_jθ̂_i ± w_jj^{1/2} Σ_{m=1}^{i} |l_im| (c_m*)^{1/2}   (6.4)

for i = 1, 2, …, p and j = 1, 2, …, t.

(b) The confidence bounds associated with the step-down tests are given by equation (6.4), replacing c_i by c_i' = t f_i'/(n − k − i + 1), where f_i' is the upper 100α_i percentage point of the F distribution with d.f. (t, n − k − i + 1).

(c) The bounds associated with T²_max given in (6.1) may be expressed as

b_jθ_i ∈ b_jθ̂_i ± {e_ii w_jj c_α/(n − k)}^{1/2}.

Step 7. In order to estimate b_jξ_i, we compute

b_jξ̂_i = z_ij l_ii   for i = 1, …, p, j = 1, …, t.

Step 8. (a) The simultaneous confidence bounds associated with the finite intersection tests are given by

b_jξ_i ∈ b_jξ̂_i ± l_ii {c_i (w_jj + Σ_{m=1}^{i−1} z_mj²)}^{1/2},   (6.5)

where c_i is as defined in step 6(a).

(b) The bounds associated with the step-down tests are given by (6.5), using c_i' as defined in 6(b).
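The following sketch (Python/numpy; the names are ours) mirrors Steps 1-5 for contrast matrices B of full row rank, forming Z = L⁻¹(BΘ̂)' from the Cholesky factor of E and evaluating the statistics (6.2) and (6.3); for the data of Section 7, with the three independent contrasts noted there, it reproduces F₁₁ = 25.2754 of the Appendix output.

    import numpy as np

    def fit_and_stepdown(Theta_hat, E, B, sizes):
        # Theta_hat: k x p group means; E: p x p error SP matrix;
        # B: t x k matrix of linearly independent contrasts
        n, k = sizes.sum(), len(sizes)
        p, t = E.shape[0], B.shape[0]
        BT = B @ Theta_hat                    # t x p estimates of b_j theta_i
        W = B @ np.diag(1.0 / sizes) @ B.T
        L = np.linalg.cholesky(E)             # E = L L'
        Z = np.linalg.solve(L, BT.T)          # Z = L^{-1} (B Theta_hat)', p x t
        F = np.empty((p, t))                  # finite intersection statistics (6.2)
        for i in range(p):                    # i is 0-based, so n-k-i+1 -> n-k-i
            for j in range(t):
                F[i, j] = Z[i, j]**2 * (n - k - i) / (W[j, j] + (Z[:i, j]**2).sum())
        G = BT.T @ np.linalg.solve(W, BT)     # hypothesis SP matrix
        V = np.linalg.cholesky(G + E)         # T = G + E = V V'
        Fstep = (np.diag(V)**2 / np.diag(L)**2 - 1.0) * (n - k - np.arange(p)) / t
        return F, Fstep                       # step-down statistics (6.3)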

7. Illustration

The different methods for the comparison of k mean vectors described in this chapter are now illustrated in terms of the data from Kaskey, Krishnaiah, and Azzari (1962), where four diagnostic groups are compared with respect to the prevalence of 20 psychiatric symptoms. The data consist of multivariate measurements for four groups of individuals with (1) psychoneurotic disorders, (2) childhood schizophrenia, (3) personality trait disturbances, and (4) psychophysiological disorders. The sample sizes for these four groups are n₁ = 62, n₂ = 54, n₃ = 33, n₄ = 15 respectively, and n = Σ_{i=1}^4 n_i = 164. Each component of the measurement vector is a score taking the value 1 in case of the presence of the symptom, and 0 otherwise. Kaskey et al. (1962) argue for and justify the following MANOVA model in this context. Let X_ij denote the jth observation in the ith group, j = 1, 2, …, n_i, and i = 1, 2, 3, 4. It is assumed that the X_ij's are independent and have multivariate normal distributions with the same dispersion matrix Σ and mean vectors E(X_ij) = μ_i, j = 1, 2, …, n_i and i = 1, 2, 3, 4. In the notation of Section 4, Θ' = (μ₁, μ₂, μ₃, μ₄) and A'A = diag(n₁, n₂, n₃, n₄). The usual null hypothesis for such data is H₀: μ₁ = μ₂ = μ₃ = μ₄. For the purpose of illustrating the multiple comparisons we select three variables, viz. X₁: unsatisfactory school work in general, X₂: temper tantrums, and X₃: dawdling, procrastination, which are classified under the categories 'school maladjustment', 'negative attitudes and behavior', and 'attitudes toward the self', respectively. The means of these variables and the error SP matrix are given in Table 1. The simultaneous confidence bounds associated with the trace, largest root, T²_max, step-down and finite intersection tests, having at least 95 percent confidence coefficient, are presented in Table 2. Here we note that the matrix B specified in H of (4.3) can consist of any three independent contrasts; for example, B may contain the first three contrasts of Table 2. However, this restriction is not necessary for the computation of the bounds associated with the T²_max and finite intersection tests. Tables 2 and 3 give bounds on all pairwise contrasts of the group means for each variable. The confidence intervals of Table 3, associated with the step-down and finite intersection tests, are on the conditional parameters of the model (4.2). Tables 1, 2, and 3 were prepared by summarizing the computer output when all six pairwise contrasts are of interest. Similar tables can be prepared from the computer output if the interest is in a subset of these contrasts. The computer program given in the Appendix for the finite intersection tests computes the critical values by using a crude approximation to save computer time. Better approximations to the critical values may be computed using the program of Cox, Fang and Boudreau (1979).

Table 1
Diagnostic group means and 'error' SP matrix

                     Group means                     'Error' SP matrix
Variable     1        2        3        4         X1        X2        X3
X1        0.7581   0.3519   0.9091   0.5333    30.1460   -0.2537    8.4570
X2        0.6290   0.9074   0.6061   0.5333    -0.2537   30.6170    3.8379
X3        0.8065   0.5185   0.8485   0.6667     8.4570    3.8379   30.7350

Table 2
Confidence bounds for contrasts of means

                                          Half-width of confidence interval
Variable  Group contrast   Estimate   T²_max    Largest   Trace    Step-down   FIT
                                                 root
X1    1 -1  0  0    0.4062    0.2834*   0.3014*   0.3394*   0.2618*   0.2450*
X1    1  0 -1  0   -0.1510    0.3280    0.3489    0.3929    0.3030    0.2837
X1    1  0  0 -1    0.2248    0.4380    0.4659    0.5246    0.4047    0.3788
X1    0  1 -1  0   -0.5572    0.3364*   0.3577*   0.4029*   0.3107*   0.2909*
X1    0  1  0 -1   -0.1814    0.4443    0.4725    0.5322    0.4105    0.3842
X1    0  0  1 -1    0.3758    0.4741    0.5042    0.5678    0.4379    0.4099
X2    1 -1  0  0   -0.2784    0.2856    0.3037    0.3420    0.2754*   0.2568*
X2    1  0 -1  0    0.0229    0.3306    0.3516    0.3959    0.3188    0.2973
X2    1  0  0 -1    0.0957    0.4414    0.4695    0.5287    0.4257    0.3970
X2    0  1 -1  0    0.3013    0.3390    0.3605    0.4060    0.3269    0.3049
X2    0  1  0 -1    0.3741    0.4478    0.4762    0.5363    0.4318    0.4027
X2    0  0  1 -1    0.0728    0.4777    0.5081    0.5722    0.4607    0.4297
X3    1 -1  0  0    0.2880    0.2861*   0.3043    0.3427    0.3783    0.3523
X3    1  0 -1  0   -0.0420    0.3312    0.3523    0.3967    0.4379    0.4076
X3    1  0  0 -1    0.1398    0.4423    0.4704    0.5297    0.5848    0.5443
X3    0  1 -1  0   -0.3300    0.3396    0.3612    0.4068    0.4490    0.4180
X3    0  1  0 -1   -0.1482    0.4486    0.4771    0.5373    0.5932    0.5521
X3    0  0  1 -1    0.1818    0.4787    0.5091    0.5733    0.6329    0.5890

Table 3
Confidence bounds for contrasts of conditional means

                                          Half-width of confidence interval
Variable  Group contrast   Estimate    Step-down    FIT
X1    1 -1  0  0    0.4062     0.2618*    0.2450*
X1    1  0 -1  0   -0.1510     0.3030     0.2837
X1    1  0  0 -1    0.2248     0.4047     0.3788
X1    0  1 -1  0   -0.5572     0.3107*    0.2909*
X1    0  1  0 -1   -0.1814     0.4105     0.3842
X1    0  0  1 -1    0.3758     0.4379     0.4099
X2    1 -1  0  0   -0.2750     0.2848     0.2666*
X2    1  0 -1  0    0.0216     0.3089     0.2891
X2    1  0  0 -1    0.0976     0.4132     0.3869
X2    0  1 -1  0    0.2966     0.3457     0.3236
X2    0  1  0 -1    0.3726     0.4176     0.3910
X2    0  0  1 -1    0.0760     0.4533     0.4244
X3    1 -1  0  0    0.2092     0.2808     0.2629
X3    1  0 -1  0   -0.0024     0.2957     0.2768
X3    1  0  0 -1    0.0643     0.3962     0.3710
X3    0  1 -1  0   -0.2116     0.3388     0.3172
X3    0  1  0 -1   -0.1449     0.4101     0.3839
X3    0  0  1 -1    0.0667     0.4343     0.4066


8. Simultaneous tests for the equality of the variances

In Section 2, we discussed various tests for multiple comparisons of the means of normal populations under the assumption that the variances of the populations are equal. So, it is of interest to test whether the assumption of the homogeneity of the variances is valid. In this section, we discuss some simultaneous tests for the equality of the variances against different alternatives. For i = 1, 2, …, k, let x_{i1}, …, x_{in_i} be distributed independently as normal with mean μ_i and variance σ_i². Also, let

s_i² = Σ_{j=1}^{n_i} (x_{ij} − x̄_i)²,   F_ij = {s_i²/(n_i − 1)} / {s_j²/(n_j − 1)},

where

n_i x̄_i = Σ_{j=1}^{n_i} x_{ij}.

In addition, let

H: σ₁² = ⋯ = σ_k²,   H_ij: σ_i² = σ_j²,   A_ij: σ_i² ≠ σ_j²,

A_ij*: σ_i² > σ_j²,   A_ij**: σ_i² < σ_j²,

A₁ = ⋃_{i≠j} A_ij,   A₂ = ⋃_{i=1}^{k−1} A_{i,i+1},   A₃ = ⋃_{i=1}^{k−1} A_{ik},

A₂* = ⋃_{i=1}^{k−1} A*_{i,i+1},   A₃* = ⋃_{i=1}^{k−1} A*_{ik},

A₂** = ⋃_{i=1}^{k−1} A**_{i,i+1},   A₃** = ⋃_{i=1}^{k−1} A**_{ik}.

When the sample sizes are all equal to m, Hartley (1950) proposed the following procedure for testing H against A₁. The hypothesis H is accepted if

1/c < F_ij < c   for i ≠ j = 1, 2, …, k   (8.1)

and rejected otherwise, where

P[F_max ≤ c | H] = (1 − α),   (8.2)

and F_max = max_{i≠j}(F_ij). The above test is known as Hartley's F_max test.

Ramachandran (1956) showed that the above test is unbiased. Percentage points of the distribution of F_max are given in David (1952). The 100(1 − α)% simultaneous confidence intervals associated with the F_max test are given by

F_ij/c ≤ σ_i²/σ_j² ≤ c F_ij,   i ≠ j = 1, …, k.   (8.3)
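A sketch of Hartley's procedure and the intervals (8.3) for equal sample sizes (Python; the critical constant c, from David's (1952) tables, is taken as an input rather than computed):

    import numpy as np

    def hartley_fmax(samples, c):
        # samples: k arrays of common length m; c: upper alpha point of F_max under H
        s2 = np.array([np.var(x, ddof=1) for x in samples])
        accept = s2.max() / s2.min() <= c        # equivalent to (8.1)
        k = len(samples)
        bounds = {(i, j): (s2[i] / (c * s2[j]), c * s2[i] / s2[j])
                  for i in range(k) for j in range(k) if i != j}
        return accept, bounds   # simultaneous bounds (8.3) on sigma_i^2 / sigma_j^2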

Next, consider the problem of testing H against A₂. In this case, we accept H if

a < F_{i,i+1} < b,   i = 1, 2, …, k − 1,   (8.4)

and reject it otherwise, where

P[a < F_{i,i+1} < b; i = 1, …, k − 1 | H] = (1 − α).   (8.5)

The 100(1 − α)% simultaneous confidence intervals associated with the above procedure are given by

a(ν_i s²_{i+1}/ν_{i+1} s_i²) ≤ (σ²_{i+1}/σ_i²) ≤ b(ν_i s²_{i+1}/ν_{i+1} s_i²),   (8.6)

i = 1, …, k − 1, where ν_i = n_i − 1. If we test H against A₂*, we accept H if

F_{i,i+1} ≤ b,   i = 1, 2, …, k − 1,   (8.7)

and reject it otherwise, where

P[F_{i,i+1} ≤ b; i = 1, 2, …, k − 1 | H] = (1 − α).   (8.8)

The 100(1 − α)% simultaneous confidence intervals associated with the above procedure are

(σ²_{i+1}/σ_i²) ≤ b(ν_i s²_{i+1}/ν_{i+1} s_i²),   i = 1, 2, …, k − 1.   (8.9)

Similarly, the hypothesis H when tested against A₂** is accepted if

F_{i,i+1} ≥ a,   i = 1, 2, …, k − 1,   (8.10)

and rejected otherwise, where

P[F_{i,i+1} ≥ a; i = 1, 2, …, k − 1 | H] = (1 − α).   (8.11)

The 100(1 − α)% simultaneous confidence intervals associated with the above procedure are

(σ²_{i+1}/σ_i²) ≥ a(ν_i s²_{i+1}/ν_{i+1} s_i²),   i = 1, 2, …, k − 1.   (8.12)

Gnanadesikan (1959) considered the problem of testing H against A₃. The above procedures for testing H against A₂, A₂* and A₂** were considered by Krishnaiah (1965b).
9. Simultaneous tests specifying the covariance matrices

For i = 1, 2, …, k, let x_{i1}, …, x_{in_i} be distributed independently as p-variate normal with mean vector μ_i and covariance matrix Σ_i. Also let

S_i = Σ_{j=1}^{n_i} (x_{ij} − x̄_i)(x_{ij} − x̄_i)',   n_i x̄_i = Σ_{j=1}^{n_i} x_{ij}.

In this section, we discuss a procedure for testing the hypothesis H: Σ₁ = ⋯ = Σ_k = Σ₀, where Σ₀ is a specified matrix. First, we discuss Roy's test of the hypothesis H_i: Σ_i = Σ₀ for a given value of i. The hypothesis H_i can be expressed as

H_i = ⋂_{a≠0} H_ia,   H_ia: a'Σ_i a = a'Σ₀ a.

Roy's test (see Roy (1957)) is based upon testing H_ia simultaneously for all non-null a and accepting H_i if and only if H_ia is accepted for all non-null a. According to this procedure, the hypothesis H_i when tested against ⋃_{a≠0}[a'Σ_i a ≠ a'Σ₀ a] is accepted if

a < c_s(S_iΣ₀⁻¹) ≤ c_L(S_iΣ₀⁻¹) < b,   (9.1)

and rejected otherwise, where a and b are chosen such that

P[a < c_s(S_iΣ₀⁻¹) ≤ c_L(S_iΣ₀⁻¹) < b | H_i] = (1 − α),   (9.2)

and c_s(A) and c_L(A) respectively denote the smallest root and the largest root of A. Since H = ⋂_{i=1}^k H_i, the following procedure may be used for testing H against the alternative ⋃_{i=1}^k ⋃_{a≠0}[a'Σ_i a ≠ a'Σ₀ a]. We accept H if

a_i ≤ c_s(S_iΣ₀⁻¹) ≤ c_L(S_iΣ₀⁻¹) ≤ b_i,   i = 1, …, k,   (9.3)

and reject it otherwise, where

Π_{i=1}^k P[a_i ≤ c_s(S_iΣ₀⁻¹) ≤ c_L(S_iΣ₀⁻¹) ≤ b_i | H] = (1 − α).   (9.4)

When H_i is true, S_iΣ₀⁻¹ is distributed as the central Wishart matrix with (n_i − 1) degrees of freedom and E(S_iΣ₀⁻¹/(n_i − 1)) = I_p. The evaluation of the probability integral associated with the extreme roots, and the joint distribution of the extreme roots of the central Wishart matrix, are discussed by Krishnaiah in another chapter of this volume. For simplicity, we may choose the constants a_i and b_i in (9.4) such that c = b_i = 1/a_i. If we wish to test H against one-sided alternatives of the form ⋃_{i=1}^k ⋃_{a≠0}[a'Σ_i a > a'Σ₀ a], we accept H if

c_L(S_iΣ₀⁻¹) ≤ b,   i = 1, 2, …, k,   (9.5)

and reject it otherwise, where

Π_{i=1}^k P[c_L(S_iΣ₀⁻¹) < b | H] = (1 − α).   (9.6)

Similarly, the hypothesis H when tested against

⋃_{i=1}^k ⋃_{a≠0} [a'Σ_i a < a'Σ₀ a]

is accepted if

c_s(S_iΣ₀⁻¹) ≥ a,   i = 1, …, k,   (9.7)

and rejected otherwise. In the univariate case, c_L(S_iΣ₀⁻¹) = c_s(S_iΣ₀⁻¹) = s_i²/σ₀², where s_i²/σ₀² is distributed as chi-square with (n_i − 1) degrees of freedom.
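A sketch of the rule (9.3) under the symmetric choice b_i = 1/a_i = c (Python/numpy; the constant c is assumed supplied from tables of the extreme roots of the Wishart matrix):

    import numpy as np

    def roy_sigma0_test(S_list, Sigma0, c):
        # accept H: Sigma_1 = ... = Sigma_k = Sigma0 iff, for every i,
        # 1/c <= c_s(S_i Sigma0^{-1}) <= c_L(S_i Sigma0^{-1}) <= c
        Sinv = np.linalg.inv(Sigma0)
        for S in S_list:
            roots = np.linalg.eigvals(S @ Sinv).real
            if roots.min() < 1.0 / c or roots.max() > c:
                return False   # reject H
        return True            # accept H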

10. Simultaneous tests for the equality of the covariance matrices

The hypothesis H: Σ₁ = Σ₂ can be expressed as H = ⋂_{a≠0} H_a, where H_a: a'Σ₁a = a'Σ₂a. The procedure proposed by Roy for testing H is based upon testing H_a simultaneously for all nonnull 1 × p vectors a' and accepting H if and only if all H_a are accepted. According to this procedure, the hypothesis H when tested against ⋃_{a≠0}[a'Σ₁a ≠ a'Σ₂a] is accepted if

a < c_s(S₁S₂⁻¹) ≤ c_L(S₁S₂⁻¹) < b,   (10.1)

and rejected otherwise, where

P[a < c_s(S₁S₂⁻¹) ≤ c_L(S₁S₂⁻¹) < b | H] = (1 − α).   (10.2)

If H is tested against ⋃_{a≠0}[a'Σ₁a > a'Σ₂a], we accept or reject H according as

c_L(S₁S₂⁻¹) ≤ b   or   c_L(S₁S₂⁻¹) > b,   (10.3)

where

P[c_L(S₁S₂⁻¹) < b | H] = (1 − α).   (10.4)

Similarly, the hypothesis H when tested against ⋃_{a≠0}[a'Σ₁a < a'Σ₂a] is accepted or rejected according as

c_s(S₁S₂⁻¹) ≥ a   or   c_s(S₁S₂⁻¹) < a,   (10.5)

where

P[c_s(S₁S₂⁻¹) ≥ a | H] = (1 − α).   (10.6)

It is known that S₁ and S₂ are distributed independently as Wishart matrices with (n₁ − 1) and (n₂ − 1) degrees of freedom respectively, and E(S₁/(n₁ − 1)) = Σ₁ and E(S₂/(n₂ − 1)) = Σ₂. The evaluation of the probability integrals which arise in (10.2), (10.4) and (10.6) is discussed in another chapter of this volume. We now discuss generalizations (see Krishnaiah (1979)) of the above procedures to the several sample case. Now, let

H: Σ₁ = ⋯ = Σ_k,   A₁ = ⋃_{i=1}^{k−1} [Σ_i > Σ_{i+1}],   A₂ = ⋃_{i=1}^{k−1} [Σ_i ≠ Σ_{i+1}].

The hypothesis H when tested against A₁ is accepted if

c_L(S_i S_{i+1}⁻¹) ≤ b,   i = 1, …, k − 1,   (10.7)

and rejected otherwise, where

P[c_L(S_i S_{i+1}⁻¹) ≤ b; i = 1, …, k − 1 | H] = (1 − α).   (10.8)

Similarly, the hypothesis H when tested against A₂ is accepted if

a ≤ c_s(S_i S_{i+1}⁻¹) ≤ c_L(S_i S_{i+1}⁻¹) ≤ b,   i = 1, 2, …, k − 1,   (10.9)

and rejected otherwise, where

P[a ≤ c_s(S_i S_{i+1}⁻¹) ≤ c_L(S_i S_{i+1}⁻¹) ≤ b; i = 1, 2, …, k − 1 | H] = (1 − α).   (10.10)

The exact evaluation of the probability integrals in (10.8) and (10.10) is complicated, but we can use Bonferroni's inequality to obtain bounds on these probability integrals. For example, using Bonferroni's inequality, we know that

P[c_L(S_i S_{i+1}⁻¹) ≤ b; i = 1, …, k − 1 | H] ≥ 1 − Σ_{i=1}^{k−1} P[c_L(S_i S_{i+1}⁻¹) ≥ b].   (10.11)
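The bound (10.11) is what makes the procedure usable in practice: if b is chosen as the upper α/(k − 1) point of c_L(S_iS⁻¹_{i+1}) for each i, the overall level is at most α. A sketch of the resulting test of H against A₁ (Python/numpy; b is an assumed input):

    import numpy as np

    def adjacent_largest_root_test(S_list, b):
        # accept H: Sigma_1 = ... = Sigma_k against A_1 iff
        # c_L(S_i S_{i+1}^{-1}) <= b for i = 1, ..., k-1, cf. (10.7)
        for S_i, S_next in zip(S_list[:-1], S_list[1:]):
            c_L = np.linalg.eigvals(S_i @ np.linalg.inv(S_next)).real.max()
            if c_L > b:
                return False
        return True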
If we test H against ⋃_{i<j}[Σ_i ≠ Σ_j], we accept H if

a ≤ c_s(S_i S_j⁻¹) ≤ c_L(S_i S_j⁻¹) ≤ b,   i < j = 1, 2, …, k,   (10.12)

and reject it otherwise, where a and b are chosen such that the probability of (10.12) holding when H is true is equal to (1 − α). Gnanadesikan (1959) proposed a procedure for testing H against ⋃_{i=1}^{k−1}[Σ_i ≠ Σ_k]. For a discussion of the simultaneous tests for the equality of variances against different alternatives, the reader is referred to Krishnaiah (1965b). We now discuss the tests proposed by Krishnaiah (1968, 1978a) for H against different alternatives by using conditional distributions. Let Σ_ij and S_ij respectively denote the top j × j left-hand corners of the matrices Σ_i and S_i. Also, let

β_ij = Σ_ij⁻¹ (σ_{i1,j+1}, …, σ_{ij,j+1})',   (10.13)

b_ij = S_ij⁻¹ (s_{i1,j+1}, …, s_{ij,j+1})',   (10.14)

σ²_{i,j+1} = |Σ_{i,j+1}| / |Σ_ij|,   s²_{i,j+1} = |S_{i,j+1}| / |S_ij|,

for j = 1, 2, …, (p − 1), and σ²_{i1} = σ_{i11}, s²_{i1} = s_{i11}.


When H_{j1} is true, the common unknown value of σ²_{ij} is denoted by σ²_{0j}. In addition, the following notation is needed:

H_{j1}: σ²_{1j} = ⋯ = σ²_{kj},   H_{j2}: β_{1j} = ⋯ = β_{kj},

A_{1j1} = ⋃_{i=1}^{k−1} [σ²_{ij} ≠ σ²_{i+1,j}],   A_{1j2} = ⋃_{i=1}^{k−1} [β_{ij} ≠ β_{i+1,j}],

A_{2j1} = ⋃_{i=1}^{k−1} [σ²_{ij} ≠ σ²_{kj}],   A_{2j2} = ⋃_{i=1}^{k−1} [β_{ij} ≠ β_{kj}],

A_{3j1} = ⋃_{i≠i'=1}^{k} [σ²_{ij} ≠ σ²_{i'j}],   A_{3j2} = ⋃_{i≠i'=1}^{k} [β_{ij} ≠ β_{i'j}].

The total hypothesis H can be expressed as

H = ⋂_{j=1}^{p} H_{j1} ∩ ⋂_{j=1}^{p−1} H_{j2},

and so H can be tested by testing H₁₁, …, H_{p1}, H₁₂, …, H_{p−1,2} simultaneously. Also, ⋂_{j=1}^{r} H_{j1} ∩ ⋂_{j=1}^{r−1} H_{j2} is equivalent to the hypothesis that Σ_{1r} = ⋯ = Σ_{kr}. So, when H is rejected, any subhypothesis Σ_{1r} = ⋯ = Σ_{kr} can be tested by testing H₁₁, …, H_{r1}, H₁₂, …, H_{r−1,2} simultaneously. Motivated by the above considerations, Krishnaiah (1968, 1978a) proposed procedures for testing H against A₁, A₂ and A₃, where

A₁ = ⋃_{j=1}^{p} A_{1j1} ∪ ⋃_{j=1}^{p−1} A_{1j2},

A₂ = ⋃_{j=1}^{p} A_{2j1} ∪ ⋃_{j=1}^{p−1} A_{2j2},

A₃ = ⋃_{j=1}^{p} A_{3j1} ∪ ⋃_{j=1}^{p−1} A_{3j2}.

Let

F_{i,i',j} = s²_{ij}(n_{i'} − j) / {s²_{i'j}(n_i − j)},   (10.15)

F*_{i,i',j} = D_{ii'j}(n − k − kj) / s²_{·,j+1},   (10.16)

where n = n₁ + ⋯ + n_k,

s²_{·,j+1} = Σ_{i=1}^{k} s²_{i,j+1},

and

D_{ii'j} = (b_{ij} − b_{i'j})' (S_{ij}⁻¹ + S_{i'j}⁻¹)⁻¹ (b_{ij} − b_{i'j}).
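A sketch of the conditional quantities entering (10.13)-(10.16) for a single sample (Python/numpy; the indexing follows the text, and the helper name is ours):

    import numpy as np

    def conditional_pieces(S, j):
        # S: p x p SP matrix; j: number of leading variates (1 <= j <= p-1)
        S_j = S[:j, :j]                    # top j x j corner S_{ij}
        s_vec = S[:j, j]                   # (s_{i1,j+1}, ..., s_{ij,j+1})'
        b = np.linalg.solve(S_j, s_vec)    # b_{ij} of (10.14)
        # s^2_{i,j+1} = |S_{i,j+1}| / |S_{ij}|, the conditional sum of squares
        s2 = np.linalg.det(S[:j + 1, :j + 1]) / np.linalg.det(S_j)
        return b, s2

The statistics (10.15) and (10.16) are then assembled from these pieces across the k samples.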


We first discuss the problem of testing H_{j1} against the alternative A_{1j1} when the first (j − 1) variates are held fixed; here it is understood that no variate is held fixed when H₁₁ is tested. In this case, we accept H_{j1} if

a_j < F_{i,i+1,j} < b_j,   i = 1, …, k − 1,   (10.17)

where

P[a_j < F_{i,i+1,j} < b_j; i = 1, …, k − 1 | H_{j1}] = (1 − α₁).   (10.18)

If we test H_{j2} against A_{1j2} when the first j variates are held fixed and H_{j+1,1} is true, we accept H_{j2} if

F*_{i,i+1,j} ≤ c_j,   i = 1, 2, …, k − 1,   (10.19)

and reject it otherwise, where the constants c_j are chosen such that the probability of (10.19) holding for i = 1, …, k − 1 when H_{j2} is true is equal to (1 − α₂). Combining (10.17) and (10.19), the following procedure was proposed. The hypothesis H is accepted if

a_j < F_{i,i+1,j} ≤ b_j;   i = 1, …, k − 1,   j = 1, …, p,

F*_{i,i+1,j} ≤ c_j;   i = 1, …, k − 1,   j = 1, …, p − 1,   (10.20)

and rejected otherwise, where the probability of (10.20) holding when H is true is equal to (1 − α). But this probability is equal to Π_{j=1}^{p} q_j Π_{j=1}^{p−1} q_j*, where

q_j = P[a_j < F_{i,i+1,j} < b_j; i = 1, …, k − 1 | H_{j1}],   (10.21)

q_j* = P[F*_{i,i+1,j} ≤ c_j; i = 1, …, k − 1 | H_{j2}].   (10.22)

When H_{j1} is true, s²_{1j}/σ²_{0j}, …, s²_{kj}/σ²_{0j} are distributed independently as chi-square variables with (n₁ − j), …, (n_k − j) degrees of freedom respectively. So, the probability integral in (10.21) is of the same form as the probability integral in (8.5). Also, when H_{j+1,1} is true, s²_{·,j+1}/σ²_{0,j+1} is distributed as chi-square with (n − k − kj) degrees of freedom. In addition, when H_{j+1,1} ∩ H_{j2} is true, the joint distribution of D_{12j}/σ²_{0,j+1}, …, D_{k−1,k,j}/σ²_{0,j+1} is the same as the central joint distribution of correlated quadratic forms considered by Krishnaiah (1977). The method described above was discussed in Krishnaiah (1968, 1978a). Using similar methods, Krishnaiah (1968, 1978a) also discussed procedures for testing H against A₂ and A₃.


Appendix A. Computer programs for the largest root, trace and T²_max tests

[FORTRAN listing of the roots program; the scanned listing is not legible enough to reproduce. The program reads p, k, t, the sample sizes n_i, the matrix of means Θ̂, the error SP matrix E, the contrast matrix B and the critical constants c_α, k₁, k₂; it forms W = B(A'A)⁻¹B' and, for each input matrix M, computes tr(MΔ̂), the eigenvalues of MWM'E and the half-widths of (6.1), using the IMSL routines vmulff, vmulfp, mdfi and eigrf.]

Sample output of roots (condensed): no. of variables 3, no. of groups 4, no. of contrasts 6, sample sizes 62, 54, 33, 15; the estimate of Θ and the error SSP matrix of Table 1; the contrast matrix B of all six pairwise contrasts; the W matrix; the critical constants 12.300033 (T²_max), 0.086956 (largest root) and 0.110280 (trace); the estimate of Δ (p × t); and, for each of the 18 unit matrices M, the estimate tr(MΔ̂) together with the T²_max, largest root and trace half-widths reported in Table 2.


Appendix B. Computer programs for finite intersection and step-down procedures


[FORTRAN listing of the fit program; the scanned listing is not legible enough to reproduce. Following Steps 1-8 of Section 6, the program reads p, k, t, the number of independent contrasts, the sample sizes, Θ̂, E and B; computes W = B(A'A)⁻¹B', the Cholesky factors of E and of T = G + E (via the IMSL routines vcvtfs, ludecp, vcvtsf, linv2f, vmulfm), Z = L⁻¹(BΘ̂)', the finite intersection statistics (6.2) and the step-down statistics (6.3); obtains the critical constants from mdfi or reads them as input; and prints the confidence bounds (6.4) and (6.5) on the original and the conditional parameters.]

Sample output of fit (condensed): no. of variables 3, no. of groups 4, no. of contrasts 6 (3 independent), sample sizes 62, 54, 33, 15; the estimate of Θ, the error SSP matrix and the contrast and W matrices as above; the finite intersection statistics F_ij (e.g. F₁₁ = 25.2754) and the step-down statistics F_i; the critical constants f(i) for the finite intersection tests (9.1983, 9.2012, 9.2041) and for the step-down tests (3.4991, 3.4996, 3.5002); and the confidence bounds on the conditional parameters and on the original means, reported in Tables 3 and 2 respectively.
References

Cox, C. M., Fang, C. and Boudreau, R. (1979). Computer program for Krishnaiah's finite intersection tests for multiple comparisons of mean vectors. Unpublished manuscript.
Cox, C. M., Krishnaiah, P. R., Lee, J. C., Reising, J. and Schuurmann, F. J. (1980). A study on finite intersection tests for multiple comparisons of means. In: P. R. Krishnaiah, ed., Multivariate Analysis - V. North-Holland, Amsterdam.
David, H. A. (1952). Upper 5 and 1% points of the maximum F ratio. Biometrika 39, 422-424.
Dunnett, C. W. (1955). A multiple comparisons procedure for comparing several treatments with a control. J. Amer. Statist. Assoc. 50, 1096-1121.
Dunnett, C. W. and Sobel, M. (1956). A bivariate generalization of Student's t distribution, with tables for certain cases. Biometrika 41, 153-169.
Gnanadesikan, R. (1959). Equality of more than two variances and of more than two dispersion matrices against certain alternatives. Ann. Math. Statist. 30, 177-184; correction 31, 227-228 (1959).
Harter, H. L. (1960). Tables of range and studentized range. Ann. Math. Statist. 31, 1122-1147.
Hartley, H. O. (1950). The maximum F-ratio as a short-cut test for heterogeneity of variances. Biometrika 37, 308-312.
Kaskey, G., Krishnaiah, P. R. and Azzari, A. J. (1962). Cluster formation and diagnostic significance in psychiatric symptom evaluation. Fall Joint Comput. Confer. Proc., 285-302.
Krishnaiah, P. R. (1965a). On the simultaneous ANOVA and MANOVA tests. Ann. Inst. Statist. Math. 17, 35-53.
Krishnaiah, P. R. (1965b). Simultaneous tests for the equality of variances against certain alternatives. Austral. J. Statist. 7, 105-109; correction 10, 43 (1968).
Krishnaiah, P. R. (1968). Simultaneous tests for the equality of covariance matrices against certain alternatives. Ann. Math. Statist. 39, 1303-1309.
Krishnaiah, P. R. (1969). Simultaneous test procedures under general MANOVA models. In: P. R. Krishnaiah, ed., Multivariate Analysis - II, pp. 121-143. Academic Press, New York.
Krishnaiah, P. R. (1977). On generalized gamma type distributions and their applications in reliability. In: C. P. Tsokos and I. N. Shimi, eds., The Theory and Applications of Reliability, Vol. 1, pp. 475-494. Academic Press, New York.
Krishnaiah, P. R. (1978a). Further results on simultaneous tests for the equality of covariance matrices against certain alternatives. IMS Bulletin.
Krishnaiah, P. R. (1978b). Some developments on real multivariate distributions. In: P. R. Krishnaiah, ed., Developments in Statistics, Vol. 1, pp. 135-169. Academic Press, New York.
Krishnaiah, P. R. (1979). Some developments on simultaneous test procedures. In: P. R. Krishnaiah, ed., Developments in Statistics, Vol. 2, pp. 157-201.
Miller, R. G., Jr. (1966). Simultaneous Statistical Inference. McGraw-Hill, New York.
Mudholkar, G. S. and Subbaiah, P. (1975). A note on MANOVA multiple comparisons based upon step-down procedures. Sankhya, Ser. B, 37, 300-307.
Mudholkar, G. S. and Subbaiah, P. (1979). MANOVA multiple comparisons associated with finite intersection tests. In: P. R. Krishnaiah, ed., Multivariate Analysis - V. North-Holland, Amsterdam.
Ramachandran, K. V. (1956). On the simultaneous analysis of variance test. Ann. Math. Statist. 27, 521-528.
Rao, C. R. (1964). The use and interpretation of principal component analysis in applied research. Sankhya, Ser. B, 26, 329-358.
Roy, J. (1958). Step-down procedure in multivariate analysis. Ann. Math. Statist. 29, 1177-1187.
Roy, S. N. (1957). Some Aspects of Multivariate Analysis. Wiley, New York.
Roy, S. N. and Bose, R. C. (1953). Simultaneous confidence interval estimation. Ann. Math. Statist. 24, 513-536.
Scheffé, H. (1953). A method for judging all contrasts in the analysis of variance. Biometrika 40, 87-104.
Scheffé, H. (1959). The Analysis of Variance. Wiley, New York.
Shaffer, J. P. (1977). Multiple comparisons emphasizing selected contrasts: an extension and generalization of Dunnett's procedure. Biometrics 33, 293-303.
Stoline, M. (1978). Tables of the studentized augmented range and applications to problems of multiple comparisons. J. Amer. Statist. Assoc. 37, 656-660.
Tukey, J. W. (1953). The Problem of Multiple Comparisons. Unpublished manuscript.
von Neumann, J. (1937). Some matrix inequalities and metrization of matric spaces. Tomsk. Univ. Rev. 1, 286-300.

P. R. Krishnaiah, ed., Handbook of Statistics, Vol. 1
North-Holland Publishing Company (1980) 673-702

22

Nonparametric Simultaneous Inference for some MANOVA Models

Pranab Kumar Sen

1. Introduction

In a (multivariate) analysis of variance (MANOVA) model, one confronts a set of (estimable) parameters and desires to draw valid statistical inferences on them. The two basic problems in this context are to make suitable tests of significance on certain parametric functions and to provide confidence regions for the same. Traditionally, a statistician is inclined towards framing an overall null hypothesis and testing it against suitable alternatives in such a way that the type I error probability (level of significance) of the test is equal to some specified α (0 < α < 1) and the power of the test is as high as possible. The test statistic used in this context is often usable to provide a confidence region for some parameters with the basic property that the coverage probability of the confidence region is equal to some preassigned number 1 − α (0 < α < 1). [Usually, α is taken as 0.01, 0.05, or 0.10.] Quite often, rejection of an overall hypothesis leads one to examine various component hypotheses, and detailed testing for these components involves multiple testing on the same set of data, which may thereby increase the overall level of significance of these simultaneous tests. Indeed, no matter how small the individual levels of significance of the component tests are, a large number of components may push up the overall level of significance considerably, and some care must be taken to guarantee that the overall level of significance does not exceed some preassigned number α (0 < α < 1). Similarly, a simultaneous confidence region should provide confidence intervals for various parameters such that the overall coverage probability is at least equal to some preassigned number 1 − α (0 < α < 1). In this framework, a variety of simultaneous inference procedures is available in the literature. These procedures can broadly be categorized as parametric and nonparametric ones. In the parametric case, the underlying distributions of the random variables associated with the MANOVA models are assumed to be of known functional forms (mostly, normal or multinormal distributions) involving unknown parameters, and with reference to these underlying distributions, one needs to choose an inference procedure having some optimality properties. Indeed, in many cases, such optimal procedures do exist. Unfortunately, such an optimal parametric procedure may not only lose its optimality but also its validity if the assumed forms of the underlying distributions do not agree with the actual ones or some other assumption is not met in reality. In other words, parametric procedures may not be very robust against departures from the basic assumptions underlying the MANOVA model. On the other hand, nonparametric procedures make relatively less restrictive assumptions, and thereby remain valid (and robust) for a broader class of underlying distributions. We shall be concerned here with nonparametric methods only. To fix ideas, we start with the univariate analysis of variance models and then pass on to the multivariate problems. Section 2 is devoted to the study relating to one-criterion ANOVA models. Parallel results for blocked experimental setups are considered in Section 3. Simultaneous procedures relating to the MANOVA models are considered in Section 4. Section 5 deals with the multivariate analysis of covariance (MANOCOVA) models. The concluding section is devoted to sequential procedures as well as to some general observations.

2. Simultaneous inference for the one-way ANOVA models

Consider c (≥ 2) independent samples, the ith sample comprising n_i independent random variables X_{i1}, …, X_{in_i} with a common distribution function (df) F_i(x), −∞ < x < ∞, for i = 1, …, c. In a one-way ANOVA model, we assume that

F_i(x) = F(x − θ_i),   i = 1, …, c   (−∞ < x < ∞),   (2.1)

where θ = (θ₁, …, θ_c)' is a (parameter) vector of real, unknown quantities. The overall hypothesis to be tested is

H₀: θ = 0   against   H: θ ≠ 0.   (2.2)

In a simultaneous inference problem, we desire to test more detailed hypotheses concerning two or more elements of θ and also to draw simultaneous confidence regions for such components of θ. In this context, we define a contrast by

ψ = l'θ,   where l = (l₁, …, l_c)' ≠ 0 and l'1 = 0.   (2.3)


Specifically, we are interested in making simultaneous tests for a (finite or infinite) set of contrasts and also in providing a simultaneous confidence region for them.

2.1. Nonparametric T-method

By analogy with the parametric case, for the nonparametric Tukey-type procedures we impose the basic condition that n₁ = ⋯ = n_c = n. Let then

X_i = (X_{i1}, …, X_{in})',   i = 1, …, c,   (2.4)

T_{ii'} = T(X_i, X_{i'}),   i ≠ i' = 1, …, c,   (2.5)

where T(X_i, X_{i'}) is a two-sample rank order statistic (based on the (i, i')th samples), which we define below. Let a_N(1), …, a_N(N) be suitable scores, defined for every N (≥ 1),

ā_N = N⁻¹ Σ_{i=1}^N a_N(i),   A_N² = N⁻¹ Σ_{i=1}^N a_N²(i) − ā_N²,   (2.6)

and let R_{i1}^{(ii')}, …, R_{in}^{(ii')} be the ranks of X_{i1}, …, X_{in} among X_{i1}, …, X_{in}, X_{i'1}, …, X_{i'n}. Then, we take

T_{ii'} = n⁻¹ Σ_{j=1}^n [a_{2n}(R_{ij}^{(ii')}) − ā_{2n}].   (2.7)

We shall find it convenient to choose

a_N(i) = φ(i/(N + 1)),   i = 1, …, N;   N ≥ 1,   (2.8)

where φ is some appropriate function. For example, if we let φ(u) = 1 or 0 according as u is ≤ or > ½, (2.7) reduces to the Brown-Mood type median statistic; for φ(u) = u, 0 < u < 1, it is the Wilcoxon rank sum statistic; for φ(u) = −1 − log(1 − u) it is the exponential scores (Savage) statistic; and for φ(u) = Φ⁻¹(u), the inverse of the standard normal df, (2.7) is the normal scores statistic. Slight modifications in the definitions of the scores in (2.8) are often made: see Hájek and Šidák (1967) or Puri and Sen (1971). Let then

W_n = max_{1 ≤ i < i' ≤ c} {2n^{1/2} A_{2n}⁻¹ |T_{ii'}|}.   (2.9)
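A sketch of the statistic (2.7) under the scores (2.8) for the common choices of φ (Python; scipy.stats.rankdata assigns midranks to ties, one of the standard modifications alluded to above):

    import numpy as np
    from scipy.stats import rankdata, norm

    def two_sample_rank_stat(x, y, phi):
        # T(X_i, X_i') of (2.7) with scores a_N(r) = phi(r / (N + 1)), N = 2n
        n, N = len(x), len(x) + len(y)
        a = phi(rankdata(np.concatenate([x, y])) / (N + 1.0))
        a_bar = phi(np.arange(1, N + 1) / (N + 1.0)).mean()   # \bar a_{2n}
        return (a[:n] - a_bar).sum() / n

    wilcoxon = lambda u: u                    # Wilcoxon rank sum scores
    normal_scores = norm.ppf                  # inverse standard normal df
    savage = lambda u: -1.0 - np.log1p(-u)    # exponential (Savage) scores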

(2.9)

Let %,~ and qc,, be respectively the u p p e r 100a% point of the distribution of W n (when H 0 in (2.2) holds) a n d the range of a sample of size c from the

676

Pranab Kumar Sen

standard normal distribution. For small values of n (and c), o%~ can be obtained by direct enumeration of the permutation distribution of the combined sample observations--this becomes prohibitively laborious for large n (or c). However, o~.,. can be approximated by G,~ [see Sen (1966)]. Now compute T/i, for all 1-<<i<i' <<c [note that T / i , = - Ti, i]. Referred to (2.1), regard those 0~- 0i, = A,, to be significantly different from 0 for which [T/i'l >/~n, aAzn/2X/ n" If we assume that aN(l ) < " <<a~v(N) for every N/> 1 and denote by T/i,(b)= T(X,.+ bl,X/,), - ~ <b < co, then Tic(b ) is also non-decreasing in b. Thus, if we let

~.',L = i n f ( b : T/r(b) > -w.,.AeJ2k/n}~


~.'u = sup(b: then we have

Tic(b ) < o%~A2./2~/ n },

1 <-Ki=/:i ' <-Kc,

(2.10) (2.11)

P(~ii',L

~ " l"t } Aii,<Aii,,U, V l,

l--a,

(2.12)

which provides a simultaneous confidence region for all (Air }, Let us now extend these simultaneous tests and confidence regions to the case of general q~'s, defined by (2.3). Let A . , , = i n f ( b : T/,.(b)>0}, /~.,,2=sup{b: Zii,(b)<O}; (2.13) (2.14) (2.15) (2.16)

~ii'= ~(Ai/',l+Aii',2),
~i "=C-1 ~ ~ii"
i'=1 1

1 <<,i~i' <c;
<i <c (where ~ii-~-Aii=O, U i);

Zij=~i.--~j
Finally, let H.~= max ' 1<i~i'*;c

for i,j= 1,...,c.

{supl-lZ.,--Aii, l:~icL<KA.,K~ii,,v]}. L

(2.17)

Then, it follows from Sen (1966) that

I
i=1

1
i=1 i=1

<

i=l

l/X,+~n.,,~

i=l

IZil, Vl~=O,l'l=O >~1-~

(2.18)

Nonparametrie simultaneous' inference for some M A N O VA models

677

which provides a simultaneous confidence region for (g,} in (2.3). Also, a simultaneous test [based on (2.18)] relates to rejecting those contrasts (i.e, accepting ~b~ 0) for which the upper and the lower boundaries in (2.18) are either both positive or both negative. So far, we have assumed that n~ . . . . . nc = n. There is no operational difficulty if the sample sizes are not equal. However, then %,~ will depend on n~..... n .and may not converge to q~,~. The modifications, suggested for the parametric case, by Spjotvoll and Stoline (1973) and Hochberg (1974) also work out well for this nonparametric case.

2.2. Nonparametric S-method


Let N=-n l + ' . . + n C and let R,y be the rank of sample of size N for 1 K j ~<n~, 1 ~<i ~<c,
ni

Xo. in

the combined

7;=ng-l Z aw(Ro.)-d N, l<~i<~c,


j~l

(2.19)

~N=[(N-1)/NA2w] ~ niTi2.
i=l

(2.20)

Let f f ~ 2 , and Xc-1,a be respectively the upper 100a% point of the distribution of E~v (under H 0 in (2.2) and the chi-square df with c - 1 degrees of freedom (DF). For small n I..... nc, values of E~v,~ are tabulated (for various special scores) in various statistical tables, while for large values of nl .... ,no, ~v,~ can be approximated by X~2_l,~. Then, by (2.20) and some little algebra, we have

~/2 P{ITg - T/,[ ~< ~N,,~AN(N/(N-V 1 <~i~i'<clHo} Oi~Or] for


~> 1 - a

1))'/2[ni-'+ni-'] '/2,
(2.21)

so that a simultaneous test consists in rejecting those which i T / - Ti,] exceeds

(i,i')

[i.e, accepting

A discouraging aspect of this procedure is that T, and T,., both depend on the remaining ( c - 2) sample observations, and hence, the values of 0j for these populations may affect the distribution of T~- T~, even when 0, = 0i,

678

Pranab Kumar Sen

but the other 0j are not all equal to 0. Also, for general contrasts [~:= l li ~]i i= ill = 0], the picture becomes more complicated. Unlike the parawith YY metric case, if we consider a subset of samples and compute the corresponding EN' (where N ' = n j + . . . +njq, q<c), then EN' need not be smaller than Ely, and hence, the full generality of the S-method is not achieved here. Some modifications will be considered in the general multivariate case in Section 4.

2.3.

Treatments vs. control procedures

With reference to the model (2.1), we conceive of an additional sample X 0 = ( X m ..... X0no) from the distribution Fo(x)=F(x ), so that 0 stands for the vector of the treatment effects of the c samples. In this case, we desire to test hypotheses on q~=i'O or to estimate tp, without imposing the restriction that ! ' 1 = 0 . As in (2.7), we define T0i (based on Xo, X i and the sample sizes no, ni, so that 2n is replaced by no+ n/) for i = 1..... c. Let then T+=
l<i,~c

max Toi

and

T * = m a x ]T0il.
l<i<c

(2.22)

We denote by T, + and T*, the upper 100a% point of the df's of T + and T* when H 0 in (2.2) holds. Then a simultaneous test consists in rejecting those i (i.e., accepting 0i to be different from 0) for which T0i is > T~+ (one-sided case) or [Toi I > T* (two-sided case). Also, as in (2.10), (2.11) and (2.12), we can provide a simultaneous confidence region for 0. For small n o..... nc, the exact values of T ~ , T* can be obtained by enumerating the permutation distribution of ranks of all the n 0 + . . - + n~ observations. The task becomes prohibitively laborious for large sample sizes. However, if the ni are equal, i.e. n 1. . . . .

n~=an o for some O < a < oc,

(2.23)

then, writing n = no+ n l,

[ no(a + 1 ) / a ]'/2A; IT+-~rn+c,

,._._~ , [ n o ( a + l ) / a ] 1 / 2 A. --1 T2 m~,,~

(2.24) where rn~ c and m*,c are respectively the upper 100a% point of the distributions of the m a x i m u m a n d m a x i m u m absolute values of ('r 1. . . . . %) following a normal distribution with m e a n O, all variances equal to 1 and a c o m m o n covariance a / ( a + 1); G u p t a (1963) has some tables pertaining to these. A procedure analogous to the S-method is also given in Sen (1966).

Nonparametric simultaneous inference for some MANOVA models

679

3.

Simultaneous inference in two-way ANOVA models

We conceive of n blocks, each block containing c (>t 2) plots where c different treatments are applied. The response of t h e j th plot in the ith block is denoted by X~/and we write

Xij=lz+t~i+Oj+e~,

l<j<c,

l<i<n,

(3.1)

where/~ is the mean effect, /3i are the block effects, 0/ are the treatment effects and the eU are the residuals, assumed to be independent and identically distributed with a continuous df F, defined on ( - ~ , oe). As in (2.2), we like to test H0:0=0 vs. H: 0:/:0. (3.2)

In a simultaneous inference problem, our interest centers around {tp}, defined by (2.3).

3.1. Nonparametric T-method


To eliminate the nuisance parameters, consider the aligned observations

Yij=Xij-c -1 ~ Xi,, j = l .... ,c, i = 1 ..... n;


l=1

(3.3) (3.4)

Yj = ( Ylj ..... Y,,j), 1 <j < c,

and define T~,= T(Yj, Yj.,), 1 <j,j' <c as in (2.5)-(2.7). Also, define A~, as in (2.6)i Let then

Wn= l<j~j'<cmax {2nl/ZA~l(1-c-')l/Z[Tii,[}.

(3.5)

Let %,, be the upper 100a% point of the distribution of Wn under H o in (3.2); this can be obtained by enumeration (of (c!) n) of rank permutations when n is small. For large n, it follows from Sen (1969) that %,~ < qc,,. (3.6)

As such, we can proceed as in (2.9) through (2.18) with the remark that by virtue of (3.6), if we replace ~on, ~ by qc,~, then for large n, in (2.12) and (2.18), we have the coverage probability /> l - a , while for the simultaneous tests, the level of significance is < a.

680

Pranab Kumar Sen

3.2. Nonparametric S-method


Both intra-block ranking and inter-block ranking procedures are available. Consider first the Friedman type statistics:

Sj=n -1 ~ J(Nij),
i=l

1 <-<j<c,

(3.7)

where J(l)~< .-. <J(c) are given numbers [ J ( i ) ~ i relates to the Friedman test, while J(i)-~ 1 or 0 according as i is <~ or > ( e + 1)/2 relates to the Brown-Mood test], A2(J) = (c -- 1)- Is'c~i=ltrJtDt J - f]2 and J = c-IX~= 1J(i). Let then
$ =n y -- ( S j - J ) j=l c 2

/ A 2 (J),

(3.8)

and let g~ be the upper lOOa% point of the distribution of S when H o in (3.2) holds; for small values of (n,c), these are tabulated in Statistical Tables for some typical J, while for large n, $~ can be approximated by 2 Xc-l,~. Then, parallel to (2.21), we have

e{Isj- s~l < SL/ZA(J)(2/n) '/2, V 1 <.j <k <cln0} >/1 -- a, (3.9)
so that a simultaneous test for the AjK (-----0j- Ok) consists in rejecting those Ajk (i.e., accepting Ajk 4:0) for which [Sj-Ski > SI,/2A(J)(2/n) 1/2. For the inter-block ranking case, we use the aligned observations Y/ in (3.3)-(3.4) and define the Tj as in (2.19) (based on the Yj.) with N = nc and R0.=rank of Y/j among Yll ..... Ya....... Ynl..... Ync, i.e.

Tj=n -1 ~_, aN(Ro)--dN,


i=1

f o r j = l ..... c.

(3.10)

Also, let

,,(c~- l)
i=l j=l

a~(R,j)

1
C j=l

(3.11)

~N =n

E
j=l

f~.~.

(3.12)

Nonparametric simultaneous inference for some M A N O VA models

681

Then, parallel to (3.9), we have

P(]Tj-

Zk[ ~ ~-~N,a~n\~ t'] .,](l/2[')/n'll/2, V 1 <j<k<<clHo} i> l - a ,

(3.13)

where, it follows from Sen (1968) that E;v ~ can be approximated by ~ 2 The rest of the simultaneous testing procedure is similar to the intra-btock ranking case. Both these procedures have the same shortcoming as of the S-method in Section 2.2.

3.3. Treatments vs. control procedures


As in Section 2.3, we conceive here n blocks of (c + 1) plots each, where a control and c treatments are applied. Thus, in (3.1), we a l l o w j to run between 0 and c and frame the null hypothesis as 0vs0 (where 00=0 ). In the simultaneous inference, we are interested in the set of alternatives {tp=l'O--/=O} where 1,#0 (but !'1 need not be equal to 0). Define the Toj= T(Y o, Yj), 1 <j<c as in (2.5)-(2.7)[but based on the Yj, 0 < j < c ] and let T += max Toj and 1< j ( c T * = max 1 <j<c

ITd.

(3.14)

Then, we may proceed as in Section 2.3 and provide simultaneous tests and confidence regions for all 0j, 1 < j <c. Since, here all the sample sizes are equal, (2.24) will reduce to

[2n/(1-1))]i/2A~nlT+---~m+ c and

[2n/(1-O)]
(3.15)

where m + and m* are respectively the upper 100a% point of the distributions of the maximum and maximum absolute values of (~h ..... %) following a multinormal df with mean vector 0 and cov(~-i,~j)= 1 or I according as i=j or not. Further, it follows from Sen (1968) that the constant ~ is /> - 1 / ( c - 1), so that 1/(1 -O)>~ ( c - I)/c, and hence, from (3.15), we have large n,

[2n(1--c-1)]l/ZA~ 1T+ <<-m+~,c and

[2n(1-c-')]l/2A-1T*2,.~ <m~,c*
(3.16)

and using tables by Krishnaiah and Armitage (1965) and (3.16), simultaneous tests and confidence regions can be provided. Alternatively, ~ can also be estimated [viz., Puri and Sen (1968)] from the sample and the estimator be substituted in (3.15) to get the approximate values of

682

Pranab Kumar Sen

(T~+, 1~). It may also be remarked that instead of the so called linear rank statistics Toj, 1 <j <<c, one m a y also use the signed rank statistics
t/

Lj=n -1 ~_, sgn(Y~)au(Rif ),


i=1

I <j<c

(3.17)

where Y~= Yio- Yo, 1 < j < c , 1 < i < n and R + is the rank of IY~I a m o n g I Y~-I ..... [ Y.}i. The procedure remains the same; for some special cases, see Hollander (1966).

4.

Simultaneous inference for the M A N O V A models

As in the A N O V A models, we shall deal here separately with the different problems arising in M A N O V A models.

4.1.

Paired-sample problems

Let (Y/,Z/), i = 1..... n be n independent 2p-vectors (for s o m e p / > 1) and in a variety of problems (e.g., Y = b e f o r e treatment response, Z - - a f t e r treatment response), we are interested in the set of differences X~ = Y~- Zi, i = 1..... n. We assume that X 1..... X, are independent p-vectors with a c o m m o n df F0(x ), defined on the p-dimensional Euclidean space, where 0'= (01 ..... Op) is the location vector. It is assumed that

Fo(x ) =--F(x - 0),

(4.1)

where F is diagonally symmetric about 0. Our goal is to develop some simultaneous tests for Ho: 0 = 0 (vs. H : 04=0) and also to provide simultaneous confidence regions for 0. The theory is adapted from Sen and Purl (1967) and Puri and Sen (1968). Let S = (S 1..... Sp)' where

Sj = !

n i=1

~ (sgnX~i)a~j(Ro+),

1 <.j<p,

(4.2)

and X/=(Xil ..... Xip), 1 <i<~n, Ri~ = r a n k of IX~jl among ISvl ..... the scores a~j(i), i= 1..... n are so chosen that

IX.:l

and

a*j(i)=an,j((n+i+ 1)//2),

1 <<.i<n,

1 <~j<p,

(4.3)

where for each j ( = 1..... p), a n , j ( / ) = - an,j(n- i+ 1), l < i < [(n + 1)/2], so

Nonparametric simultaneous inference for some M A N O V A models

683

that tT.,j= n - 1]~7=l a,,,j(i)= O. Also, let Vn = ((v.jk)) be defined by

Vnjk -- n - ' ~ ai~,j(Rij* + )a,,k(Rik*


i=1

+) sgnXo.sgnXi~,"

(4.4)

for j, k = 1..... p. Finally, let


~n = I'lS: V n - S ,

(4.5)

where V,- stands for a generalized inverse of V,. Special cases of Sj in (4.2) are the Sign Statistics (where a*,j.(/)= 1 for all i,j), Wilcoxon signed-rank statistics (where an,:(i ) = ( i - (n + 1)/2)/(n + 1) V i,j) and the normal scores statistics (where a*d(i ) is the expected value of the ith smallest observation of a sample of size n drawn from the chi distribution with 1 DF). Note that by definition in (4.5),

~.>

1 ,~j,~p

max {nSj2/v.~),

(4.6)

and further [see Sen and Puri (1967)], . has asymptotically (under H0:

0--O) chi-square distribution with p DF, so that

e{,, I/2

1//2

v 1

<<.j<p[Ho} >1 l - a ,

(4.7)

when n is not small. [For small n, permutation distribution of ~, is available with Sen and Purl (1967)]. A simultaneous test for 0 = 0 consists in rejecting those components (i.e., 0 j = 0 ) for which [Sjl >/ Xp, avln/2/ ~/ rt, 1 < j < p. Suppose now that in (4.2), we replace the X o. by X 0 - b , for 1 ~<i < n, denote the corresponding ranks (of the absolute values) by Ro+(b) and the resulting statistic by Sj(b), for j = 1..... b. Then, when anj(1 ) < "-" <a,d(n ), we have Sj(b) is non-increasing in b ( - m <b < m), 1 < j <p. (4.8) Thus, as in (2.10)-(2.11), if we let 1/2/ ~ / n } , ~ , L = i n f { b : Sj.(b) <-< Xp,~%j: 1/2 4 , i = s u p ( b : Sj(b)/> --Xp,aVnO. / ~ / n } , then from (4.7), we obtain that far large n, (4.9)

P(4,L <0j <4,o, Vl <j-<pl0} >: l-a,


which provides the desired simultaneous confidence region for 0.

(4.10)

684

Pranab Kumar Sen

The large sample feature of (4.7) can be eliminated by the use of the

Bonferroni inequality. Note that under 0 = 0, marginally each Sj is a


distribution-free statistic with a distribution symmetric about the origin. Let a* = alp and choose Sff ), such that

P{ISjl ~s~10j=0} > 1-~*,

v 1<j<p.

(4.11)

Then, P{lSjl~<s~), v l<j<p[Ho}>~ l - a , and for the computation of S(j), we may refer to standard Statistical Tables. Thus, we may proceed as in (4.7) through (4.10), replacing everywhere Xp,,~vl/2/~/n by S ~ ), l~<j <p. This will be a strictly distribution-free procedure. Both the procedures described above have one common feature: they provide simultaneous inference for all the coordinates of O, but not on all possible {lO, l.~O}. This drawback is primarily due to the lack of invariance of ranking (in the multivariate case) under transformations: X* = BX, when B is not diagonal. However, a large sample solution can be provided, which meets this requirement. As in (2.13)-(2.14), we equateSj(b) to 0 and denote the estimator by 0., 1 < j < ^p " Also, let ^Si..~--Xi.--O, for 1 <i<n, ^ J Y Y^ l < j < p and let R ~ = r a n k of Ixo I among Ixul ..... [x.+l, for l < i < n ; 1 < j <p. In (4.4), we replace X 0 and R + by )~/j a n d / ~ + , respectively, and denote the resulting quantities by ~3,jk;j,k = 1.... ,p. Let then 1/2 Bj=2Xp,,vn~y/(n'/2(~,,-~,L)),
t=(('Yjk))ffi((~/

l <~j<p;

(4.12) (4.13)

BjBk))j,k=l ..... p.

Then [cf. Puri and Sen (1968)], it follows that for large n, @ x~, 2 n(6- 0 ) ' t - ( 0 - 0) ~ where 0 = ( 0 p . . ., 0p). ^ ' Hence, as in the Scheffe-method, for large n, (4.14)

e {It'(0(l'Fl)l/2xp, a.

0)1 < (rtt)'/2x,,o, v l ~ o } ~. 1 - ~,

(4.15)

and (4.15) provides a simultaneous confidence region for all (l'0}. Also, it provides a simultaneous test: reject those {1'0=0} for which [!'0[>

4.2. One-way layout MANOVA models


As in Section 2, we conceive here of independent random vectors Xtl ..... Xi,~ having a common p-variate df F i, for i = 1..... e (1> 2), where
Ft.(x)=F(x-Oi) ,

i = 1 ..... c;

Fcontinuous.

(4.16)

N o n p a r a m e t r i c s i m u l t a n e o u s inference f o r s o m e M A N O V A

models

685

The null hypothesis to be tested is Ho: O~. . . . . alternatives that at least one O_/is non-null. We denote Xo.=(X~I) . . . . . X~e)) ' for l < j < n i , X}*)=(X/(i0 . . . . . X/(~)) and

O~ = 0 against the set of I < i < c . Consider the sets (4.17)

X}:)=(X~{ )..... X(i:2,)

and as in (2.7), define the corresponding two-sample statistic by T/(/0 for s = 1..... p,

1-~,l-7/=t <c.

(4.18)

Note that here the c o m b i n e d sample is of size hi+ n i, and the score function an:(i ) m a y d e p e n d on s ( = b. .p). Further, we let

t?ss"ii" = h i + h i '
n i,

j=l

~, a~+,,.,,(R O. )a,~+,~.,(Rij )

n~

(~)

(s')

+ s=l y'

a ,~+,,,s~ [R(~)~a (R ()~1 o : ,~+,,,,,'~ i~ : ]

- ff~,+,~,(s) a~ +,~,(s')

(4.19)

for s,s'= 1..... p, where R,:/(: ) is the rank of X,:}(: ) and Ri~') is the rank of X,~ ) in this combined sample of size n i + n~, (on the S th character). Also, let V,c = ((v,.~c)) for 1 < i < i ' < c. First, we consider the case of equal sample sizes i.e., n~ . . . . . n= n. N o t e that in this case, v, .=,,(n) depends only o n n , for l < s < p , (4.20)

where v}, ~) are non-stochastic quantities. (a) Procedures based on the Bonferroni inequality As in (2.9), we let W~S)= max
l<i<i,<c

(2n'/2[T(i[,)l/[v~')]l/2 I,
~ ~ :

l<.s<<p,

(4.21) (4.22)

W * = m a x { W . ( ' ) : 1 <s<.p} Then, as in Section 2.1,

and

a*=a/p.

P { W~~) < %.~., V 1 < s <P]H0} > 1 - a,

(4.23)

where %,~., as defined after (2.9), can be approximated by qc,~* (also defined there). Thus, if we let 0j=(0j (1)..... 0j(P)), 1 <j<<c, then a simulta-

686

Pranab K u m a r Sen

neous test consists in rejecting those components (i.e., accepting 0~(~)5a0~9 )) 1 ,.~ h~ ~ ( n )/,,J / . l l / 2 Moreover, replacing c~ by c~* in for which T(S) -~i .... ~-~ ,,,.t (2.10)-(2.11) and repeating the steps (2.10) through (2.16) for each character s and denoting the resulting quantity in (2.17) (for the s th character) by H~(;'~., s = 1..... p, we have parallel to (2.18)

,{

n,ot*
i=1 i=1 i=1

(4.24)

<
i=l

z,~!~)+~-..~.
i=1

IZ,l, V l l , s = l

..... p

>1-~,

which provide a simultaneous confidence region for all possible {Z~=~liOi:/I}; a dual simultaneous test consists in rejecting those cone 1 liOi=/=0) for which IX~.=1li/~}. l r_/(s) N'c trasts (i.e., accepting ~"i= ~)1t> ~ n , et.~..,i= ll/i[. (b) Roy-Bose type procedures Define
~ii,= 2rl-1) Tz~,ViiT1Tii ,, l -.~.t < l
~<, (4.25)

where T/i,=(T~ 1), ,,, ,

, T.(.Ph ' . , , Let then max


~ii'.

~*=

1 < i < i ' ,~c

(4.26)

Note that the f~ii' are not all independent, so that E * is really the maximum of a set of correlated quadratic forms. For small values of n, one may adopt the Chatterjee-Sen rank permutation principle as in Puri and Sen (1971, Ch. 5) and venture to enumerate the (null hypothesis) permutation distribution of ~*. The procedure becomes prohibitively laborious as n C becomes large. However, for large n, { Eii,, 1 < i < i' < c } have a ( 2 ,)-variate correlated chi-square distribution, and hence, one may use the recent results of Khatri, Krishnaiah and Sen (1977) to evaluate E*, the upper 100a% point of the null distribution of E*. Some nice bounds for the same may also be obtained as in Siotani (1959, 1960). Once E* is obtained, we may proceed as in the Bonferroni case, where we need to replace con,~, (or q,,.) by E*. Note that E*>~maxl<s<pmaXx<i<i,<c[(2n-1)(T/~9}Z/v}~)], and hence, (4.23) and (4.24) can readily be modified. For typical values of a and c, E* is usually smaller than c0~,~. (or q 2 . ) , a * = a/c, so that, at least for large sample sizes, the width of the confidence regions will be smaller for this procedure than in the earlier one. Similarly, (b) should yield a more powerful test than in (a).

Nonparametric simultaneous inference for some M A N O VA models

687

Both (a) and (b) suffer from a common drawback that if we have in mind more general contrasts of the form

~liBOi,
i=l

I_L1,

B non-singular,

(4.27)

then the simultaneous tests or confidence regions apply only to the case where B is diagonal. However, a large sample solution for general {B} may be obtained as follows. (c) Schefj~-lype procedures Suppose that in (4.17) and (4.18), we work with X~(~)+ b l and X,.! "~ and denote the corresponding statistic as T~},~)(b) for s = l ..... p, 1 <<.i<i' <<.c, - c e < b < oc. If the scores a,,s(i ) are non-decreasing in i (1 < i <n), then as in Section 2.3, T~}:)(b) is non-decreasing in b ( - ~ < b < m), and hence, as in (2.13) and (2.14), equating T,.~,')(b) to 0, we obtain the estimator ~}~,) of A~])= Oi(~)- Oi!~), for i v~i ' = 1..... c and s = 1.... ,p. Also, we proceed as in , I: (n)/ ~I/2 (2.10)-(2.11) and, equating T,.5:)(b) to z%,~.~tv L /n) , obtain the confidence limits /~}:!L and /~(0.,v for 1 < s < p and 1 < i < i ' < c . Let then 2 (4.28)

B s = e ( c - - l)

'~ Oan,a.[l)~n)]l/2/{.1/2[~(s),, l <i <i" <c

~'tii', U-~(s)ii',L]J~l

when n I . . . . . no-- n. Otherwise, we replace v,(~ ') by vss,ic and in (4.28) as well as in the construction of ~}]),L and -ic, ~'~ v we replace %,~. by ~, the upper 50a% point of the standard normal df. Also, let

l ~i<i" ~c
=

..... .;

(4.30)

Then [cf. Puri and Sen (1971)], it follows that for large N, (4.31)

~/ (Ai. -- 0/(s))(/~ :v') -- 0/(s')) -"> d ( c - 1 ) , i=l s~l s'=l

where ((~ s,')) = ~,- ~ and /~!:>=c-' ~ ~!:;)(/~!/>=0),


i'= 1

forl<<s<p,

l<i<c.

(4.32)

688

Pranab Kumar Sen

F r o m (4.31) and (4.32), we have for large n l . . . . . n~,

sup sup

iJ_l b=/=O i~l

lib(Ai.-Oi) / ~ li2/ni
i=l
=
i=l

[b'Fb] '/z

=
1),.,

ni E

.'~'~'(h(s),-i. --

s=l s'=l

0}~))(/~!-s')--0}#)) ~<Xp(c 2

(4.33) with probability 1 - - a . Hence, a v a n a t e as well as sample wise simultaneous confidence region is

libt(~i
i=1

< Xp(c-l),a

E -/i2 i=1 ni

1/2 {b'i~b}

1/2,

V tl,

l,~O,
(4.34)

and the same can be used to test simultaneously for a n y n u m b e r of contrasts of the f o r m
P

~= "~ ~', libsO} ~


i=1 s = l

where

~. li=O
i=1

and

b~O,

rejecting those ~ (as different f r o m 0) for which

I~l = ~ lib"/~t
i~l

> Xp(c-1),a

~ i=l

li2//ni

)1j2 btrb)l/2o

(d) Treatment vs. control procedures As in Section 2.3, we m a y also consider multivariate procedures, where we introduce a control sample X01 . . . . . X0n with a df Fo(x)=-F(x), so that 0~,...,0 c stand for the vectors of the treatment effects. In this context, we desire to test the hypothesis that 01 . . . . . 0 c = 0 vs. at least one of 0~. . . . . 0 c being non-null, and, in general, for some

~= ~. ~. lib,O(iS~4=O where
i=1 s ~ l

i=/=0,

b=/=O.

(4.35)

As in (4.17) a n d (4.18), we c o m p u t e T0i, i = 1 . . . . . c (4.36)

Nonparametric simultaneous inference for some M A N O VA models

689

and define the ~0i, i = 1..... c by


~ i ~- [/'/0(/70 + ni -- 1 ) / n i l Toi VoTToD

] < i <c.

(4.37)

Also, let E*=max{~;: 1<i<c}. (4.38)

As in the Roy and Bose type procedures, f~* is the maximum of a set of correlated quadratic forms, where the results of Khatri, Krishnaiah and Sen (1977) may be applied to approximate the value of E*, the upper 100~% point of the distribution of E * (under H0). Alternately, we may also use the Bonferroni inequality and use the statistics in (2.22) for each s ( = 1..... p); this will result in a simpler solution. But, both of these will workout for diagonal B only For general B, we need to use the Scheff6type procedures, as has been outlined in (c). We need to replace in (4.33), X);(c-l),~2 by Xpc,,2 and also ~i. by {Ji' defined before (4.28).

4.3.

Two-factor MANOVA models

As in Section 3, we conceive here n blocks of c (>/2) plots each where c different treatments are applied. The response of t h e j tn plot in the ith block is (a p-vector) denoted by X~j and we have the model (3.1) where/~, ~i' Oj and e0 are all p-vectors and the distribution of %. is ~. We want to test H0:01 . . . . . Oc=0

vs. at least one Oj ~:O,

l<j<c.

(4.39)

Like the one-way layout problem, we consider here the following. (a) Bonferroni procedures Denote --y X/. = (Xig) y , "'" , X )p)) for l < i < n, 1 < j -<<c and as in (3.3), (3.4), let y/}s)= X,.}S)_c-l~t=xXi} s), l<<.j<<.c, 1 < i < n ; Yj(s)=(YI(})..... Y(S)) for 1 <<.j<c, 1 <<.s<<.p. Define then Tj~)=T(Yj(s),Yj(,~)),

l <j=/=j' <c,

l <~s<~p,

(4.40)

as in (2.5)-(2.7). Let then W~ (s), 1 <s < p and IV* be defined as in (4.21), (4.22). Then by (3.6) and (4.23),
P { m ~ n <G3n,ct.lHo} ~ l - o L ol*=ol/p.

(441)

Thus, virtually, we may repeat the steps in (a) following (423). (b) Roy-Bose procedures With the definition of the T,.~, ~) in (4.40) and V,, as in (4.19) (but based on the Yj(~)), let V = ( 2 ) -~
C 1

~
1<i<i'<c

~j,.

(4.42)

690

Pranab Kumar Sen

Let then ~ f = (2n - 1) 7~.,V - T~,, l < j < j ' < c ; ~* = max(~j.: 1 < . j < j ' < c ) . (4.43) (4.44)

Then, we may proceed as in (b) of Section 4.2. (c) Scheff6-type procedures. We proceed as in (c) of Section 4.2. Here, we need to replace X,.(s) by Yff) = (~(~) ..... y9), for 1 < j < c, 1 -K< s < p, define T~(.:)(b) as in there, and then virtually repeat the subsequent steps. The only difference here ties in (4.31). Actually, here the left hand side will be dominated by a random variable which has asymptotically chi-square distribution w i t h p ( e - 1) DF. As such, in (4.33), the probability is /> 1 - a, so that the rest remains the same. (d) Treatments vs. control procedures. We proceed as in Section 4.2(d) and, based on the Yj, O<<j<<.e, define the T0j, l<<.j<<.c. Here (4.37) simplifies further as n o . . . . . n = n, while we may replace V0; by V, defined by (4.42). The rest of the procedures remain the same as in there.
4.4. G a b r i e l - S e n procedures

In Section 2.2 dealing with the univariate problem, we have observed that the over all ranking, used in the construction of the T~ in (2.19) depends on all the samples. Thus, in (2.21), even when we compare T~ and T;,, T~- T;, may be affected by the remaining c - 2 sample observations and the same feature is true for any subset of samples as well as in the multivariate case. To compare a subset of samples, we desire to use a statistic which should not be affected by the sample observations not belonging to this subset. For this purpose, we define the T,, as in Section 4.2, for 1 < i < i ' < e and also let
1 n~

Dss"i

= ~ii j = ~ l a hi'S\ (R(.:)~a ( R ~ : ' ) ~ - a,~,~a~:,, Y ] ni,s"k tj ]

(4.45)

for s , s ' = 1 . . . . . p where R'/~ ~) is the rank of X~") among Xi?), ,"in,J(~), for 1 <<.j<Ji, 1 < s < p and i = 1..... c. Let then
Vi=((G~,,i))

for i-- 1..... c.

(4.46)

Denote the group of all c samples by GO and any subgroup containing ce (1 < c e <e) samples by G e. Also, let S Obe the group of all t h e p variates and let s. be any subgroup containing Pa (1 < p . < p) variates. Let
Ne ~" E ni

i~G~

Nonparametric simultaneous inferencefor some MANO VA models

691

and V/(~ = minor of V~containing only the rows and columns E S~. Further, let He~ stand for the hypothesis that with respect to the p~ variates (E S~), the c~ samples ( E G~) have the same treatment effects. To test for this hypothesis, we use the statistic (4.48)

e.a =
where

(i,i')EG~ (s,s')~S a

ni'2N

T ! J ) ' 3 SS'T-(~ ')

((v~"))=inverse of Vi ("),

i = 1..... Co

(4.49)

Note that ff~ is completely unaffected by the within sample variabilities as well as by the samples not belonging to Ge or the variates not belonging to Sa. Moreover, if Gf c_ G~ and S b C S,, it can be shown that ~ t> ~. Thus, we have ~, < ~0. (4.50)

It follows from Gabriel and Sen (1968) that under H0, ~ has asymptotically chi-square df with p ( c - 1) DF and for small n 1..... no, the permutation distribution of ~ can be enumerated by using the Chatterjee-Sen rank permutation principle. Thus, there exists an o , such that P{0~<~,~lH)=l-a where ~,~--~X~(c_l),~. (4.51)

Then, a simultaneous test procedure of level e~ consists in accepting or rejecting each H~ according as ~e ~ is -<< or >~o,~. (4.52)

Operationally, the procedure is simple and it has the advantage of having the flexibility of adjusting to any number of samples or variateso The discouraging side of this procedure is that under H~, ( N / N e ) ~ (but not ~) has asymptotically chi-square distribution with p~,(ce - 1 ) DF and under local alternatives, it has a non-central chi-square df with the same DF and an appropriate noncentrality parameter. Since N / N e > 1 for every proper subset of Go, power-wise this procedure may not be very good. This drawback is primarily due to the fact that if in (4.48) we replace N by Ne and denote the resulting statistic by ~ , then (4.50) may not hold.

692

Pranab Kumar Sen

The procedure described above can be extended to the two-way M A N O V A models as well. For this case, define the T~(.: ') as in (4.40) and ~ as in (4.19) (but based on the Yff)). Then, (4.48) simplies further as n 1. . . . . nc=n, while, it can be shown that (4.51) holds with P{E0< ~,,[H0} > 1 - a . The rest of the procedure remains the same. In either case, ~ may also be used to provide a simultaneous confidence region for all treatments. 5. Simultaneous inference in MANOCOVA problems

The procedures developed in Sections 2, 3, and 4 for the A N O V A and M A N O V A problems are extended here to cover the M A N O C O V A problems. The theory is mainly adapted from Sen and Krishnaiah (1974)o

5.1.

MANOCOVA models in one-way layouts

Let zi(k) = ( Yi(k), Xi(k))' = ,, [ y(k)li , ' ' ' , Ypi(k), X li(k),..., Xq(ik)), i = 1,.. ., nk be n~ independent random vectors with a c o m m o n (p + q)-variate df Hk(z ), for k--1 ..... c ( > 2 ) ; all these N = n ~ + . . . +n~ vectors are assumed to be independent. The q-variate marginal df of Xg(k) is denoted by Gk(x ) and the p-variate conditional df of y(k) given X~(k)=x is denoted by Fk(y[x ), k = 1..... c. We assume that

Gk(x)=G(x )

and

Fk(ylx)=F(y--Ok]X),

l<k<c

(5.1)

where F and G are unknown continuous dffs and 0~..... 0~ are the unknown treatment effects (p-vectors). The assumption that G~ = G insures that the covariates are not affected by the treatment--which is a basic assumption in the MANOCOVA. As in Section 4.2, our problem is to test for H0:01 . . . . . 0~ = 0 (5.2)

against H: 0 j ~ 0 for at least one j ( = 1 .... ,c) and we like to develop simultaneous testing procedures for this problem (as well as to provide the allied simultaneous confidence regions). Based on y(s) (=:v(~) v(~)~ and Yy), we denote the corresponding ~*il , ' ' ' ' ~ i n iff rank statistic [as defined by (2.7)] by T~9, f o r s = l ..... p and

l<i<i'<c.

(5:3)

Also, based on X (r) ( = tv(r) by T,.,*.: r), f o r r = l ..... q

v(r)'~ and Xi!r), the rank statistic is denoted


and

l<i<i'<c.

(5.4)

N o n p a r a m e t r i c s i m u l t a n e o u s inference f o r s o m e M A N O V A

models

693

Also, as in (4.45), for the ith sample, we obtain a covariance matrix

(5.5)
for the entire set of (p + q) rankings. Let then

(5.6)
where VN,U is of o r d e r p p , VN,12= V),21 is of o r d e r p q a n d V-N, 22 is of order q q. Finally, let

(5.7)
Then, the covariate-adjusted rank order statistics are defined by

(5.8)
where T/i,=(T/}, 1), (P)' T/i, ) and /~.-..~=tT ii" ~ *O) ii,, . . . . . T*(.q.)Y ,',, analogous to (4.21), we define
i<~t<t-~c.

Now,

(5.9)

(5.10)
Then, it follows from Sen and Krishnaiah (1974) that (4.23) holds. As such, we may proceed as in Section 4.2, i.e., a simultaneous test consists in 1 . it:.,/n~l/2 rejecting the equality of those 0 i and 0 c, for which [Tffs)] > iwn,~.tv, s/ ) , for some 1 <s <p. To obtain results parallel to (4.24), we proceed as follows. Suppose that we replace Y/(') by Y~(S)+bl and compute T~},~)(b) [as in before (2.10)], for 1 < s < p , 1 < i < i ' < c . Let then

(5.1 1) where T i i , ( b ) = ( T i } , l ) ( b l ) . . . . . Zi},P)(bp))', for l < i < i ' < c . We proceed as in (2.10) through (2.18) and replace everywhere T , , ( b ) by TffS)(bs) and denote the resulting quantities by ~ ~(s) /~(0 (s) for s = 1,...,p. i i ' , t ' ~.~ tt', U' ~ u ' ~ /~.~) and H n,a~

694

Pranab Kumar Sen

These are the covariate-adjusted estimates. With these covariate-adjusted estimates, (4.24) also holds for our M A N O C O V A model. For the R o y - B o s e type procedures, we define

Eii" = ( 2 n - 1)7;'V*--T/~,

1 <i<i' <c,

(5.12) (5.13)

E*=max{~ii,: 1 <i<i' < c ) ,

then the rest follows as in Section 4.2(b). We need to replace there T/~, ~) by T,.~~) and v~ ) by ~*. For the Scheff6-type procedure too, we proceed as in Section 4.2(c), where we need to use the covariate-adjusted estimators [as defined after (5.11)] and, in (4.28) and (4.29), we need to replace v~;) by 6". The rest remains the same. The modifications for the treatment vs. control procedures are the same as in the case of the procedure based on the Bonferroni inequality. Finally, the modifications for the Gabriel-Sen type of procedures are also apparent: replace V/by V/*= Vi, ll- V/,12Vi,22 V/,21 and the T,, by T~ , 1 <~i<i'<c.

5.2. MANOVA models in two-way layouts


We conceive here of n blocks of c plots each where c different treatments are applied and the response Z~j of the jth plot in the ith block is a (p + q)-vector, consisting of p primary variates (Y/j) and q covariates (X0.), for l < j < c , 1 <<.i<<.n, where we assume that for each i, X~I..... Xic are independent and identically distributed, while the conditional df of Y~j given that X/j = x is given by F/j(YlX) where

F,j(ylx)=F~(y-Ojlx ), l<j<c,

l<i<n

(5.14)

and, we desire to test H0: 0~ . . . . . 0 c = 0 against the set of alternatives that 0j 4=0 for at least one 1 < j < c. Under the additional assumption of the additivity of the block effects, we have

F,.(y-Oj[x)= F(y-[~,-Oj[x),
In this case, we define

V i,j.

(5.15)

j=l

Yij,Xij),

1 <~j <<c, 1 <~i <<n,

(5.16)

The purpose of this alignment is to eliminate the block effects from the primary as well as the concomitant variates. Based on these aligned

Nonparametric simultaneous inference for some MANOVA models

695

vectors, we define as in (5.3), (5.4) and (5.5), ~.},~)= T ( ~ ('), ~f)), T~*(r)-ii, T(P~!r),Xi(r)) and z(0 define then I?~ as in (5.7) with V~ in (5.6) replaced by ~., 1 < i < c and in (5.8)-(5.9), replace T~ by ~0 based on T,.c and T~;*,,for 1 < i < i ' < c . Similar replacements are necessary with (5.11), (5.12) and elsewhere. With these modifications, the rest of the procedures sketched in Section 5.1 remain the same for the two-way layout as well.
-

6.

Some general remarks

In this concluding section, we like to touch upon two additional type of simultaneous inference procedures along with certain general remarks. One of the flexibilities of the S- or T-methods of multiple comparisons [as has been pursued in earlier sections] is their ability to include infinite number of contrasts for making simultaneous tests or providing confidence regions. On the other hand, this flexibility sometimes makes the simultaneous tests somewhat less powerful (or the width of the confidence intervals somewhat larger) than the Bonferroni procedures. Indeed, in many practical problems, we may not be interested in the totality of all possible contrasts, but in a finite set of (linearly independent) contrasts. For such a case, alternative simultaneous inference procedures can be provided which are more powerful. For the parametric model, we refer to Krishnaiah (1969) for a useful discussion of these procedures. We shall consider here some nonparametric analogues. Secondly, we shall also study some sequential procedures, essentially due to Ghosh and Sen (1973). 6.1. Sen-Krishnaiah tests

We illustrate these procedures with reference to the M A N O C O V A model in Section 5.1; the modifications for the two-factor layouts are similar to those in Section 5.2. Also, the M A N O V A model follows as a special case (taking a null set of covariates) and the A N O V A model relates t o p = 1. With reference to (5.1), consider a set of r (/> 1) linearly independent contrasts (vectors) ~(!,), s = 1..... r, where

~p(/s)= ~ lskO~,
k~l

lsl ,

s = l ....

,r,

(6.1)

and let H

l n

" n

g r

and A = A l U " " U A r where As:tp(Is)@O, l<s<r. (6.2)

Hs:~b(l~)=O

and

696

Pranab Kumar Sen

Define z~i., 1 <~i<c as in after (5.11) and let ~(ls) = ~


k=l

ts~&~., l < s < r .

(6.3)

Further, as in (4.28), we define the B s, 1 <s <p, but in this case we use the covariate-adjusted statistics TffS)(b) to derive the estimates/~(s) ~ii', U and/~(~ ~ii', L 1 < i < i' < c. Also, define V~ as in (5.7) and let I'* = (('~*,)) =

((~*,/BsBs.));

(6.4) (6.5)

C=((C~.))=((~=l~kls.k/n~) ).
Consider then the quadratic forms

O,=([~(,,)]'(F*)-'[~(lp)])/c~s~

1<s<r.

(6.6)

Then, as in Sen and Krishnaiah (1974), we conclude that Q=(Q1 ..... Q~)' has asymptotically a (r-) multivariate chi-square distribution, and hence, there exists a Q*, such that

P{ Qs<Q*, V 1 <<.s<r[H} >~ 1-c~.

(6.7)

[See Khatri, Krishnaiah and Sen (1977) for the df of Q.] Then a simultaneous test consists in rejecting H~ in favor of A s for those s only for which Q,>Q*; otherwise, accept H~, 1 <s<r. The total hypothesis is accepted only when all the H, (1 <s <r) are accepted. The associated simultaneous confidence intervals for t'~(l~), t~=O are

[t'[~(ls)-Lp(l,)][<{Q*c,,(t't*t)} 1/2,

V l<s<r,

t4=0

(6.8)

We may also use a step-down procedure. Let (~) be the principal minor of C [in (6.5)] comprising of the firstj rows and columns, and f o r j > 2, let

Cj=

C~

e f f = c i / - ~ ' ,~.Z]cj_ ,, e~'~=c,v Let then ~=N~/2[~(tO-~(t,),...,~(IA~p(/)]', l<j<<.r. Then, W 1 is asymptotically normal with mean 0 and dispersion matrix (Nc,,I'*), while given Wj_,, N 1/z(~(Ij.)- ~p(/fl) is asymptotically normal with (conditional) mean (~'_,Cj21I)Wj_, and disper-

Nonparametric simultaneous inference for some M A N O VA models"

697

sion matrix (Ncy~['*), for j = 2 ..... r. Thus, if we let

~j=~p(ly)-(cj_,CjS_ll)[~b(l,)', .... ~b(/j_O' 1, 2 <~j <.Nr,

(6.9)
then H and A in (6.2) may be equivalently written as

H=H~r~...NH*

where

l-ij*:~lj=O,

l<j<r,

(6.10)

and A-A ~ u ... u A~* where As-*: ~lj vaO, I <<. j <.< r. We let ,ij=~(~)-(~' ,C~_,X)[~(I,) ', .... ~(~._,)'],

2<j<r,

= 4;(t,);
(6.11) (6.12)

-,',-1 Q?=(njr

1 <~j<~r.

Then, under H, Qf ..... Q* are asymptotically independent, each having chi-square df with p DF, so that a Q** exists for which

P{Q: <Q**,

V 1 <<.j<<.r[H} - , 1 - a,

(6.13)

and hence, an asymptotic simultaneous test consists in rejecting H~" if , if Q~'-<<Q~ **, then proceeding on to Hi': if Q~' > Q2 Qf >Q~**" **, rejecting HJ', otherwise, proceeding on to Hi', and so on. Here, the ordering or the indices (1 ..... r) is important. Also, we have the following simultaneous confidence intervals

it.(.b_.)

t.<< [Q~* * .cjj(tF t)]

1-<<j<r,
.

t~=O,

(6.14)

which have an asymptotic coverage probability 1 - a .

6.2.

Sequential simultaneous confidence intervals

In the parametric case, the problem of bounded length simultaneous confidence intervals has been studied in detail by Healy (1956), Chatterjee (1962), and others. Chow and Robbins (1965) have developed an asymptotic method which remains valid for a wider class of distributions. Sen and Ghosh (1971) have considered a similar procedure based on rank order statistics. Ghosh and Sen (1973) have studied the problem of simultaneous confidence intervals for certain ANOVA models. We present here their results in a slightly general framework. We explain this procedure by reference to the one-way MANOVA model in Section 4.1. Modifications

698

Pranab Kumar Sen

for the two-way layouts or the M A N O C O V A models are quite apparent, while the case of the A N O V A models follows by letting p = 1. As in (2.17), we compute H~(,~], for 1 < s < p , where a * = a l p [See (4.22)]. Then, in the T u k e y - B o n f e r r o n i scheme, we conceive of a stopping-variable N = N(d), defined by

N(d)=min{n:

l<s<p

m a x H(S'.<<.d},
n,a

d>0.

(6.15)

We m a y note that for n = N(d), the width of the confidence interval for c 1 c Y'i~ l liOi in (4.24) is ~<(2d)(~Y.i= l[lil), V lvaO, and by arguments similar to those in Ghosh and Sen (1973), it can be shown that the coverage probability for this sequential confidence region is > 1 - a when d is chosen small. A similar modification can be made for the S-method, treated in Section 4.2(c). To m a k e the dependence of the estimate F on n [see (4.28)-(4.30)], we denote it by ]~n, where we take n I . . . . . nc = n. Let then ^n* T - -Note that sup{ (b'Fnb)/b'b: b =/=0} = ~'*. Thus, if we let (6.17) largest characteristic root of

['n.

(6.16)

N = N ( d ) = m i n { n : n -'Xp(c2

I),aYn^* <d2},

d>O,

(6.18)

n = N(d) (and n 1 . . . . .

then from (4.34) and (6.18), we note that the right hand side of (4.34) for nc), is b o u n d e d f r o m above by

(l'l)'/2(b'b)~/2d,

V I~0

bvaO,

(6.19)

so that in (4.33), the maximum-width of any confidence interval is

2d(l'l) 1/2 . (b'b) 1/2. Again, along the lines of Ghosh and Sen (1973), it
follows that the overall confidence coefficient for this procedure (for small d > 0) is close to the preassigned 1 - a.

6.3. Some general remarks


Throughout this Chapter, we have presented simultaneous inference procedures based on general rank order statistics and derived estimators. One of the basic reasons for advocating the use of rank statistics and

Nonparametric simultaneous inference for some MANOVA models

699

estimates is their good robustness against outliers and gross-errors. The traditional parametric procedures are comparatively more seriously affected by the presence of outliers. Besides, the assumption of normality (or other specified forms) of the underlying distributions, as is customarily made in a parametric analysis, is often questionable in practical applications. The parametric procedures are generally not very robust against departures from the basic assumptions lying their valid applicability, and thus, are of usually limited scope in applicability. On the other hand, for the nonparametric procedures considered here, we do not need any specific form of the underlying distributions (only continuity or sometimes symmetry suffices), and hence, they enjoy a broader scope of applicability. They are robust. But, to meet this scope in full, one needs to compute the exact percentile points of the null distributions of various nonparametric statistics on which the simultaneous procedures rest. Indeed for many common form of nonparametric statistics, tabulations of these distributions for small sample sizes are available in the literature [viz., Owen (1962)]. For general rank statistics, these distributions can be enumerated by reference to suitable permutational invariance structures. However, these computations require so extensive labor that, even, with the advent of the modern computational facilities, such a task seems to be prohibitively laborious when the sample sizes are not small. For this reason, throughout the earlier sections, asymptotic values of these percentile points are mentioned side by side, so that in applications these asymptotic values provide the required solutions. One encouraging aspect (based on extensive simulation studies) is a common characteristic of the distributions of various nonparametric statistics: these are usually dominated by their asymptotic forms, so that the use of asymptotic percentile points results in conservatism [i.e., the actual level of significance is usually smaller than the specified a or the actual coverage probability is greater than the specified 1 - a] and does not affect the validity of these procedures. On the other hand, it leads to some loss in efficiency (or power); this loss is usually very small when the sample sizes are large. The computations involved in the parametric procedures are usually comparatively simpler than in the nonparametric case. For the nonparao metric procedures, we have used estimators (viz., A,, or A}.~)) based on suitable rank statistics. In some of the simplest rank statistics cases [ViZo, Median or the Wilcoxon statistic], exact expressions for these estimators are known. However, in general, these estimators are to be obtained by iterative procedures. A general rule is to start with the classical least squares estimators and to employ the Newton-Raphson iteration procedure to solve for the nonparametric ones [viz., (2.10), (2.11), (2.13), and elsewhere]. Usually, only a few iterations are needed to obtain these estimators up to the desired level of accuracy.

700

Pranab Kumar Sen

T h r o u g h o u t the C h a p t e r , we h a v e a s s u m e d that the u n d e r l y i n g d i s t r i b u tions a r e all c o n t i n u o u s (so t h a t ties c a n b e neglected, in p r o b a b i l i t y ) . I n practice, however, d a t a relate to ties, arising either due to r o u n d i n g off p r o c e d u r e s or due to the g e n u i n e discreteness of the u n d e r l y i n g d i s t r i b u tions. F o r such a case, again, the exact d i s t r i b u t i o n - f r e e n a t u r e of the n o n p a r a m e t r i c p r o c e d u r e s m a y n o t hold. H o w e v e r , b y the usual m i d - r a n k or a v e r a g e - s c o r i n g p r o c e d u r e s , m o d i f i e d r a n k statistics c a n b e c o n s t r u c t e d a n d the s a m e p r o c e d u r e s b e a p p l i e d . H e r e also ties i n d u c e conservatisrno Nevertheless, the v a l i d i t y is n o t affected. F i n a l l y , tables for the p e r c e n t i l e p o i n t s of the various statistics c o n s i d ered a r e only p a r t i a l l y a v a i l a b l e , even in the a s y m p t o t i c case. T h e r e is a definite need to g e n e r a t e m o r e tables, p a r t i c u l a r l y for the M A N O V A p r o c e d u r e s . T a b l e s for the p a r a m e t r i c cases, as r e p o r t e d in K r i s h n a i a h (1969) [or C h a p t e r 24 of this b o o k ] p r o v i d e the a s y m p t o t i c s o l u t i o n s for the n o n p a r a m e t r i c cases too. Extensive b i b l i o g r a p h i e s on the topics i n c l u d e d here are a v a i l a b l e with Miller (1966, 1977), a n d hence, we p r o v i d e h e r e w i t h a selected b i b l i o g r a phy, h a v i n g direct r e l e v a n c e to the results d e s c r i b e d so far.

Acknowledgments
This w o r k was s u p p o r t e d b y t h e A i r F o r c e Office of Scientific R e s e a r c h , U . S . A . F . , A.F.S.C., C o n t r a c t N o . A F O S R - 7 4 - 2 7 3 6 . T h a n k s are also d u e to the referee for his v a l u a b l e c o m m e n t s on the m a n u s c r i p t .

Selected bibliography
Chatterjee, S. K. (1962). Sequential inference procedures of Steins' type for a class of multivariate regression problems. Ann. Math. Statist. 33, 1039-1062. Chow, Y. S. and Robbins, H. (1965). On the asymptotic theory of fixed-width sequential confidence intervals for the mean. Ann. Math. Statist. 36, 457-461. Dwass, M. (1960). Some k-sample rank order tests. In: Olkin et al. eds., Contributions to Probability and Statistics, Stanford University Press. Gabriel, K. R. and Sen, P. K. (1968). Simultaneous test procedures in one-way ANOVA and MANOVA based on rank scores. Sankhyd Ser. A, 30, 303-312. Geertsema, J. C. (1970). Sequential confidence intervals based on rank tests. Ann. Math. Statist. 41, 1016-1026. Ghosh, M. and Sen, P. K. (t973). On some sequential simultaneous confidence intervals procedures. Ann. lnst. Statist. Math. 25, 123-134. Gupta, S. C. (1963). Probability integrals of multivariate normal and multivariate t. Ann. Math. Statist. 34, 792-828. Hfijek, J. and Siditk, Z. (1967). Theory of Rank Tests. Academic Press, New York. Healy, W. C., Jr. (1956). Two-sample procedures in simultaneous estimation. Ann. Math.

Nonparametric simultaneous inference for some M A N O VA models

701

Statist. 27, 687-702. Heck, D. L. (1960). Charts of some upper percentage points of the distribution of the largest characteristic root. Ann. Math. Statist. 31, 625-642. Hochberg, Y. (1974). Some conservative general~ations of the T-method in simultaneous inference. J. Multivar. Anal. 4, 214-234. Hollander, M. (1966). An asymptotically distribution-free multiple comparison procedure, treatment vs. control. Ann. Math. Statist. 37, 735. Hu~kov/t, M. (1975). Multivariate rank statistics for testing randomness concerning some marginal distribution. J. Multivar. Anal. 5, 487-496. Jensen, D. R. (1974). The joint distribution of Friedman's X~ statistics. Ann. Statist. 2, 311-323. Khatri, C. G., Krishnaiah, P. R., and Sen, P. K. (1977). A note on the joint distribution of correlated quadratic forms. J. Statist. Planning Inference 1, 299-307. Krishnaiah, P. R. (1969). Simultaneous tests procedures under general MANOVA models. In: P. R. Krishnaiah, ed., Multivariate Analysis--II; Academic Press, New York, t21-143. Krishnaiah, P. R. and Armitage, J. V. (1965). Tables for the distribution of the maximum of correlated chi-square variates with one degree of freedom. ARL-65-136, Wright-Patterson Air Force Base, Ohio. Krishnaiah, P. R. and Sen, P. K. (1971). Some asymptotic simultaneous tests for multivariate moving average processes. Sankhyd Ser. A. 33, 81-90. Miller, R. G., Jr. (1966). Simultaneous Statistical Inference. McGraw-Hill, New York. Miller, R, G., Jr. (1977). Developments in multiple comparisons. J. Amer. Statist. Assoc. "12, 779-788. Nemenyi, P. (1963). Distribution-free multiple comparisons. Unpublished Doctoral Dissertation, Princeton Univ. Princeton, N.J. Owen, D. B. (1962). Handbook of Statistical Tables. Addison-Wesley, Reading, MA. Pillai, K. C. S. (1957). Concise Tables for Statisticians. Statistical Center, Univ. Phillipins, Manila, P.I. Purl, M. L. and Sen, P. K. (1968). Nonparametrlc confidence regions for some multivariate location problems. J. Amer. Statist. Assoc. 63, 1373-1378. Purl, M. L. and Sen, P. K. (1969): Analysis of covarlance based on general rank scores. Ann. Math. Statist. 40, 610-618. Purl, M. L. and Sen, P. K. (1971). Nonparametric Methods in Multivariate Analysis. Wiley, New York. Roy, S. N. and Bose, R. C. (1953). Simultaneous confidence interval estimation. Ann. Math. Statist. 24, 513-536. Scheffr, H. (1953). A method for judging all contrasts in the analysis of variance. Biometrika 40, 87-104. Sen, P. K. (1966). On nonparametric simultaneous confidence regions and tests for the one-criterion analysis of variance problem. Ann. Inst. Statist. Math. 18, 319-336. Sen, P. K. (1968). On a class of aligned rank order tests in two-way layouts. Ann. Math. Statist. 39, 1115-1124. Sen, P. K. (1969). On nonparametric T-method of multiple comparisons for randomized blocks. Ann. Inst. Statist. Math. 21, 329-333. Sen, P. K. and Ghosh, M. (1971). On bounded length sequential confidence intervals based on one-sample rank order statistics. Ann. Math. Statist. 42, 189-203. Sen, P. K. and Krlshnaiah, P. R. (1974). On a class of simultaneous rank order tests in MANOCOVA. Ann. Inst. Statist. Math. 26, 135-145. Sen, P. K. and Purl, M. L. (1967). On the theory of rank order tests for location in the multivariate one-sample problem. Ann. Math. Statist. 38, 1216-1228. Sen, P. K. and Purl, M. L. (1970). Asymptotic theory of likelihood ratio and rank order tests

702

Pranab Kumar Sen

in some multivariate linear models. Ann. Math. Statist. 41, 87-100. Siotani, M. (1959). On the range in multivariate case. Proc. Inst. Statist. Math. 6, 155-156. Siotani, M. (1960). Notes on multivariate confidence bounds. Ann. Inst. Statist. Math. I1, 167-182. Spjotvoll, E. and Stoline, M. R. (1973). An extension of the T-method of multiple comparisons to include the case with unequal sample sizes. J. Amer. Statist. Assoc. 68, 975-978. Steel, R. G. D. (1959). A multiple comparison sign test: treatment vs. control. J. Amer. Statist. Assoc. 54, 767-775. Steel, R. G. D. (1959). A multiple comparison rank sum test: treatments vs. control. Biometrics 15, 560-572. Steel, R. G. D. (1961). Some rank sum multiple comparison tests. Biometrics 17, 539-552. Tukey, J. W. (1953). The problem of multiple comparisons. Unpublished manuscript.

P. R. Krishnaiah, ed., Handbook of Statistics, Iiol. 1 North-Holland Publishing Company (1980) 703-744

~'~ "~

Comparison of Some Computer Programs for Univariate and Multivariate Analysis of Variance 1
R. Darrell B o c k and David Brandt
When applied to a suitably balanced experimental design, the Fisherian analysis of variance is unsurpassed in the unity of method, clarity of interpretation and efficiency of calculation it brings to data analysis. If the observations are limited to a single response variable, even large and complex designs can be easily analyzed with no other computing aid than a hand calculator. Moreover, the precision of the parameter estimates does not deteriorate as the number of effects in the design increase, and the interpretation of effects from different sources is not confounded. For a survey of the many uses to which this most elegant and practical tool of statistical analysis can be put, perhaps the best source is Cochran and Cox (1957). With the appearance of high speed, large capacity electronic computers in the 1960's, it became possible to entertain the idea of extending the analysis of variance technique both to studies with multiple response variables and to unbalanced designs requiring a general nonorthogonal solution. Most of the statistical theory required for this extension was already available in the work of Hotelling (1936), Wilks (1932), Bartlett (1947), Roy (1957), Anderson (1958), Rao (1952), and others (see Bock, 1975). With the formulation of the multivariate general linear model and its computer implementation, the analysis of variance technique became available in fields of research such as behavioral science, education, ecology, etc., where balanced designs are the exception and the occurrence of more than one outcome variable is the rule. For the present day statistical analyst who does not have the resources to prepare his own computing routines, a number of broadly useful general purpose programs (or procedures in larger data analysis systems) are now
ISupported in part by National Science Foundation Grant BNS76-02849 and a grant from the Social Science Research Committee, University of Chicago.

703

704

R. Darrell Bock and David Brandt

routinely distributed. Developed by different individuals and organizations, these programs differ in m a n y ways, some representing merely "cosmetic" aspects of input or output style, others bearing on convenience or generality of use, and still others actually affecting the numerical results produced. On the premise that the user should understand these differences before applying the programs (and certainly before he acts on the results or interprets them to others), our purpose in this chapter is to discuss and illustrate some of the major analysis of variance programs capable of nonorthogonal solutions. The names, functions, authors and distributors of the programs available to us for comparison (in the autumn of 1978 at the University of Chicago) are shown in Table 1. Considering the rapid pace of developments in this field, the reader who needs more up-to-date information is advised to contact the program distributors at the addresses listed in Table 1. In particular, we have been informed that an SPSS multivariate analysis of variance program is in preparation, but it was not available to use at the time this paper was prepared. Thus, we have included only the SPSS univariate anova procedure in this review. In Section 1, we discuss in general terms the different approaches that the programs take to estimation and hypothesis testing in the nonorthogonal analysis of variance. In Section 2 we consider in a similar vein the analysis of covariance and repeated measures analysis. Section 3 is devoted
Table 1
Computer programs reviewed

Univariate programs

1. BMDP2V
   Function: Analysis of variance and covariance including repeated measures.
   Authors: Robert Jennrich and Paul Sampson
   Distributor: Health Sciences Computing Facility
      CHS Bldg., AV-111
      University of California
      Los Angeles, CA 90024

2. SPSS Procedure ANOVA
   Function: Analysis of variance and covariance for crossed designs with up to five factors.
   Author: Jae-On Kim
   Distributor: SPSS, Inc.
      6030 South Ellis Avenue
      Chicago, IL 60637

3. BMDP3V
   Function: Estimation of mixed model fixed effects and variance components by the method of maximum likelihood.
   Authors: Robert Jennrich and Paul Sampson
   Distributor: (same as BMDP2V)

Multivariate programs (specializing to univariate when the number of response variables is set to 1)

4. MULTIVARIANCE VI
   Function: Univariate and multivariate analysis of variance, covariance, regression and repeated measures.
   Author: Jeremy D. Finn
   Distributor: International Educational Services
      1525 E. 53rd Street, Suite 829
      Chicago, IL 60615

5. MANOVA II
   Function: Univariate and multivariate analysis of variance, covariance and regression.
   Author: Elliot M. Cramer
   Distributor: Elliot M. Cramer
      The L. L. Thurstone Psychometric Laboratory
      Davie Hall
      University of North Carolina at Chapel Hill
      Chapel Hill, N.C. 27514

6. OSIRIS Procedure MANOVA
   Function: (Same as MANOVA)
   Authors: Charles Hall and Elliot M. Cramer
   Distributor: ICPSR, Institute for Social Research
      University of Michigan
      Ann Arbor, MI 48106

7. SAS Procedure GLM
   Function: Least-squares fitting of univariate and multivariate models for regression and analysis of variance and covariance.
   Author: James H. Goodnight
   Distributor: SAS Institute, Inc.
      P. O. Box 10066
      Raleigh, N.C. 27605

8. ACOVSM
   Function: Analysis of covariance structures including generalized manova.
   Authors: Karl G. Jöreskog, M. van Thillo, and Gunnar Gruvaeus
   Distributor: (same as MULTIVARIANCE VI)

to two special purpose programs, BMDP3V and ACOVSM. Section 4 then describes each program under the following headings: (1) Method of solution, (2) Tests of hypotheses, (3) Use of the program, (4) Limitations, and (5) Documentation. The programs are discussed in the same order that they appear in Table 1. Finally, Section 5 contains the results of a number of sample problems analyzed with these programs on the IBM 370/168 at the University of Chicago Computation Center.


It is important for the reader to understand that the work reported here was done on a limited budget and is by no means an exhaustive analysis of these programs. In particular, we could test only their main features with a few small test problems and were unable to obtain any detailed information about relative computing speeds.

1. Remarks on estimation and tests of hypotheses in non-orthogonal analysis of variance

To move from balanced to unbalanced designs in analysis of variance is not only to lose the ease of computation of the orthogonal solution, but also the intuitively appealing equivalence of the observed marginal means to the least-squares estimates of effects and, perhaps more important, the uniqueness of the additive partition of the total sum of squares: the "one best way" of performing the calculations via the marginal means no longer exists, and in complex designs there can be as many different partitions of the sum of squares as the number of orders in which the analyst wishes to eliminate confounded effects. The power of the computer can overcome the added complexity of computation, but it cannot make the logical choice between effects that can be ignored or must be eliminated in the analysis of an unbalanced design. Accepting the easy alternative of eliminating all effects in all possible orders hopelessly compromises the error rates in the procedure and cannot be recommended as a suitable default. How to overcome the non-uniqueness of the partition of the sum of squares is perhaps the most difficult problem presented by non-orthogonal analysis of variance.

1.1. Estimation

None of the programs in Table 1 gives the user much control over the method of estimation. One exception is MULTIVARIANCE, which allows the user to specialize to the much more rapid orthogonal solution when the design is balanced. One of the package programs, SAS, contains a subprogram, ANOVA, which is identical to their general procedure GLM except that it assumes a completely orthogonal design. Only the diagonal elements of the matrix of cross products among the dummy variables are stored. The other packages--SPSS, BMDP, and OSIRIS--contain subprograms for univariate one-way designs in addition to the more general programs reviewed here.
None of the programs in Table 1 uses the classical method of restricting unknowns in order to solve the normal equations for design models. The symmetric treatment of parameters in each subspace that would result is
extremely convenient in the orthogonal case but has no special advantage in the non-orthogonal case. In fact, it is disadvantageous from the point of view of computing efficiency. Because the system of normal equations for, say, m effects of which l < m are independently estimable, is brought to full rank by adding the restriction matrix times its transpose to the coefficients of the normal equations, the system to be solved is of order m when, by transformation to independent parameters, it could be of order l. In large models with many (but not all possible) interaction terms, l can be much less than m, and a solution of order l rather than m saves considerable core memory and processing time.
One of the programs, SAS Procedure GLM, uses a form of generalized inverse to solve the m normal equations (see Searle, 1971a). In the computer, this is typically done as follows: dummy variables are constructed for each effect (i.e., the value 1 is assigned to those subclasses of the design that can experience the effect and 0 to those that cannot). For a design with n subclasses and a model with m effects, this matrix is of order n × m. The m × m symmetric matrix of sums of cross products with respect to columns is then constructed and the unique elements stored as a triangular matrix containing m(m + 1)/2 elements. This cross-product matrix is then "pivoted" on its diagonal elements from upper left to lower right (see Bock, 1975, p. 47). When the pivoting reaches a linearly dependent variable corresponding to a redundant parameter in one of the subspaces of effects, the value of the pivot becomes zero within the precision of the calculations. The corresponding row and column of the cross-product matrix (or row of the triangular form) is then set to zero and the pivoting continues. By the time the last dummy variable is pivoted, the number of non-zero rows and columns remaining is l, the number of independent effects. The m × m matrix thus obtained is a generalized inverse. If the vector of constant terms in the normal equations is premultiplied by this matrix, a particular solution of the consistent equations results. In this method, the non-uniqueness of the general solution is resolved, not by restricting unknowns, but by setting the last parameter of each subspace to zero. Since a linear transformation of a least-squares estimate is a least-squares estimate, this particular solution can be transformed as required for obtaining contrasts of parameters or testing specific hypotheses. In addition, subspaces can be removed from the model by repivoting the corresponding non-zero rows and columns of the generalized inverse. If during the initial pivoting a variable is confounded because of missing data in some of the subclasses, the solution remains valid provided the number of degrees of freedom in the subspace is reduced by one. SAS GLM does this automatically and reports zero degrees of freedom for the inestimable effects.
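A minimal sketch of this pivoting scheme may help fix ideas. The code below is our illustration in Python with numpy, not code from SAS GLM or any other program reviewed here; the one-way data, the dummy coding, and the tolerance are invented for the example.

import numpy as np

def sweep(A, k):
    # Sweep the symmetric matrix A on diagonal pivot k.
    d = A[k, k]
    B = A - np.outer(A[:, k], A[k, :]) / d
    B[k, :] = A[k, :] / d
    B[:, k] = A[:, k] / d
    B[k, k] = -1.0 / d
    return B

def g_inverse(C, tol=1.0e-10):
    # Pivot C on its diagonal from upper left to lower right.  A
    # vanishing pivot marks a redundant parameter; its row and
    # column are set to zero and the pivoting continues.
    G = C.astype(float).copy()
    scale = np.abs(np.diag(G)).max()
    l = 0
    for k in range(G.shape[0]):
        if abs(G[k, k]) < tol * scale:
            G[k, :] = 0.0
            G[:, k] = 0.0
        else:
            G = sweep(G, k)
            l += 1
    return -G, l                          # generalized inverse, rank l

# Constant term plus a full set of one-way dummies: 4 columns, rank 3.
X = np.array([[1, 1, 0, 0]] * 2 +
             [[1, 0, 1, 0]] * 3 +
             [[1, 0, 0, 1]] * 2, dtype=float)
y = np.array([3., 4., 6., 5., 7., 9., 8.])
Ginv, l = g_inverse(X.T @ X)
beta = Ginv @ (X.T @ y)                   # particular solution
print(l, beta)                            # beta[3] is set to zero

The dependent dummy variable is detected at the fourth pivot, and its parameter is set to zero, exactly as described above.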


The pivoting method is attractive from many points of view: it is easy to program, and the calculations for the analysis of covariance can be accommodated by adjoining the covariates to the dummy design variables. Its disadvantages are that the system to be solved is, as in the linear restriction method, of order m rather than l, that additional calculations are required to transform the effect estimates into meaningful contrasts, and that errors due to rounding accumulate more rapidly than in alternative procedures such as Cholesky decomposition or modified Gram-Schmidt orthogonalization (see Bock, 1975, pp. 85 and 32).
All remaining programs employ a reparameterization of the model that leads to a system of l independent equations in l unknowns when the design is complete (or if confounded effects in incomplete designs are deleted beforehand). This system can then be solved by any convenient method for nonhomogeneous linear equations in which the matrix of coefficients is non-singular. There is now fairly universal agreement that the method of choice, from the point of view of speed, stability in the presence of rounding error, and economy of core storage, is a Cholesky decomposition of the matrix of coefficients bordered by the constant terms, followed by a conventional back solution. Alternatively, the Cholesky factor of the matrix of coefficients, which is a true triangular matrix, can be inverted in place and postmultiplied by the vector of constant terms.
In MULTIVARIANCE and MANOVA, the method of reparameterization introduced in Bock (1963a) (see also Bock, 1975, Ch. 5) is used to generate directly a basis for the design model that will yield whatever contrasts among main effects the user may desire. The more popular options, such as contrasts of effects of each of several treatments with that of a control (Control Contrasts), contrasts of each effect with the mean of all effects in that way of classification (Deviation Contrasts), contrasts of each effect in turn with the mean of the remaining effects in some specified order (Helmert Contrasts), and Fisher-Tchebycheff orthogonal polynomial contrasts (Polynomial Contrasts), are provided by these programs. Arbitrary contrasts may be supplied by the user. Interaction contrasts are generated automatically by forming Kronecker products of the main-class contrasts. In addition to the absolutely minimal computation and core memory requirements of the reparameterization approach, it has the advantage, when orthogonal contrasts such as the Helmert or Polynomial contrasts are used in the construction of the basis, that the analysis of balanced designs can be performed merely by postmultiplication of the basis by the vector of observed subclass means and dividing by the respective normalizing constants (lengths of the corresponding basis vectors). (See Bock, 1975, p. 244.) Explicit solution of the normal equations is not then required.
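The Cholesky route for the reparameterized equations can be sketched as follows (again our Python/numpy illustration, not program code; X is assumed to be an already full-rank basis of l columns and y the response vector):

import numpy as np

def cholesky_solve(K, c):
    # K = X'X for a full-rank basis X; c = X'y.  Factor K = L L',
    # then do forward elimination followed by a conventional back
    # solution.
    L = np.linalg.cholesky(K)
    n = len(c)
    z = np.zeros(n)
    for i in range(n):                      # forward: L z = c
        z[i] = (c[i] - L[i, :i] @ z[:i]) / L[i, i]
    b = np.zeros(n)
    for i in range(n - 1, -1, -1):          # back: L' b = z
        b[i] = (z[i] - L[i + 1:, i] @ b[i + 1:]) / L[i, i]
    return b

# b = cholesky_solve(X.T @ X, X.T @ y) returns the l effect
# estimates directly in the metric of the chosen contrasts.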


Because computation of sums of products is almost as efficient as addition in floating-point arithmetic, this method of performing orthogonal analysis of variance compares favorably with the traditional method even in large problems.
Two programs, BMDP2V and SPSS ANOVA, impose restrictions on the original parameters, either by assuming that all parameters involving one level of each way of classification equal zero or by requiring that the effects for each way of classification sum to zero. Both approaches are actually a form of reparameterization and result in a matrix of cross products of full rank. However, users of SPSS ANOVA and BMDP2V have no control over the type of contrast to be estimated, since the contrasts among the original parameters are implied by the form of the restrictions imposed. Thus, neither program can be specialized to handle an orthogonal problem more efficiently, nor can contrasts of interest to the user be requested. These two programs do not print parameter estimates or estimated means.
If the basis matrix corresponding to selected main-class contrasts is generated by rows, the reparameterization approach can be programmed in the same manner as the generalized inverse approach. The generation of basis rows takes the place of the assignment of 1 and 0 dummy variables and is only slightly more time consuming. If desired, pivoting can be used to solve the normal equations and does not require subsequent transformations to put the estimates in a useful form. If the solution is restricted to complete designs, it is not necessary to provide for zero pivots, as they will not occur. But a more general program results if the possibility of unanticipated vacant subclasses is provided for by zero-pivot detection and subsequent adjustment of degrees of freedom in the tests of hypotheses. Similar provisions can be incorporated in the Cholesky decomposition method.
Because of the important savings of core storage that are possible in large analysis-of-variance problems, the MULTIVARIANCE and MANOVA programs generate the basis for the design by columns, rather than by rows, and store these columns on disk memory. If the design is unbalanced, these column vectors are orthonormalized in the metric of the subclass numbers by modified Gram-Schmidt as they are generated and stored. The inner products of these orthogonalized vectors and the vector of subclass means give the so-called "semi-partial" regression coefficients whose sums of squares provide an additive partition of the total sum of squares. As a by-product of the orthonormalization, an upper triangular matrix is built up that permits any leading subset of these regression coefficients to be transformed into estimates of the selected effect contrasts. (See Bock, 1975, p. 289.) This matrix is also involved in the calculation of standard errors of the effect estimators.
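The column-wise computation may be sketched in the same spirit (our Python/numpy illustration; X is taken to be the n × l basis evaluated at the n subclasses, d the vector of subclass numbers, and ybar the vector of subclass means):

import numpy as np

def mgs_in_metric(X, d):
    # Modified Gram-Schmidt orthonormalization of the columns of X
    # in the metric D = diag(d): returns Q with Q'DQ = I and the
    # upper triangular R with X = Q R.
    X = X.astype(float).copy()
    n, k = X.shape
    Q = np.zeros((n, k))
    R = np.zeros((k, k))
    for j in range(k):
        v = X[:, j]
        R[j, j] = np.sqrt(v @ (d * v))
        Q[:, j] = v / R[j, j]
        for jj in range(j + 1, k):          # residualize later columns
            R[j, jj] = Q[:, j] @ (d * X[:, jj])
            X[:, jj] -= R[j, jj] * Q[:, j]
    return Q, R

# s = Q.T @ (d * ybar) are the "semi-partial" regression
# coefficients; s**2 gives the additive partition of the sum of
# squares, and a leading subset of m of them is transformed into
# contrast estimates by np.linalg.solve(R[:m, :m], s[:m]).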


As a final remark on estimation of effects in nonorthogonal analysis of variance, we offer the observation that, while most of the ultimate consumers of the results can readily interpret estimated main-class effects and contrasts, they usually have difficulty with interaction effects, however plausibly parameterized. We would suggest that, when the model requires interaction terms, the programs should supply, with standard errors, the joint marginal means that would be predicted for the corresponding ways of classification if the design were balanced. In other words, the practice of computing adjusted treatment means, which has always been part of the analysis of incomplete block designs, should also be applied to two-way or higher-way margins when interactions involving them are judged to be present. These predicted means can be displayed for purposes of interpretation in the type of interaction plot introduced by McNemar (1962) and familiar to substantively oriented investigators.

1.2. Tests of hypotheses


Early discussions of nonorthogonal analysis of variance (Rao, 1952; Roy, 1957; Corsten, 1958) did not foresee the extent of the logical difficulties that would arise out of the lack of uniqueness of the partition of the sum of squares. In the orthogonal analysis of, say, a balanced three-way cross classification A × B × C, the sums of squares for the various subspaces of effects and interactions are the same regardless of the order in which their additive partition is carried out. Typically, an arbitrary lexical ordering is employed: I, A, B, C, AB, AC, BC, ABC (where I is the term for the general mean). To obtain an additive partition of the total sum of squares in the nonorthogonal case, on the other hand, the design variables must be transformed so as to be linearly independent, which is to say that the basis vectors of the various effect subspaces must be made space-wise orthogonal. While it is true that this can be done uniquely and symmetrically by the singular value decomposition (i.e., transformation to principal vectors), the reparameterization of effects that is implied has no useful interpretation in analysis of variance. Interpretative considerations dictate that a so-called "triangular orthogonalization" (Householder, 1964) be employed: in some order, the projection of the vectors in a given subspace on all subspaces earlier in the ordering is subtracted from the given subspace. The resulting "residual" vectors are space-wise, or "block-wise", orthogonal, and the sums of squares for regression of the observational vector on these orthogonalized vectors (i.e., the squared lengths of the projections of the observational vector on the orthogonal subspaces) add to the total sum of squares. The successive residualizing implied in this process, accomplished by pivoting of the cross-product matrix or modified Gram-Schmidt orthogonalization of the basis, is expressed in the familiar notation for sums of
squares in non-orthogonal analyses. For example,

A
B / A
C / A, B
AB / A, B, C
AC / A, B, C, AB
BC / A, B, C, AB, AC
ABC / A, B, C, AB, AC, BC


designates an additive partition of effects taken in lexical order (correction for the grand mean is understood). This type of partition is usually called hierarchical. It implements a model-fitting approach in which terms are discarded from bottom to top (back to front) if there is no evidence of their significance. The hierarchical partition is the only reasonable solution for unbalanced nested designs, and it also has a meaningful interpretation in crossed designs if there is a priority of interest in the effects of the various ways of classification and their interactions. If, in an educational experiment, A, B and C were to represent age, sex and alternative instructional methods, for example, the above hierarchical partition would be appropriate for the analysis of a measure of achievement. Tests of hypotheses using, respectively, the 7th, 6th, 5th and 3rd sums of squares, with the pooled within-group sum of squares as the error term, would answer in turn the questions "Does the method of instruction interact with age and sex jointly?", "If not, does method interact with sex?", "If neither, does it interact with age?", and "If none of these interactions are present, does method of instruction affect achievement independent of the sex and age of the subjects?". All of these tests would be free of the confounding effects of disproportionate numbers of subjects in the age by sex by method subclasses.
A hierarchical partition in one order would not, however, be appropriate if A, B and C represented three classes of instructional methods equal in priority of interest--e.g., in a study of the effects of alternative explanatory material, examples, and exercises on achievement. In this case, three hierarchical orderings could be carried out and the following sums of squares extracted:

A / B, C
B / A, C
C / A, B
AB / A, B, C, AC, BC
AC / A, B, C, AB, BC
BC / A, B, C, AB, AC
ABC / A, B, C, AB, AC, BC


This approach to computing sums of squares in nonorthogonal analysis of variance is sometimes called the "experimental design" solution. It retains the ordering of main effects and successively higher order interactions that characterizes the model fitting approach, but treats symmetrically the effects within each of these sets of spaces. The sums of squares are not a partition, however, and do not sum to the corrected total. But by not requiring the arbitrary ordering of the ways of classification assumed in the hierarchical approach, they lead to tests of hypotheses that may be interpreted in essentially the same way as those in the orthogonal analysis of a balanced design. The remaining major approach to testing hypotheses in nonorthogonal analysis of variance is the space-wise analogue of t-tests of partial regression coefficients. Effects in each space are tested eliminating all other potential effects. The required sums of squares are:

A / B, C, AB, AC, BC, ABC
B / A, C, AB, AC, BC, ABC
C / A, B, AB, AC, BC, ABC
AB / A, B, C, AC, BC, ABC
AC / A, B, C, AB, BC, ABC
BC / A, B, C, AB, AC, ABC
ABC / A, B, C, AB, AC, BC
We will call this the "full-regression" approach. Needless to say, it is not an additive partition of the corrected total sum of squares. Although this approach has its advocates (Kutner, 1974; Carlson and Timm, 1974; Speed and Hocking, 1976), the reasoning involved is not as clear as in the other approaches. It violates the principle that main-class effects in the fixed-effects model are not interpretable in the presence of interactions. It also incurs the same objections that are raised against t-tests of regression coefficients, namely, that the t for each of two coefficients can be nonsignificant, yet that for either variable will become significant if the other is dropped out of the model. This cannot happen in the hierarchical approach if, as is the practice, spaces are dropped from back to front only. At the level of the orders of effects (i.e., main effects, two-way interactions, three-way interactions, etc.), this is also true of the experimental design solution.
There is an important property of hierarchical partitions that is often overlooked in discussions of non-orthogonal analysis of variance. If the assumptions of the analysis of variance hold, the sums of squares in the hierarchical partition are statistically independent and the corresponding F-tests are independent, given the error term ("quasi-independent"). Thus,
the probability of committing at least one type 1 error among n such α-level F-tests is readily calculated as αₙ = 1 − (1 − α)ⁿ. (With n = 7 tests at α = .05, for example, this probability is about .30.) Thus, the α level of the individual tests can be adjusted to keep the overall type 1 error at the nominal level. In the experimental design and full regression approaches, these calculations are not generally possible because the sums of squares are dependent in complex ways. As a result, the statistical justification of the latter types of solution is compromised. This consideration suggests that the analyst should try to exploit priorities among ways of classification in order to justify a hierarchical partition when designs are unbalanced.
The computer programs in Table 1 resolve the non-uniqueness of the partition of the sum of squares in non-orthogonal analysis of variance in different ways. MULTIVARIANCE and MANOVA give the user the option of reordering hierarchical partitions as many ways as desired in the same problem run. If, for example, only two of three ways of classification are to be treated symmetrically, only two hierarchical orderings would be required for an experimental design solution. SAS and SPSS give the user a choice among types of solution. Thus, the SAS Type I solution gives the hierarchical partition, Type II gives the "experimental design" solution, and Types III and IV give full regression solutions for complete and incomplete designs, respectively. SPSS ANOVA provides a variety of optional orders of elimination, including hierarchical (Option 10), experimental design (default), and full regression (Option 9). Both SAS Type I and SPSS Option 10 permit the user to reorder effects in separate runs, although neither estimates parameters based on a reduced-rank model. SPSS Option 10 appears to be the less versatile, since interactions are always generated automatically, making it impossible to request certain orders. The BMD program, on the other hand, invariably employs the full regression solution and provides no other options. These different approaches to non-orthogonal analyses are discussed in more detail in the program descriptions in Section 4 and the test problem results in Section 5.
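The three conventions are easily exhibited by direct model comparison. The following sketch (ours, in Python with numpy; the unbalanced data and the dummy coding are invented for the example) computes the hierarchical, experimental-design, and full-regression sums of squares for the A main effect of an unbalanced 2 × 3 design by differencing residual sums of squares of nested models.

import numpy as np

rng = np.random.default_rng(0)
a = np.array([0] * 7 + [1] * 5)                      # factor A, unequal ns
b = np.array([0, 0, 0, 1, 1, 2, 2, 0, 1, 1, 2, 2])   # factor B
y = 0.5 * a + 0.3 * b + rng.normal(size=a.size)

I  = np.ones((a.size, 1))                            # constant
A  = (a == 0).astype(float)[:, None]                 # 1 degree of freedom
B  = np.stack([b == 0, b == 1], axis=1).astype(float)    # 2 df
AB = np.hstack([A * B[:, [j]] for j in range(2)])        # 2 df

def rss(*blocks):
    # residual sum of squares after a least-squares fit
    X = np.hstack(blocks)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    r = y - X @ beta
    return r @ r

ss_A_hier = rss(I) - rss(I, A)                   # A
ss_A_expt = rss(I, B) - rss(I, A, B)             # A / B
ss_A_full = rss(I, B, AB) - rss(I, A, B, AB)     # A / B, AB
print(ss_A_hier, ss_A_expt, ss_A_full)

In a balanced layout the three quantities would coincide; with the unequal subclass numbers above they differ, which is precisely the non-uniqueness at issue in this section.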
2. Remarks on analysis of covariance and repeated measures analysis

2.1. Analysis of covariance


R. A. Fisher showed (see Fisher, 1967, p. 272 ff.) that the sensitivity of a randomized experiment could be increased by eliminating from the error estimate any variation that had a nonzero regression on any independent measures available for the observational units. He called these measures "concomitant variables" (now shortened to "covariables") and exemplified them by plot yields in years prior to an agricultural field experiment.


In the general linear model, the covariables are adjoined to the design variables, or their basis, and the coefficients of the regressions of the response variables on them are estimated jointly with the design effects by least squares. If there are l independent design effects and q covariables, the normal equations comprise l + q independent equations in l + q unknowns, the columns of covariable values being assumed linearly independent. The covariables are not assumed pairwise orthogonal or mutually orthogonal to the design variables, however, and a non-orthogonal solution is required even when the experimental design is balanced. Fisher showed that the solution could be obtained expeditiously by calculating the regression coefficients from the residual sums of squares and products of the covariables and response variables jointly, then using these estimated coefficients to adjust estimates of the design effects obtained in the usual way, ignoring the covariables. Two hierarchical programs, MULTIVARIANCE and MANOVA, make use of the fully nonorthogonal generalization of this procedure (see Bock, 1975, Section 5.5), while SAS Procedure GLM solves the augmented normal equations directly. In SPSS the covariates are not adjoined to the design matrix but appear separately in the linear model as coefficients of the vector of regression parameters.
Tests of hypotheses are associated with both the regression analysis and the covariance adjustment step of the analysis. Consider, for example, an A × B design with covariable (or covariables) symbolized by X. Then the regression sum of squares is designated

X / A, B, AB
when all possible design effects have been eliminated in calculating the residual sums of squares and cross-products in Step 1. The "reduced" residual for the response variables with X included in the model is the error term for testing the hypothesis of no regression by means of the above sum of squares. If it is accepted that X should be included in the model, the various design effects can be tested using the corresponding "adjusted" sums of squares. If the design is unbalanced, these adjusted sums of squares will depend upon the type of elimination used in the analysis. The possibilities are as follows:

Hierarchical

A / X
B / A, X
AB / A, B, X


Experimental design

A / B, X
B / A, X
AB / A, B, X


Full regression

A / B, AB, X
B / A, AB, X
AB / A, B, X


Of these, only the hierarchical solution yields an additive partition and independent sums of squares. Note, however, that because this part of the analysis and the regression analysis represent two different orders of elimination in the larger nonorthogonal model comprising both design and covariables, the tests of hypotheses in the two parts are not in general statistically independent.
For the programs we are discussing here, the number of different types of tests of hypotheses is further increased by other conventions for eliminating regression and design effects. The MULTIVARIANCE and MANOVA programs, in any one ordering of effect spaces, produce the straightforward regression analysis eliminating all design effects and a hierarchical partition of design effects eliminating X, as above. But MULTIVARIANCE goes further, with an option to partition the regression sum of squares in a user-specified order of the covariables. For example, for two covariables X1 and X2, mean squares and stepwise regression tests based on

X1 / A, B, AB
X2 / A, B, AB, X1
are computed. Both programs give the complete preliminary joint analysis of variance of the response variables and covariables and permit tests of hypotheses on the response variables ignoring the covariables.
The analysis of covariance, as implemented in SAS and SPSS, differs fundamentally from the Fisherian approach. The crucial difference is that covariates and design variables are treated identically in SAS and SPSS, while MULTIVARIANCE/MANOVA assumes that the covariates may only explain residual (or within-subclass) variation. Thus, the analysis of regression in SAS and SPSS is not, in general, confined to the error space,
and different types of covariate adjustments (or possibly none) are used in testing design effects. SAS GLM TYPE I, in adjoining the covariates to the design variables, permits either the covariates or the design variables to appear first in the order of pivoting. When the design effects are ordered first, the analysis of regression agrees with the Fisherian analysis, but the design effects are not adjusted for the covariates even though they are tested using the reduced error term. Presumably, this part of the analysis would not be interpreted. When the covariates are ordered first, the regression analysis is a one-sample analysis in which the subgroup structure is ignored. The tests of the adjusted design effects will not, in general, agree with the Fisherian analysis because the covariate adjustment is not uniformly based on a within-groups regression. For the first main effect, the one-sample regression is used in computing the adjusted sums of squares. For the second main effect, a regression eliminating group means associated with the first way of classification is used, and so on. This analysis is also available as SPSS Option 10. SPSS ANOVA Options 8, 10 and 7, 10 eliminate only the design main effects from the sums of squares for regression, i.e.,

X / A, B

The main effects are tested without adjusting for the covariates (as in SAS Procedure GLM TYPE I with covariates last), and the interactions are tested after eliminating within-group regression effects. The SPSS "experimental design" options--7, 8 and default--do not report the within-groups regression. SPSS Options 7 and 8 give regressions eliminating design main effects only, and the SPSS default reports a one-sample regression. All three options test the interaction after adjusting on the basis of the within-groups regression. The default and Option 7 test main effects after adjusting on the basis of a regression eliminating the remaining main effects, but Option 8 tests main effects without any covariate adjustment. SAS GLM TYPE II is identical to the SPSS default except that a within-groups regression is reported.
These non-Fisherian analyses share a logical difficulty. The solutions which report one-sample regressions, or regressions eliminating main effects only, assume the nullity of certain (or all) design effects. They then proceed, however, to test these effects using a covariate adjustment which fails to take into account mean differences associated with the effect under test. If a user wishes to assume that certain effects are not significant, a more rational procedure would be to pool the appropriate sums of squares into the error term and proceed with a Fisherian solution. This option is available in MULTIVARIANCE and MANOVA by using the residual
from the model as the error term instead of the common within-group error estimate.
Following the same philosophy as in analysis of variance, BMDP2V employs the full regression solution without options. This solution is also available as SAS Types III and IV and SPSS Option 9. The test of regression eliminates all design effects, and the tests of given design effects eliminate the covariables and all possible other design effects. All of the programs except MANOVA provide individual tests of covariables in the regression analysis. Various optional orders of elimination of covariables and design effects are illustrated in Example 2 of Section 5.
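The Fisherian computation itself is compact. The following sketch (ours, in Python with numpy, for a one-way design with a single covariable; it is not code from any of the programs reviewed) shows the essential steps.

import numpy as np

def fisher_ancova(y, x, g):
    # Pooled within-group sums of squares and products of the
    # covariable x and the response y.
    exx = exy = eyy = 0.0
    for k in np.unique(g):
        xk, yk = x[g == k], y[g == k]
        exx += ((xk - xk.mean()) ** 2).sum()
        exy += ((xk - xk.mean()) * (yk - yk.mean())).sum()
        eyy += ((yk - yk.mean()) ** 2).sum()
    beta = exy / exx                      # within-group regression
    ss_error_adj = eyy - exy ** 2 / exx   # reduced error sum of squares
    # treatment means adjusted to the grand mean of the covariable
    adj = {k: y[g == k].mean() - beta * (x[g == k].mean() - x.mean())
           for k in np.unique(g)}
    return beta, ss_error_adj, adj

The point to notice is that beta is computed entirely from the within-group (error) sums of squares and products, so the same covariate adjustment applies whatever design effect is under test.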

2.2. Repeated measures analysis

When subjects are measured more than once with respect to the same response variable under different conditions, the study can be regarded either as a univariate mixed-model design with subjects as the random way of classification or as a multivariate fixed-effects design with the repeated measures representing a special class of vector response with commensurate components. Although the two formulations can be made to yield the same tests of the differences between conditions, of differences between classes of subjects, or of interactions of conditions and classes of subjects, the multivariate treatment is more general because it allows an arbitrary correlation of residuals over the repeated measures. This includes in particular the possibility of an autoregressive structure of these correlations when the residuals are due to failure of fit of a trend model.
The multivariate approach to repeated measures is based on S. N. Roy's (1957) formulation of the multivariate linear model, in which the estimable multivariate linear parametric functions have the structure

Θ = LΞS,

where L is, say, an l × m "prefactor" of rank l and S is, say, a p × s "postfactor" of rank r, p being the number of repeated measures. If r = p, the minimum variance unbiased estimator of Θ is

Θ̂ = (K'DK)⁻¹K'DȲ T(T'T)⁻¹ = (K'DK)⁻¹K'DȲ (T')⁻¹,

where KL = A, an n × m design matrix of rank l; D is a diagonal matrix of the numbers of subjects in the subclasses of the design corresponding to the rows of A; Ȳ is the n × p matrix of observed subclass means; and ST = B is a p × p matrix of a response model for the repeated measures (the "design" on the
variates). In particular, if the response model is one of polynomial trend, then a suitable choice of T is such that T⁻¹ = P', where P is the matrix of Fisher-Tchebycheff orthogonal polynomials of degree p − 1. Merely transforming the vector observations before performing the multivariate analysis of variance of the sample group differences gives mean squares for the constant, linear, quadratic, etc., trend variables that equal exactly the univariate mixed-model mean squares when the latter are resolved into single-degree-of-freedom trend components. (See Timm, 1975; Bock, 1963b, 1975, Ch. 7; Bock, 1979.) Alternatively, if the mixed-model assumptions are not met (i.e., if the repeated-measures residuals do not become uncorrelated when transformed by P'), the multivariate test statistics may be used in place of the univariate F's to test differences in mean trend between sample groups, and possibly also overall mean trend. (See Bock, 1975, Ch. 7, and Bock, 1979.)
For s < p, however, this approach is only fully efficient when the last p − r transformed residuals are uncorrelated with the s transformed residuals retained in the trend analysis. Following Potthoff and Roy (1964), we may achieve full small-sample efficiency, however, by using the estimator

Θ̂ = (K'DK)⁻¹K'DȲ Σₑ⁻¹T(T'Σₑ⁻¹T)⁻¹,

where Σₑ is the population residual covariance matrix. Unfortunately, Σₑ is not generally known, and the best we can do in most cases is settle for large-sample efficiency by substituting the estimated residual covariance for Σₑ, thus obtaining the maximum likelihood estimator of Θ (see Khatri, 1966). In effect, this is a large-sample weighted repeated-measures analysis, in contrast to the preceding small-sample unweighted analysis.
Among the computer programs, BMDP2V is set up to perform the univariate mixed-model analysis, including polynomial trend analysis. The mixed-model analysis can also be done by MULTIVARIANCE, SAS GLM, and MANOVA, but the calculations can be time consuming in the nonorthogonal case if the random dimension is larger than about 50 (subjects). In the multivariate treatment, however, the subjects are merely replications within sample groups, and their number may be all but unlimited with very little effect on computing time. Because of its extensive facilities for generating the variate transformation matrix, the MULTIVARIANCE program is the most convenient for the general multivariate repeated measures analysis. With provisions for extracting the univariate mean squares, MULTIVARIANCE VI also efficiently obtains the univariate solution from the multivariate results when the test of no association in the transformed residual matrix, provided by the program, indicates
that the mixed-model assumptions obtain. Because BMDP2V employs the full regression solution, users who favor the hierarchical or experimental design methods have at present no choice but to obtain the univariate results from a multivariate analysis in MULTIVARIANCE if mixed-model results are desired. The weighted multivariate repeated measures analysis can be performed by MULTIVARIANCE VI or by ACOVSM. A comparison of a weighted and an unweighted analysis appears in Bock (1979).
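The variate transformation underlying the multivariate treatment is easily sketched (our Python/numpy illustration; the occasion points are invented, and the QR construction is one convenient way, among others, to obtain orthonormal polynomial contrasts).

import numpy as np

def orthopoly(t):
    # Orthonormal polynomial contrast matrix P (p x p) for the
    # occasion points t: columns are the constant, linear,
    # quadratic, ... trends, with P'P = I.
    t = np.asarray(t, dtype=float)
    V = np.vander(t, t.size, increasing=True)   # 1, t, t^2, ...
    Q, R = np.linalg.qr(V)
    return Q * np.sign(np.diag(R))              # fix column signs

P = orthopoly([1, 2, 3, 4])
# If Y is an N x p matrix of repeated measures, Z = Y @ P has
# columns carrying the constant, linear, quadratic, ... trend
# scores; an ordinary analysis of variance of each column of Z
# reproduces the single-degree-of-freedom mixed-model trend mean
# squares when the mixed-model assumptions hold.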

3. Comments on two special purpose programs

Before proceeding to the summaries of the programs in Table 1, we comment briefly on the numerical methods used by the two special purpose programs, BMDP3V and ACOVSM.
3.1. BMDP3V

The BMDP3V program, now in the BMDP package, is an attempt to fill a long-standing need for estimation of fixed effects and variance components in unbalanced mixed-model designs. Many solutions of this difficult problem have been proposed, and the authors have chosen to implement the maximum likelihood (ML) solution along lines developed by Hemmerle and Hartley (1973), Harville (1977), Corbeil and Searle (1976), and others. The algorithm used is described in Dixon and Brown (1977) and Jennrich and Sampson (1976).
A theoretical disadvantage of ML estimation is that it does not account for the loss of degrees of freedom due to estimating the fixed effects. As a result, the ML estimators are biased downward and do not agree with ordinary least-squares (OLS) (i.e., unbiased quadratic) estimates. This problem has been eliminated by the development of restricted maximum likelihood (REML) estimation by Patterson and Thompson (1974). The REML estimates agree with the OLS estimates in the orthogonal case. The general model is the same as that considered by Hemmerle and Hartley (1973) and Harville (1977):
y = Aα + U1b1 + ... + Ucbc + e,

where y is an N × 1 response vector, A is an N × p design matrix, α is a p × 1 vector of unknown parameters corresponding to the fixed effects in the model,
Ui is an N × qi design matrix corresponding to the ith random effect, bi is a qi × 1 vector of unobservable random effects distributed N(0, σi²I), and e is an N × 1 vector of NID(0, σ²) residual errors. The random vectors b1, b2, ..., bc and e are assumed independent. To be estimated are the σi² and σ², the variance components, and α1, α2, ..., αp, the fixed effects.
The A matrix is generated automatically from instructions provided in the control cards. The first column of A is a vector of ones; the next columns of A are generated by the program. For a main effect with a levels, the program will generate a − 1 mathematical variables contrasting the first a − 1 levels with the last class. The program will, if requested, also generate the mathematical variables corresponding to the interactions. The last columns of A may be covariates, which must be read in as data. The columns of the matrices Ui consist of dummy (0, 1) variables indicating group membership. If a factor with a levels is a random way of classification, a dummy variables will be generated. The program can also generate appropriate matrices for crossed and nested designs of any complexity.
The maximum likelihood procedure for estimating the fixed effects and variance components is based on a combination of Fisher's scoring method and the Newton-Raphson algorithm. The initial scoring steps are continued until the change in likelihood between steps becomes less than 1. At that point the Newton-Raphson steps begin. At each stage, boundary constraints are imposed to restrict the σi² and σ² to be non-negative. REML estimates are computed in two stages: first the variance components are estimated by maximizing the likelihood of the least-squares residuals obtained from the regression of y on A. This likelihood does not depend on the αi. The fixed effects are then estimated by maximizing the likelihood with respect to α, holding the variance components fixed at the values obtained at the first stage.
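The two-stage REML idea can be conveyed by the simplest case, a balanced one-way random-effects layout, in which the REML solution reduces to the familiar unbiased mean-square estimators subject to the non-negativity bound. The sketch below (ours, in Python with numpy) covers only this special case, not the general iterative algorithm of BMDP3V.

import numpy as np

def reml_oneway_balanced(y, g):
    # y_ij = mu + b_i + e_ij, with a groups of common size n.  In
    # the balanced case REML agrees with the OLS (unbiased
    # quadratic) estimators, as noted in the text.
    groups = np.unique(g)
    a = groups.size
    n = (g == groups[0]).sum()
    means = np.array([y[g == k].mean() for k in groups])
    msw = sum(((y[g == k] - y[g == k].mean()) ** 2).sum()
              for k in groups) / (a * (n - 1))
    msb = n * ((means - means.mean()) ** 2).sum() / (a - 1)
    sigma2_e = msw
    sigma2_b = max(0.0, (msb - msw) / n)   # boundary constraint
    return sigma2_b, sigma2_e
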
3.2. ACOVSM

ACOVSM (Analysis of COVariance Structures including generalized MANOVA) is an extremely general multivariate program that permits both a linear fixed-effects model to be assumed for the matrix of subclass means and a wide class of component structures to be assumed for the error covariance matrix. In those cases where the number of components estimated in the covariance structure is less than the number of distinct variances and covariances (p(p + 1)/2), it is to be expected that the ACOVSM estimates of the design fixed effects will be more efficient (large-sample efficient) than the corresponding MANOVA estimates.


The main virtue of ACOVSM is its ability to handle special, unusual, and nonstandard analyses that are impossible on other programs. It is ideally suited for two closely related kinds of applications connected with the analysis of variance. The first application, called analysis of covariance structures, is applicable when there is a factorial combination of treatments in the within-subjects part of the design. The fixed effects are estimated in Ξ and the variance components in the diagonal of Φ. The error variance is estimated in the matrix Θ², which may be general diagonal or constrained to be of the form σ²I. The matrix Φ may be either diagonal or symmetric. Analyses of this kind are discussed in Bock and Bargmann (1966), Jöreskog (1970a and b, 1973, 1974, 1979), Schiefley and Schmidt (1978), and Wiley, Schmidt and Bramble (1973). They amount to the multivariate analogue of mixed-model estimation of variance components, but ACOVSM allows the user to make weaker assumptions about the form of the model. The second application is in the area of growth curve analysis. If the residuals from the growth curve follow a simplex structure, this covariance structure may be estimated in ACOVSM using the methods given in Jöreskog (1970b, 1973) while simultaneously estimating the fixed effects in Ξ. Section 5.4 gives an example of such an analysis.
Method of Solution. The ACOVSM program considers a general data matrix, Y, containing N observations on p (p < 15) variates. The rows of Y are assumed independent and multinormally distributed with

E(Y) = AΞB,

where A (N × g) and B (h × p) are known "design" matrices of ranks g and h, respectively, and Ξ is a matrix of parameters. The covariance matrix Σ has the form

Σ = T(ΛΦΛ' + Ψ²)T' + Θ²,

where the matrices T (p × q) and Λ (q × r), the symmetric matrix Φ (r × r), and the diagonal matrices Ψ² (q × q) and Θ² (p × p) are parameter matrices. If Ξ and Σ are unconstrained, the ACOVSM model reduces to the conventional manova. In this case (designated by Jöreskog as the "standard" case), hypotheses of the form

LΞS = 0
may be tested, where L (l × g) and S (h × t) are known matrices of ranks l and t, respectively. A unique feature of ACOVSM is that the program may also be used to test hypotheses concerning the mean vector and covariance matrix when Ξ and/or Σ are constrained to be of a particular form. In the standard case, hypothesis testing proceeds by reading in one or more sets
of L and S matrices in a single run of the program. ACOVSM then computes the solution analytically. In the general case, hypothesis testing is done by comparing chi-square statistics from separate runs of the program. Here the program solves a non-linear optimization problem using quasi-Newton methods. In the general case, parameters may be of three kinds: (1) free parameters, which are to be estimated from the data; (2) fixed parameters, which are assigned values prior to optimization; and (3) constrained parameters, which are specified to be equal to one or more other parameters to be estimated. The references specifically dealing with ACOVSM are Jöreskog (1970a, 1973, 1974, 1979). Jöreskog (1970b) deals with the estimation of simplex models while ignoring the mean vector. These models may be combined with various structures on the means in the case of growth curve analysis.
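The covariance structure itself is simple to assemble, which may help fix the roles of the parameter matrices (our Python/numpy sketch with invented dimensions; it illustrates the model only and is not ACOVSM code).

import numpy as np

def acovs_sigma(T, Lam, Phi, psi2, theta2):
    # Sigma = T (Lam Phi Lam' + Psi^2) T' + Theta^2, with Psi^2 and
    # Theta^2 supplied as vectors of diagonal elements.
    inner = Lam @ Phi @ Lam.T + np.diag(psi2)
    return T @ inner @ T.T + np.diag(theta2)

# p = 4 repeated measures, q = 2, r = 1 (all dimensions invented):
T = np.column_stack([np.ones(4), [0., 1., 2., 3.]])   # p x q
Lam = np.array([[1.0], [0.2]])                        # q x r
Phi = np.array([[0.5]])                               # r x r
Sigma = acovs_sigma(T, Lam, Phi, psi2=[0.1, 0.1], theta2=[0.3] * 4)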

4. Program summaries

These summaries sketch the main features of the programs. For greater detail, the reader will have to refer to the program user's guides.

4.1. BMDP2V

This univariate general linear model program includes analysis of covariance and analysis of repeated measures for crossed classifications only. Cell frequencies may be equal, unequal or zero; in the latter case, the user must indicate which degrees of freedom are inestimable. Covariates may be input and tested in the regression analysis. The test of parallelism of regression planes is not included. Multiple dependent variables may be read in, but only univariate test statistics are computed. Standard output includes the observed cell means and standard deviations and the ANOVA table. For repeated measures data the program will optionally test the assumption of compound symmetry using the sphericity test (Anderson, 1958, p. 259). If requested, the program will do a trend analysis on all within-subjects factors; if spacing is unequal, the user may supply a metric. There is no provision for user-selected marginal means corresponding to main effects or lower-order interactions, estimated means based on fitting a reduced-rank model, or parameter estimates corresponding to the design effects.
Method of Solution. The design matrix, including dummy variables corresponding to all possible interactions, is generated automatically from the design specifications. Sums of cross-products of covariates and design variables are pivoted, and sums of squares for hypotheses are obtained from differences of residual sums of squares with effects in and out of the model. (See Searle, 1971a.)


Tests of Hypotheses. The author of P2V apparently has taken the position that the only correct tests for nonorthogonal designs are the full regression tests, i.e., all other effects in the model are eliminated prior to testing the effect in question (see Section 1.2). The manual refers to Kutner (1974) and Speed and Hocking (1976) in support of this choice.
Use of the Program. The program uses free-format control statements in which keywords describing the problem are grouped into "paragraphs". The design is indicated in the DESIGN paragraph and may be expressed in one of two ways. Variables may be designated as GROUPING (i.e., ways of classification), DEPENDENT, or COVARIATES. For repeated measures the user must also indicate the number of LEVELS of the within-subjects factors. For example, a one-way-between, one-way-within design with four levels of the within-subjects factor would be designated:
DESIGN GROUPING IS 1. DEPENDENT IS 2 TO 5. LEVELS ARE 4./

The program would then automatically treat variables 2 to 5 as levels of a single within-subjects factor. If the LEVELS keyword were omitted, P2V would regard these four variables as response variables and carry out four separate one-way analyses. Group membership must be coded as variables; the program will not define the groups according to the order in which the data are read in. Alternatively, the design may be designated by the keyword FORM on the design card. The above example would be described as

DESIGN FORM IS 'G,4(Y)'./

This indicates that the first variable is a grouping variable and the next four are dependent variables. The parentheses indicate repeated measures. If the statement were

DESIGN FORM IS 'G,4Y'./,

the program would regard the four measures as distinct and do separate univariate analyses. For a one-way-between, two-way-within design with two and three levels of the within-subjects factors, the design formula would be:

DESIGN FORM IS 'G,2(3(Y))'./

Incomplete designs may be analyzed by first generating the full model,
including inestimable effects, then telling the program to delete degrees of freedom by inserting an INCLUDED or an EXCLUDED keyword listing estimable and inestimable effects, respectively.
Limitations. Missing from the program is an option to compute selected marginal means and parameter estimates. Another important limitation is the absence of any way to specify nesting relationships between factors or to indicate random ways of classification.
Documentation. Dixon and Brown (1977).

4.2. SPSS ANOVA


This univariate general linear model program is able to analyze crossed designs with up to five factors and covariates, and up to five dependent variables (but computes only univariate analyses). Cell frequencies may be unequal, but all cells must be non-vacant. The program does not handle repeated measures or nested factors, and, in the case of covariance analysis, does not test for parallelism of regression lines.
Method of Solution. ANOVA is a general linear model program that brings the design matrix up to full column rank via an implicit reparameterization (see Section 1.1). The program then solves the normal equations and obtains the necessary sums of squares by subtracting the sum of squares due to a given restricted model from the sum of squares due to the full model.
Tests of Hypotheses. The program appears to have been influenced mainly by the work of Overall and Spiegel (1969), which is the only reference on the topic given in the manual. ANOVA offers all three of the types of analyses of unbalanced designs described in Section 1.2. For the analysis of covariance, SPSS ANOVA offers 7 possible solutions obtained by taking various permutations of the order of main effects, interactions, and covariates, and combining them with either the full regression, experimental design, or hierarchical methods of testing effects. Many of these solutions would have to be described as "non-standard." (See Section 5.2.)
Use of the Program. The procedure is called up with an ANOVA card containing the list of dependent variables, the list of factors, and (possibly) the covariates. Following this card, the user may insert an OPTIONS card, and possibly a STATISTICS card. These two cards are used to request optional output, to specify how to handle missing data, and to indicate how to handle non-orthogonal designs and/or covariates. Each of these options is indicated by a numerical code. ANOVA is one of the easiest programs to set up, but two input conventions detract from the overall convenience. The input is not free format (variable lists and options must be entered on or after column 16 on the appropriate control card), and program options are
requested by entering a numerical code on the appropriate card, rather than by some mnemonic device. For example, observed means are requested by the command

STATISTICS 3

(with the "3" in column 16) and the full regression and hierarchical solutions are requested by OPTIONS and OPTIONS 10, 9

respectively.
Limitations. The greatest limitations are the small problem size permitted and the lack of a clear explanation of the rationale for the optional tests of hypotheses in analysis of covariance. The program does not report parameter estimates, and reorderings must be accomplished in separate runs. We understand that SPSS is working on a much improved MANOVA program which will be able to handle repeated measures and multivariate outcomes, but this program and its documentation were not available to us at the time of this review.
Documentation. Nie et al. (1975).

4.3. BMDP3V
This special purpose mixed-model univariate program estimates fixed effects and variance components by the method of maximum likelihood, including restricted maximum likelihood (REML). The program handles the non-orthogonal case and vacant subclasses. Restrictions are imposed which insure that the variance component estimates are non-negative. The motivation for developing a maximum likelihood program appears to be to handle the non-orthogonal random-effects case. The program is also useful for orthogonal cases when one or more variance component estimates would be negative if the conventional mean-square estimators were used.
Method of Solution. See Section 3.1.
Tests of Hypotheses. Hypothesis testing proceeds by fitting alternative models in which one or more parameters are constrained to equal zero. The likelihood ratio is then tested using a chi-square approximation.
Use of the Program. The analysis of variance design is described in a DESIGN paragraph. The various restricted models to be fitted are specified on HYPOTHESIS paragraphs.


The design paragraph is used to identify the dependent variable, the fixed and random factors, and covariates (if any), and to generate the lines in the ANOVA table. Factors are labelled either FIXED or RANDOM, and a separate designation is needed for every effect to be estimated (i.e., interactions are not automatically generated). The following design paragraph designates a one-way-between, one-way-within ANOVA:

DESIGN DEPENDENT IS RT.
   FIXED IS DOSE. FIXED IS TRIAL. FIXED IS DOSE, TRIAL.
   FNAMES ARE DOSE, TRIAL, 'DOSE*TRL'.
   RANDOM IS DOSE, SUBJECT. RNAME IS 'SUB/DOSE'./

The keywords FNAMES and RNAMES are used only to label lines in the ANOVA table; it is the FIXED and RANDOM instructions that actually generate the design matrices. As is evident, there is no design formula as such; all terms tested must be given explicitly and (optionally) labelled. To specify a restricted model, the user includes a HYPOTHESIS paragraph which contains the name of the fixed or random effect to be set to zero. For example, a test of the nullity of the dose by trial interaction would be requested by the card

HYPOTHESIS FIXED IS 'DOSE,TRL'./

In the version of the program we tested, the HYPOTHESIS paragraph could not be used if REML estimates were being computed. In this case, hypothesis testing would have to be done in multiple runs.
Limitations. The program requires that each data point be entered on a separate data record and be identified by the values of the between- and within-subjects factors. That is, the program does not accept multiple scores per data record. This is an annoying convention that typically forces the user to prepare a separate data deck before analysis with P3V. Although a general maximum likelihood solution for the estimation of variance components is much needed, it appears that P3V must be regarded as a preliminary version of what could eventually become an extremely useful program. Two cosmetic features need attention: the lack of any design formula makes it tedious to specify standard designs, and the requirement that lines in the ANOVA table be labeled independently of the design specifications seems an unnecessary complication. The requirement that only one score per data record be entered makes the input
format incompatible with P2V and MULTIVARIANCE (for example) and seems out of place in the BMDP package. Most importantly, even in the small test problem (see Problem 3, Section 5) the cost of the general maximum likelihood solutions seems exorbitant in comparison with the other analyses. Improvements in the efficiency of the iterative procedure are needed before the program can be used routinely for the analysis of problems of moderate size.
Documentation. Dixon and Brown (1977), Jennrich and Sampson (1976).

4.4. MULTIVARIANCE

This program is actually an integrated multivariate package oriented around analysis of regression, analysis of variance, and covariance analysis. In addition to these functions, the program has facilities for recoding and transforming variables, computing and punching descriptive statistics, performing principal components analysis on the error covariance matrix, discriminant analysis on each between-group hypothesis, canonical correlation between the dependent and concomitant variables, tests of parallelism of regressions, tests of independence among multiple variables, and weighted and unweighted repeated measures analysis.
Method of Solution. MULTIVARIANCE brings the design matrix up to full column rank by reparameterization. The original design matrix is factored into the product of a row and a column basis. The row basis contains user-selected contrasts among the original parameters. Estimation is by modified Gram-Schmidt orthogonalization of the basis rather than by pivoting of cross-products (see Section 1.1). If covariates are input, the program does the non-orthogonal extension of the Fisherian approach to analysis of covariance (see Bock, 1975, Chapter 5). The test of parallelism of regression hyperplanes is performed if requested. The program is written in dynamic storage and can handle up to ten factors with any number of levels, and any number of dependent variables and covariates. Any number of empty cells is permitted, but the user must delete inestimable degrees of freedom manually.
MULTIVARIANCE includes a very complete "estimation phase" which estimates parameters based on fitting a model of user-specified rank. It also reports t-tests on the parameter estimates and (optionally) gives estimated cell and marginal means. If a covariance analysis is requested, the estimation phase is repeated after adjusting for the concomitant variables. If the rank of the model to be estimated is less than the rank of the basis, the program will optionally compute cell residuals and their standard errors.


MULTIVARIANCE contains special features that facilitate the multivariate analysis of repeated measures data (Bock, 1975, Chapter 7; Finn, 1969). The user may have the program automatically generate design matrices corresponding to the designs on the sample and responses, respectively. The program will then regard multiple scores on the same data record as having arisen from multiple levels of one or more within-subjects factors and proceed with the general multivariate analysis. The program also contains options for testing compound symmetry and extracting the mixed-model results from a multivariate analysis.
The program may be used to compute correlation and covariance matrices for each group, weighted or unweighted marginal means, single-degree-of-freedom tests of user-selected contrasts, Roy-Bargmann "step-down" F-statistics, and rotations of canonical coefficients by the method of Cliff and Krus (1976), and it contains provisions for doing a large-sample weighted least-squares analysis of repeated measures data (see Section 2.2).
Use of the Program. The MULTIVARIANCE program itself requires fixed-format control cards. However, a preprocessor known as MULTISTAT has been prepared which generates the MULTIVARIANCE deck setup from free-format control cards. We will discuss the use of MULTIVARIANCE through MULTISTAT. The program offers three ways of specifying the design. A design formula which follows the rules of Nelder (1965) is, in general, the simplest way to do this, but the user may also indicate the terms in the linear model using the notation of Cronbach et al. (1972). Finally, a notation for the columns of the basis introduced by Bock (1963a, 1975), called "symbolic basis vectors," may also be used. For any given problem, the user only needs to use one of these ways of specifying the design.
A Nelder-type design formula is indicated on a GS-DESIGN card. Crossing is denoted by "*" and nesting by "/". A crossed three-way design would be designated
A*B*C;

A completely nested design would be GS-DESIGN


A/B/C;

From instructions of this type, the program will automatically generate columns of the basis corresponding to the main effects and all interactions. By default, the program will do blockwise testing of all terms in the order in which they were generated. Optionally the user may independently reorder columns in the basis a n d / o r partition degrees of freedom any way (e.g., single-degree-of-freedom tests of orthogonal contrasts).

Univariate and multivariate analysis of variance

729

The design may also be conveyed to the program on the CS-MODEL card. Here the effects in the linear model are written down in the order in which significance testing is to be done. The constant is indicated by ' T ' and additions to the model are separated by " + " . A main effects 2-way model would be GS-MODEL I + A + B;

The " , " is used to indicate interaction ( A , B ) , but nesting is denoted by the ":" in the model, which is read "within." For example, the design GS-DESIGN A/B/C;

is indicated on the C S-MODEL card as ~S-MODEL I+A+B:A+C:B:A;

The third way of communicating the design is via tile use of the so-called "Symbolic Basis Vectors" (see Bock, 1975, Chapter 5). This is done by including an CS-BASIS card. Using this notation, each column of the basis is designated by a letter which indicates the type of contrast it corresponds to (C= control; D=deviation; H=helmert; P=polynomial) and a number which indicates the particular degree of freedom. The symbol "0" is used to designate the grand mean. For example, " P I " denotes the first (linear) orthogonal polynomial contrast. SBV's corresponding to each one-way design are written down together so that the column-by-column Kronecker product multiplication needed to construct the basis for the whole design is made explicit. A 2 3 crossed design with helmert contrasts is denoted by H0, H0, H 1, H0, H0,2HI, H1,2H1, Grand mean A effect B effects AB effects

The commas separating SBV's denote in effect the Kronecker product. This method of specifying the design, though more detailed, is the most versatile: any type of standard or nonstandard crossed a n d / o r nested design may be specified using SBV's. For repeated measures analysis, the corresponding GR-DESIGN, CRMODEL, and CR-BASIS options for the within-subjects design are available. The program has internal provisions for computing main effect and interaction tests for the whole plots by crossing the sample and response designs in the appropriate way. Optionally, the user may override this

730

R. Darrell Bock and David Brandt

feature for special cases (e.g., trend analysis via orthogonal polynomials). Tests of Hypotheses. See Sections 1.2 and 2.1. Limitations. The main limitation of the present version of M U L T I V A R I A N C E is the restriction to one choice of error term in any one problem run. This sometimes makes multiple runs necessary and increases the cost of the analysis. Another less noticeable flaw is the inability of the program to generate meaningful contrasts for effects in unbalanced nested designs (because the numbers of subordinate classes vary from one supraordinate class to another). This does not affect the tests of hypotheses, but it renders the estimated contrasts uninterpretable. Admittedly, interpretation of contrasts of nested classes is seldom needed, but it could be provided merely by constructing Helmert contrasts in the reverse direction. Such contrasts would always be defined as the difference of the 2, 3 ..... n class versus the mean of the preceding classes for any n > 2. As to cosmetic features, much of the optional output is controlled by a single parameter on the input description card. It appears that the user could be spared some unwanted and unnecessary output if some items of output could be suppressed or released independently. Documentation. Bock (1963a, 1964), Finn (1974, 1978a, 1978b), Finn and Mattsson (1978). 4.5. MANOVA and O SI R I S M A N O V A MANOVA is another implementation of Bock's (1963a) formulation of the analysis of variance. We discuss two versions: the most recent version of the stand-alone program, MANOVA, written by Elliot Cramer (1974), and the sub-program M A N O V A in the package program known as OSIRIS [Organized Set of Integrated Routines for Investigation with Statistics], distributed by the Institute for Social Research and the associated Inter-University Consortium for Political and Social Research (see Table 1). Method of Solution. (Same as M U L T I V A R I A N C E ) Tests of Hypotheses. (Same as M U L T I V A R I A N C E ) Use of the Program. Cramer's program uses a design formula in which a comma is used to designate crossing and the letter W ("within") used to denote nesting. Interactions are indicated by writing main effects together (e.g., AB). Single degree of freedom tests are indicated by number (e.g., A l, A 2), and the " + " denotes pooling, e.g., A B + A C + BC. Both programs generate deviation or Helmert contrasts automatically; others may be read in manually. OSIRIS assumes factors are crossed unless the user chooses to read in the basis manually. For crossed factors, the number of levels and type of

Univariate and multivariate analysis of variance

731

contrast are input, and the program generates the basis automatically. OSIRIS contains versatile data transformation facilities that make it easy for the user to generate the postmatrix for multivariate analysis of repeated measures, providing the design on the responses is not too large. Cramer's version, on the other hand, has no recoding or transformation facilitiesand is not convenient for analysis of this kind of data. Both programs may be set up with few control cards and do not require extensive documentation. The documentation provided is quite adequate and makes the programs attractive for users who do not want to take the time to master M U L T I V A R I A N C E . The OSIRIS package is unequalled in its ability to manage and analyze large data sets. Researchers who have such data may prefer to keep their data in OSIRIS for all their analyses, since the management of large data sets can be a formidable task. Limitations. Both programs include the main features of M U L T I V A R I A N C E but lack automated provisions for handling repeated measures data and are not written in dynamic storage. However, they do allow for multiple orders of effects within one run of the program. Cramer's version contains estimation facilities while the OSIRIS version does not. Documentation. Cramer (1974); OSIRIS (1974).

4.7. ACOVSM
ACOVSM (Analysis of Covariance Structures including generalized Manova) is a general multivariate program that allows the user to structure both the mean vector and covariance matrix in terms of other parameters to be estimated. The conventional least-squares analyses may be done in ACOVSM, as a special case of the general model, but the program's unique feature is its ability to test hypotheses concerning the structure of Y and E by fixing one or more parameters a n d / o r constraining two or more parameters to be equal to each other. For these applications the program finds the maximum likelihood estimates by the Fletcher-Powell method. The program is now fairly old, having been prepared by J6reskog, van Thillo, and Gruvaeus at ETS almost 10 years ago (J6reskog et al., 1971), and user conveniences are conspicuously lacking. The program uses fixed format control cards, has limited labeling features, few default options for special cases, and the output is relatively poorly labeled--largely in Greek. ACOVSM accepts up to fifteen dependent variables and up to fifteen between group degrees of freedom (independent design effects and covariates). Use of the Program. Data may be entered into the program in two ways: matrices of cross-products may be read in, or the N (p + g) matrix ( Y : A ) may be read in by rows. The program does not permit the user to

732

R, Darrell Bock and David Brandt

read in a design matrix on the means of subclasses. It also has no provisions for generating the matrices L or S symbolically, even for special cases such as I, I ] , or [ O : I ] . The user may not generate orthogonal polynomials automatically. Altogether, these features make it more tedious to use ACOVSM than the other programs. The need to read in all the L and S matrices manually is especially annoying. Documentation. J6reskog et al. (1971).

4.6. SAS GLM (General Linear Model)


This general purpose multivariate analysis of variance program is based on the generalized inverse approach (Searle, 1971). The program handles crossed and nested designs of any complexity. Cell frequencies may be equal or unequal and any number of empty cells is allowed. If input, covariates are adjoined to the design matrix of dummy variables. The test of parallelism of regression hyperplanes is carried out if it is coded in the design matrix as an interaction term. The program will analyze repeated measures data, but no special provisions for either the mixed-model or multivariate approaches have been made. For the mixed-model analysis, each score must be input on a separate data record and the appropriate error terms specified manually. For the multivariate analysis of repeated measures data, the data must be multiplied by the postmatrix externally. Hierarchical designs and random effects models are handled very effectively since different error terms may be used in the same run of the program. However, random effects are not identified explicitly and the program does not automatically choose the appropriate error term. If the residual covariance matrix is not used as the error term, the user must indicate the appropriate matrix. Standard output includes the univariate anova tables for each variable and, if requested, the multivariate test statistics are printed on subsequent pages. The user may instruct the program to set the intercept at zero, print out the X ' X matrix, observed and predicted values for each observation, and a solution to the normal equations. The user may also request that the contrast matrices used in hypothesis testing be printed in the form of a symbolic notation. Method of Solution. The program automatically generates (0, 1) d u m m y variables indicating group membership according to user provided instructions. No reparameterization is done. If covariates are included they are adjoined to the design matrix at whatever point the user indicates. The normal equations X ' X B = X ' Y are solved by computing a generalized inverse. For each effect to be tested, the program generates an L matrix of contrasts among the parameters such that the rows of L are a

Univariate and multivariate analysis of variance

733

linear combination of the rows of X ' X . Tests of the hypothesis:

LB = 0
are made by computing the quantity:

s s ( r B = o) = ( c s ) ' ( L ( x ' x ) -

c')-I(LB).

(See Searle, 197la, p. 188 ff. Note that the L matrix in SAS is Searle's K'.) The error term chosen by the user is then used in computing the test statistics (see also Section l.l). Handling of Non-orthogonal Designs. In the non-orthogonal case different kinds of tests are conducted by generating L by different methods. SAS implements four ways of generating L and labels these options TYPES's I to IV,. However, in general TYPE's III and IV will agree with each other. TYPE I contrasts produce the so-called stepwise or hierarchical solution. Use of this method results in a completely orthogonal decomposition of the total sum of squares and requires an a priori ordering of effects for interpretation. TYPE II Contrasts adjusts main effects for other main effects, two-way interactions for main effects and all other two-way interactions, etc. This has sometimes been referred to as the "experimental design" approach. Of course, these tests could be obtained by reordering TYPE I tests, but SAS does not have any facilities for reordering effects. TYPE III and IV contrasts give the So-called full regression solution. Each effect is adjusted for all other effects in the model. The manual states that "TYPES III and IV hypotheses will coincide for all effects (when there are no missing cells). When there are missing cells, the hypotheses may differ. By using the G L M procedure, it is possible to study the differences in the hypotheses" (p. 316). Use of the Program. The experimental design is indicated on a M O D E L card by simply listing the dependent and predictor variables to the left and right of an equals sign, respectively. A two-way crossed design with dependent variables Y1 and Y2 could be coded: MODEL YI Y 2 = A B A , B SAS has recently introduced the slash (/) to indicate crossed factors. Thus the statement MODEL YI Y 2 = A / B is equivalent to the first statement. Unless this notation is used, all

734

R. Darrell Bock and David Brandt

interactions must be coded explicitly. The order in which effects are written down is the order in which they appear in X. To indicate nesting, the parentheses is used, e.g., M O D E L YI Y 2 = A B(A); Covariates are simply added on the right side of the equation as predictor variables, e.g., M O D E L Y1 Y 2 = A B A.B X1 X2.

The above statement would generate a design matrix with the covariates occupying the last two columns. The TYPE of solution is also requested on the M O D E L card. The error term to be used will be the residual unless a T E S T a n d / o r a M A N O V A card is included. T E S T pertains to univariate statistics; M A N O V A to multivariate. A M A N O V A card must be included to get any multivariate tests. On either of these cards the user m a y pair an effect to be tested with the selected error term, e.g., TEST H=A E=B(A).

There may be any number of T E S T and M A N O V A statements. Standard output includes univariate A N O V A tables for each T Y P E of contrast and, if requested, multivariate test statistics. The univariate tests are grouped together into standard A N O V A tables and the multivariate tests follow on subsequent pages. Tests of each effect are reported on separate pages. Limitations. Several features of limited value are in the program while some more familiar and useful options are missing. The program does not print out observed cell or marginal means, but it will optionally print estimated means based on fitting the full rank model. There is no possibility of obtaining estimated means based on fitting a reduced rank model. The non-orthogonal extension of Fisher's analysis of covariance is not implemented for TYPES I and II hypotheses, and the manual does not explain what hypotheses actually are being tested. The program does not do any related multivariate computations, such as principal c o m p o n e n t analysis, canonical correlation, and discriminant analysis, although other programs in the SAS package do principal component and discriminant analysis. Documentation. Barr et al. (1976).

Univariate and multivariate analysis of variance


5. Test problems

735

5.].

Non-orthogonal univariate analysis of variance

Data for this problem were taken from an unpublished dissertation by Horner (1968). Ninety female college students served as subjects and were measured on two cognitive tasks, anagrams and arithmetic, under three conditions of competition. In the non-competitive condition subjects worked alone, in the mixed-sex condition, subjects competed with male students, and in the same-sex condition, subjects competed with female students. In addition, the subjects were classified as high or low on a personality inventory intended to measure "fear of success," i.e., the tendency to limit one's performance on intellectual tasks so as not to appear overachieving. Our tests of the multivariate features of M U L T I V A R 1 A N C E , M A N O V A , and SAS G L M indicated that the three programs agree as long as exactly the same multivariate hypothesis is being tested. However, there are important differences among the programs in their handling of non-orthogonal designs and covariates. For reasons of simplicity we illustrate these differences on a univariate problem. In Table 5.1, numbers of subjects and mean achievement on the anagrams task are shown for each subclass of the 2 3 design. Although the subclass sample sizes are only moderately disproportionate, the design is sufficiently unbalanced to yield different sums of squares in the various orders of eliminating effects used by the programs. This is apparent in the F-statistics, each using the c o m m o n within-subclass mean square as denominator, shown in Table 5.2. If the purpose of the experiment were to detect interaction of condition of competition and personality type, the three orders of analysis would give the same answer. If both interaction and conditions were of interest, the hierarchical solution would serve. Since there is no evidence to suggest the existence of condition effects, this Table 5.1 Numbers of subjects (above) and means (below) for the anagrams test for subclasses of a 2 3 design B Classes (conditions) Fear of success High Low Noncompetitive 17 54.06 13 44.53 Mixed-sex competitive 19 53.00 11 47.45 Same-sex competitive 20 54.15 10 48.80

736

R. DarrellBock and David Brandt

Table 5.2 Comparison of F-Statistics from univariate analyses of variance of an unbalanced A x B design Type of solution Hierarchical~ A B IA ABIA,B 10.4t 0.45 0.40 "Experimental design',b A]B B Ia ABJA,B 9.98 0.24 0.40 Full regressionc AIB,AB B [A,AB ABIA,B 9.76 0.33 0.40

aMULTIVARIANCE, MANOVA, SAS I, SPSS option 10. bSAS II, SPSS default. CBMDP2V, SPSS option 9. solution also provides a test of the only significant "effect," that of fear-of-success. Inasmuch as main effects in a fixed-effects design cannot be interpreted in the presence of interaction, the full regression solution, except for the test of interaction, seems irrelevant in this example. 5.2. Non-orthogonal univariate analysis of covariance

In the study described in Section 5.1, three background measures were obtained prior to the experiment: X1 = Scrambled Words tests, X 2 = a Test Anxiety measure, and X3 = a N e e d Achievement measure. If these measures are employed as covariables in analyzing the data in Table 5.1, a great variety of different results are obtained by the programs and their various options. In Table 5.3 we show F-statistics for an hierarchical ordered analysis of covariance and, in Table 5.4, the analysis of covariance for an experimental design ordering. Unlike the results in Table 5.2, F-statistics for the same type of hypothesis do not agree because the programs use different methods of computing the regression analysis and different ways of adjusting for the effects of the covariates. The methods are indicated in the footnotes to the tables. The following is, to the best of our belief, the explanation for disparate results from these analyses. The SAS I hierarchical approach in G L M implements Searle's analysis (1971a, pp. 344-345) in which the covariates are adjoined to the design matrix and the cross-products pivoted in order. The covariates m a y be placed either before or after the design d u m m y variables. We have designated as SAS Ia the analysis which results from placing the covariates last in the order and as SAS Ib the results of placing them first. SAS Ia gives the within-groups regression reported by M U L T I V A R I A N C E and M A N O V A , but the tests of the design effects are different since no covariate adjustment is made even though the m e a n square in the

Univariate and multivariate analysis of variance


Table 5.3 Comparison of univariate F ' s computed by different programs: Non-orthogonal A NCOVA-hierarchical method Source MULTIVARIANCEa/ M A N O V A a, b A SAS Ia a 15.77 15.13 6.45 6.21 0.68 0.36 0.25 7.09 6.85 7.09 6.85 0.68 0.36 0.24 0.12 0.60 2.13 15.42 44.13 0.05 2.04 44.18 0.05 2.04 2.13 49.96 0.02 2.39 2.13 17.45 49.96 0.02 2.39 2.13 14.40 41.03 0.30 1.87 0.24 0.12 Program SAS Ib SPSS10 SPSS[8, 10]/[7, 10]d 15.77 15.13

737

AIB
A IX

AIB,X
B B IA

BIX Bl~,x AB[A,B AB]A,B,X


X XI

o.12

X2]X 1 X3IX 1,X2

"Within-groups regression. b M A N O V A does not report stepwise tests of the covariates; it is otherwise identical to MULTIVARIANCE. cOne-sample regression. aRegression eliminating m a i n effects.

denominator is reduced. SAS Ib carries out a one-sample regression, but uses different types of covariate adjustments in testing the design effects. The type of adjustment depends on the order in which the design effects are tested. For the first main effect in the order, the adjustment is based on the reported one-sample regression. For the second main effect, the adjustment is made using a regression eliminating the mean differences associated with the first factor. The test of the highest order interaction is calculated after eliminating all mean differences. In other words, for the example in Table 5.3, three different types of covariate adjustments were made. This means that the order in which design effects are eliminated affects the results even when the design is balanced. This is true because the order affects the type of covariate adjustment being made. SPSS offers the same analysis as SAS Ib as their option 10 and, for all practical purposes, options 7, 10 and 8, 10 are identical. They differ only in that 7, 10 gives a blockwise test of the main effects and covariates jointly, while 8, 10 gives separate blockwise tests of main effects and covariates. The usual tests of main effects, interactions, and covariates agree exactly. Both options test main effects without adjusting for covariates, covariates after adjusting for main effects, and interactions after eliminating main

738 Table 5.4

R. Darrell Bock and David Brandt

Comparison of univariate F ' s computed by different programs: Non-orthogonal A N C O V A - - "experimental" method Source SAS I P SPSS 7b Program SPSS 8b 15.13 SPSS Default c

AIB

AIB,X
B [A

6.85 0.12
2.13 37.52 0.09 2.04

6.85
0.36

6.85 0.12
2.13 14.40 34.27 0.44 1.87 2.13 17.45 43.61 0.08 2.39

BIA,X
ABIA,B,X
X

0.12
2.13 34.27 0.44 1.87

X 11X2,X3
X 2IX 1, X 3

X3[X1,X2

aWithin-groups regression. bRegression eliminating main effects. COne-sample regression.

effects and covariates. This is an especially peculiar analysis since some of the design effects are not adjusted for the covariates and others are. It is possible to obtain this analysis in SAS G L M by using the TYPE I option and putting the covariates in the middle of the design matrix; however we cannot imagine a circumstance in which one would want to do so. In the "Experimental" approach (SAS II; SPSS 7; SPSS 8; SPSS default) in Table 5.4, SAS II uses a within-groups regression but gives full regression rather than stepwise tests of the covariates. The type of covariate adjustment used in testing design effects, in general, does not correspond to the reported regression analysis. Main effects are adjusted for all other main effects, and the covariate adjustment is made using a regression eliminating the remaining main effects. The tests of main effects in SPSS default and SPSS 7 agree with SAS II, but the reported regression analysis is different. SPSS default reports a one-sample regression, and SPSS 7 reports a regression eliminating main effects only. SPSS 8 gives the regression analysis of SPSS 7 but tests main effects without adjusting for covariates. The interaction is tested after eliminating main effects and within-group regression effects. In the "Full regression" approach (BMDP2V, SPSS 9; SAS III and IV), not shown in the tables, there is little room for variations, and all programs agree. A within-groups regression is reported and each design effect is tested after covariate adjustment and after controlling for all other effects in the model. This corresponds to Fisher's analysis when the design is balanced.

Univariate and multivariate analysis of variance

739

5.3.

A components of variance problem run on B M D P 3 V

To investigate the program's ability to handle a random effects model, we chose an orthogonal problem so we could compare P3V's solution to the OLS estimates. The data are from a three-facet generalizability study r e p o r t e d in Cronbach et al. (1972, p. 33). Seven scientists (subjects) were rated on five traits having to do with creative performance b y each of three senior scientists (raters). All facets are considered r a n d o m in these analyses. A comparison of OLS (BMD08V) and R E M L (P3V) estimates is given in Table 5.5. Table 5.6 gives M L estimates for the full and three constrained models.
Table 5.5 Least square a n d R E M L estimates of components: Cronbach example Source Subject (S) Traits (T) Raters (R) S XT SXR TXR error cpu time cost aBMD08V. bBMDP3V. df 6 4 2 24 12 8 48 Least squares ~ 0.44 0.43 - 0.12 0.32 1.41 0.06 1.18 0.36 sec 46g" REML b 0.47 0.43 0 0.32 1.30 0.05 1.18 (0.64) (0.39) (0) (0.22) (0.58) (0.12) (0.24)

37.t7 sec $6.13

Table 5.6 M L Solution to Cronbach's example problem a Starting values 0 0 0 0 0 0 1 Variance component (SE) for full and constrained models Full 0.38 0.37 0. 0.32 1.30 0.06 1.18 (0.51) (0.34) (0) (0.22) (0.59) (0.12) (0.24) SxT=O SxR=O TxR=O

Source Subjects (S) Trait (T) Rater (R) Sx T SxR TR error X~)

df 6 4 2 24 12 8 48

0.44 (0.57) 0.80 (0.54) 0.38 (0.57) 0.42 (0.35) 0.39 (0.35) 0.39 (0.34) o. (0) 0.05 (0.12) o. (0) o. (0) o. (0) 0.30 (0.22)
1.24 (0.58) 0.01 (0.1]) 1.51 (0.25)
3.00

O. (0) O. (0) 2.39 (0.35)


24.47

1.29 (0.58) O. (0) 1.24 (0.23)


0.32

Note: cpu time for this run was 114.62 s on an IBM 370//168. Total charge was $16.84. Cost of a run for the full model alone was $9.73; cpu time was 63.07 s. aThe least squares solution given in Cronbach is incorrect (Gleser, personal communication).

740

R. Darrell Bock and David Brandt

The most obvious feature of these analyses is actually the computing time necessary to reach a solution. This is a rather small problem ( N = 7) and the OLS solution was computed using essentially a trivial amount of cpu time. The R E M L analysis, which only involved one optimization, took 37.17 s of cpu time on the 370/168 and a single M L optimization took 64 s (the results given in Table 5.6 are from one run which performed four optimizations). To generate a non-orthogonal problem of comparable size, we simply added a two level between subjects factor to the Cronbach example and nested the first three subjects in level one of this factor and the remaining four in level two. This added factor was considered fixed. We then estimated the full model (all interactions) and the three constrained models indicated in Table 5.6. Compared to the analyses of the other test problems, the ML estimation in this analysis was unusually expensive. The run took 206.43 s of cpu time and cost $34.37, $29.80 of which was for cpu time.
5.4. An A C O V S M analysis of a quasi-Wiener simplex model with equal error variances

Table 5.7 presents a reanalysis of the growth curve data from Potthoff and Roy (!964). The data consist of a dental measurement representing the distance from the center of the pituitary to the pteryomaxillary fissure, obtained on eleven girls and sixteen boys, at ages 8, 10, 12 and 14. Since the measures are on the same scale and it is reasonable that error variance is constant, the quasi-Wiener simplex with equal error variances was fitted to S (see J6reskog, 1970b). The model is
E = TD, T' + o2I,

where T is a lower triangular matrix of all l's, D s is a diagonal matrix containing the variances of the independent increments, and 02 is the error variance. The model implies that each variable incorporates a new component of growth; thus it is appropriate for cumulative data such as these. The matrices A (transposed) and P were defined as A, [1 0 1 , 0 -o. -.. 1 1 1 l 0 0 1 1] 3 " 0 1 ..--. 0l

and

P=

-3

-1

A preliminary run indicated that the MLE of Ds, was zero, with a standard error of 20,000, i.e., a boundary had been encountered. Setting

Univariale and multivariate analysis of variance

741

Table 5.7 Potthoff-Roy example: Quasi-Wiener simplex with equal en'or variances 1~= TDs, T'+ dl Model 1. 2. 3. 4. X2 df 10 11 ll 12 X2 diff. 8.36" 5.03 I 1.62* df 1 1 2 Slope and intercepts unequal 10.45 Different slopes 18.81 Different intercepts 15.48 Same slope and intercept 22.07 *p < 0.0l Solution for Model 3 [ 22.80 (0.56) 0.66 (0.07) l = [ 24.86 (0.46) 0.66(0.07)
Ds. = [1.66 (0.29), 0.46 (0.71) 0.92 (0.39) 0"]

d=[1.30 (0.13)] ,,~= 20.83 22.14 22.69 24.20 X~ 21.18 22.23 22.87 23.81

23.46 25.52 23.09 25.72

24,78 1 26.84j 24.09 27.47J 4.45 4.66 J Z= 2.76 2.76 2.97 5.51 2.76 2.97 3.82 5.51

5.03 ] S-- 2.51 3.89 3.64 2.70 6.01 2.51 3.07 3.82 4.62

Notes: Subjects are 11 girls and 16 boys measured at ages 8, 10, 12, and 14. The dependent variable is a "certain" dental measurement.

this parameter to zero and proceeding with a test of the parameters in Z, showed that the linear contrast was sufficient to describe growth, so the crucial questions b e c a m e whether the growth curves were parallel or equal. A d o p t i n g an a-level of 0.01, the comparisons of Table 5.7 suggest that model three (parallel lines of significantly different elevations) is the most reasonable. Table 5.7 gives the solution for M o d e l 3. Evidently the sexes show the same pattern of growth, though boys are larger b y a constant value. There appears to be m o r e individual difference variation in initial level than in any of the increments. A C O V S M c o m p u t e d this a n d other example problems remarkably efficiently given the additional complexity of the iterative solution Cpu time on the I B M 3 7 0 / 1 6 8 was on the order of 0 . 6 t o 1.1 seconds for a single run which included both s t a n d a r d and general analyses. Cost of a typical run was on the order of 80 cents, most of which was I / O a n d j o b charges. This excellent p e r f o r m a n c e may, in part, be due to the author's experience in selecting g o o d starting values for the iterations. We have no data on h o w the p r o g r a m behaves when p o o r starting values are used.

742 References

R. Darrell Bock and David Brandt

Anderson, T. W. (1958). An Introduction to Multivariate StatisticalAnalysis. Wiley, New York. Barr, D. J., Goodnight, J. H., Sall, J. P. and Helwig, J. T. (1976). A user's guide to SAS-76. SAS Institute, Raleigh, NC. Bock, R. D. (1963a). Programming univariate and multivariate analysis of variance. Technometrics 5, 95-115. Book, R. D. (1963b). Multivariate analysis of variance of repeated measurements. In: C. W. Harris, ed., Problems in Measuring Change, The University of Wisconsin Press, Madison, 85-103. Bock, R. D. (1964). A computer program for univariate and multivariate analysis of variance. Proceedings of the Scientific Computing Symposium on Statistics, Thomas J. Watson Research Center, Yorktown Heights, NY. Bock, R. D., and Bargmaml, R. E. (1966). Analysis of covariance structures. Psychometrika 31, 507-534. Bock, R. D. (1975). Multivariate Statistical Methods in Behavioral Research. McGraw-Hill, New York. Bock, R. D. (1979). Univariate and multivariate analysis of variance of time-structured data. In: J. R. Nesselroade and P. B. Baltes, eds., Longitudinal Research in Human Development: Design and Analysis. Academic Press, New York. Carlson, J. E., and Timm, N. (1974). Analysis of nonorthogonal fixed-effects designs. Psychol. Bull. 81, 563-570. Cliff, N., and Krus, D. J. (1976). Interpretation of canonical analysis: rotated versus unrotated solutions. Psychometrika 41, 35-42. Cochran, W. G., and Cox, G. M. (1957). Experimental Designs, Wiley, New York, 2nd ed. Corbeil, R. R., and Searle, S. R. (1976). Restricted maximum likelihood (REML) estimation of variance components in the mixed model. Technometrics 18, 31-38. Corsten, L. C. A. (1958). Vectors, a tool in statistical regression theory. Instituut voor Rassenonderzoek van Landbouwgewassen te Wageningen, Wageningen, The Netherlands. Cramer, E. M. (1974). Revised MANOVA program. Thurstone Psychometric Laboratory, University of North Carolina at Chapel Hill. Cronbach, L. J., Gleser, G. C , Nanda, H., and Rajaratnam, N. (1972). The Dependabifity of Behavioral Measurements: Theory of Generalizability of Scores and Profiles. Wiley, New York. Dixon, W. J., and Brown, M. B. (19"17). BMDP: Biomedical Computer Programs--P Series. University of California Press, Los Angeles. Finn, J. D. (1969). Multivariate analysis of repeated measures data. Multivar. Behav. Res. 4, 391-413. Finn, J. D. (1974). A General Model for Multivariate Analysis. Holt, Rinehart and Winston, New York. Finn, J. D. (1978a). Multivariate analysis of variance and covariance. In: K. Enslein, A. Ralston and H. Wilf, eds., Statistical Methods for Digital Computers. Wiley, New York. Finn, J. D. (1978b). M U L T I V A R I A N C E VI: Univariate and multivariate analysis of variance, covariance, regression and repeated measures. International Educational Services, Chicago. Finn, J. D., and Mattsson, I. (1978). Multivariate analysis in educational research: Applications of the MULTIVARIANCEprogram. International Educational Services, Chicago. Fisher, R. A. (1967). Statistical Methods for Research Workers. Hafner, New York, 13th ed. HarviUe, D. A. (1977). Maximum likelihood approaches to variance component estimation and to related problems. J. Amer. Statist. Assoc. 72, 320-338.

Univariate and multivariate analysis of variance

743

Hemmerle, W. J. and Hartley, H. O. (1973). Computing maximum likelihood estimates for the mixed A.O.V. model using the W transformation. Technometrics 15, 819-831. Homer, M. S. (1968). Sex differences in achievement motivation and performance in competitive and non-competitive situations. Unpublished doctoral dissertation, University of Michigan. (University Microfilm 69-12, 135.) Hotelling, H. (1936). Relations between two sets of variates. Biometrika 28, 321-377. Householder, A. S. (1964). The Theory of Matrices in Numerical Analysis. BlaisdeU, Waltham, Mass.. Jennrich, R. I., and Sampson, P. F. (1976). Newton-Raphson and 1elated algorithms for maximum likelihood variance component estimation. Technometrics 18, 11-17. Joreskog, K. G. (1970a). A general method for analysis of covariance structures. Biometrika 57. 239-251 J/~reskog, K. G. (197019). Estimation and testing of simplex models. British J. Math. Statist. Psychol. 23, 121-145. Joreskog, K. G. (1973), Analysis of covariance structures. In: P. R. Krishnaiah, ed., Multivariate Analysis 111. Academic Press, New York. Joreskog, K. G. (1974). Analyzing psychological data by structural analysis of covariance matrices. In: D. Krantz, R. C. Atkinson, R. D. Luce and P. Suppes, eds., Contemporary Developments in Mathematical Psychology, Vol. 2 W. H. Freeman and Co., San Francisco. Jtreskog, K. G. (1979). Statistical estimation of structural models in longitudinal-developmental investigations. In: J. R. Nesselroade, and P. B. BaRes, eds., Longitudinal Research in Human Development: Design and Analysis. Academic Press, New York (in press). Jtreskog, K. G., van Thillo, M. and Gruvaeus, G. T. (1971). ACOVSM: A general computer program for analysis of covariance structures including generalized MANOVA. Research Bulletin 71-01. Educational Testing Service, Princeton. Khatri, C. G. (1966). A note on MANOVA model applied to problems in growth curves. Ann. Inst. Statist. Math. 18, 75-86. Kutner, M. H. (1974). Hypothesis testing in linear models (Eisenhart model I). Amer. Statist. 28, 98-100. McNemar, Q. (1962). Psychological Statistics. Wiley, New York. Nelder, J. A. (1965). The analysis of randomized experiments with orthogonal block structure. I and II. Proc. Roy. Soc. London 283, 147-162. Nie, N., Hull, C. H., Jenkins, J., Steinbrenner, K., and Bent, D. (1975). SPSS: Statistical Package for the Social Sciences, McGraw-Hill, New York, 2nd ed. OSIRIS (1974). Organized set of integrated routines for investigation with statistics. Institute for Social Research, Ann Arbor. Overall, J., and Klett, C. J. (1972). Applied Multivariate Analysis. McGraw-Hill, New York Overall, J. E., and Spiegel, D. K. (1969). Concerning least squares analysis of experimental data. Psychol. Bull. 72, 311-322. Patterson, H. D., and Thompson, R. (1974). Maximum likelihood estimation of components of variance. Proceedings of the 8th International Biometric Conference, 197-207. Potthoff, R. F., and Roy, S. N. (1964). A generalized multivariate analysis of variance model useful especially for growth curve problems. Biometrika 51, 313-326. Rao, C. R. (1952). Advanced Statistical Methods in Biometric Research. Wiley, New York. Roy, S. N. (1957). Some Aspects of Multivariate Analysis. Wiley, New York. Searle, S. R. (1971a). Linear Models. Wiley, New York. Searle, S. R. (1971b). Topics in variance component estimation. Biometrics 27, 1-76. Scheifley, V. M., and Schmidt, W. H. (1978). Analysis of repeated measures data: A simulation study. Multivar. Behav. Res. 13, 347-362. Speed, F. 
M. and Hocking, R. R. (1976). The use of R( ) notation with unbalanced data. Amer. Statist. 30, 30-33.

744

R. Darrell Bock and David Brandt

Timm, N. (1975). Multivariate Analysis with Applications in Education and Psychology. Wadsworth Publishing Co., Belmont, California. Wiley, D. E., Schmidt, W. H., and Bramble, W. J. (1973). Studies of a class of covariance structure models. J. Amer. Statist. Assoc. 68, 317-323. Wilks, S. S, (1932). Certain generalizations in the analysis of variance. Biometrika 24, 471-494.

P. R. Kl-ishnaiah, ed., Handbook of Statistics, Vol. 1 North-Holland Publishing Company (1980) 745,971

~_
/~'--IF

Computations of Some Multivariate Distributions*


P. R. Krishnaiah

1.

Introduction

Multivariate distributions play an important role in tests of hypotheses under ANOVA and M A N O V A models. For example, multivariate t, multivariate F, multivariate normal and multivariate chi-square distributions are useful in the application of the finite intersection tests proposed by Krishnaiah (1965, 1979) for multiple comparisons of means and mean vectors of univariate and multivariate normal populations respectively. Similarly, various functions of the roots of the Wishart matrix, multivariate beta matrix and multivariate F matrix are useful for testing the general linear hypotheses under M A N O V A models. In this chapter, we discuss briefly certain computational aspects of several multivariate distributions and give percentage points of some of these distributions. Multivariate normal and multivariate t distributions are discussed in Section 2, whereas the distributions of the studentized largest chi-square and studentized smallest chi-square are discussed in Section 3. In Section 4, we discuss the distributions of the range and studentized range whereas Section 5 is devoted to a discussion of a multivariate chi-square distribution, and the multivariate F distribution introduced by Krishnaiah (1965). The distribution of the ratio of two correlated chi-square variables is discussed in the same section. The distributions of quadratic forms are discussed in Section 6, whereas the distribution of the maximum of correlated Hotelling's T 2 statistics is discussed in Section 7. In Section 8, we discuss the distributions of the individual roots of the central Wishart matrix and central multivariate beta matrix, whereas the distributions of
*This work was sponsored partly by the Air Force Flight Dynamics Laboratory under the Grant AFOSR 77-3239 and partly by the Air Force Office of Scientific Research under the Contract F49620-79-C-0161. Reproduction in whole or in part is permitted for any purpose of the United States Government. 745

746

P. R. Krishnaiah

various ratios of the roots of these matrices are discussed m Section 9. The distributions of the traces of the multivariate F matrix a n d multivariate beta matrix are discussed in Section 10. Most of the tables given at the end of this chapter were constructed b y Krishnaiah and his colleagues. The discussions in the text are primarily devoted to the procedures used in the computation of the tables given in this chapter. For reviews of the literature on some multivariate distributions, the reader is referred to Johnson and Kotz (1972) and Krishnaiah (1978, 1979).

2.

Multivariate normal and multivariate t distributions

Let x ' = ( x I..... Xp) be distributed as a multivariate normal with m e a n vector #'=(/*1 . . . . . 15,) and covariance matrix 12= o2~2 where ~2=(Oij ) is the correlation matrix. Now, let Oij=cicj for i=/=j. Then, it is known (e.g., see Durmett and Sobel (1955)) that

x,-.,:qV--: y,-:o

(2.1)

where Y0,---,Yp a r e distributed independently and normally with m e a n zero and variance one. So,

P[ x i <~ai; i= 1,2 ..... P ]

= P[
=

Ji ~'~ (iYo-- ~J~+ai) /VUl --ci2 ; i = l . . . . . P]

{'s:
~ V2~r

-oo

exp. - 7~-y0 z

i=

IPi(Yo) dYo

(2.2) where
= -

and d i =(CiYo+ a i - - I ~ i ) / ~ -~- C2 . In Eq. (2.2) the inner integral liP/= l+i(Y0) m a y be evaluated by using any of the standard routines, whereas the outer integral m a y be evaluated by using G a u s s - H e r m i t e quadrature formula for each i. G u p t a (1963) gave extensive tables for the distribution function of the multivariate normal when Pij = P (i 4:j). We now discuss the evaluation of the multivariate t distribution. Let t i = Xi~t-n/S, ( i = 1..... p), where x 1.... ,xp were defined earlier. Also, let s 2 / o 2 be distributed independent of x as a chi-square variable with n

Computations of some multivariatedistributions

747

degrees of freedom. Then, the joint distribution of t I..... tp is a central (noncentral) multivariate t distribution with n degrees of freedom when # = 0 0s4=0). Dunnett and Sobel (1954)'and Cornish (1954) independently obtained the following expression for the density of the central multivariate t distribution:

..... , , ) =

lal

l/2F[~(~ p

+")l/',+
k

where t ' = ( t t .... ,~). Kshirsagar (1961) derived the density of the noncentral multivariate t distribution. The probability integral of the noncentral multivariate t distribution can be evaluated by using the following known result:

faloo''" a~: f2 ( / i ..... te)dtl.., dip =

--r
(2.4) where h(x) is the density of chi distribution with n degrees of freedom,

~(YO,x)=(2~r)--sUz~ { f?' exp(-ly~)dy,) i=l

(2.5)

and ~i=[(aixl~Fnn)-I.ti+CiYol/~-c 2 . When # = 0 , Pij=P (i~j) p > 0 and ai . . . . . ae = a, it is of interest to compute a for given a or vice versa where

f a

f]

f2(ti,'",tp)dt, "'" dt,=(1-a).

(2.6)

For computation of the above probability integral, we can use the follow~ing approximation:

f a ""U fz(' l..... 'p)dti'"dt,~-2 cexp(--xe)x=-' r(~.)


x[@--fLexp(-yg)(@f_~ooexp(--yE)dy)Pdyo]dx
(2.7)

where c is a properly chosen constant. Now, let R be the error committed

748

P. 1L Krishnaiah

by using the above approximation. Then R < f ~ f e x p ( - X)X (n-2)/2 r(n) dx. (2.8)

We can make R as small as possible by choosing c large enough. By making the transformation z -- ( 2 x - c)/c, the right side of (2.7) becomes equal to

1 exp[--c2(z+l)2](z+l) "-l 2n-1 f l l r(n)


where 2v~n But, we know that -P

en

-3-f

~ ( 1 -8* ~" ooe x p ( - y~) - ~ j - ooexp(-Y2) dy ) dy o dz

(2.9)

d-oofS*exp(-y2)dy = ~ +_ ,~*2exp(-z)z -1/2 dz


~0 r(~) where the sign is positive or negative according as 8" is positive or negative. The right side of Eq. (2.9) can be evaluated by using one of the routines for the evaluation of the incomplete gamma integral for desired values of a, n, 8, Yo and z. Then, the integral V,~ f_~oexp(-Yo 2)

~-~-f'_ooexp(-yZ)dY

dyo

can be evaluated by using Gauss-Hermite quadrature formula for desired values of z, 6, n, p and a. Finally, the integral (2.9) was computed for required values of a, 6, n and p by using Gauss-quadrature formula method with c = 10 and a combination of 40 point Gauss-Hermite quadrature formula and 48 point Gauss quadrature formula. Krishnaiah and Armitage (1965a) computed the values of a for a =0.1(0.20)6.1. The values of a were then computed by using cubic interpolation for a - 0 . 1 0 , 0.05, 0.025, 0.01, p = 1(1)10, 0=0.0(0.1)0.9, and n=5(1)35. Tables of the values

Computations of some multivariate distributions

749

of a for a =0.05, 0.01 were given in Krishnaiah and Armitage (1966); some of these tables are given in Table 1.* Dunnett and Sobel (1954) gave upper percentage points of the bivariate t distribution for p=0.5. Dunnett (1955) and Gupta and Sobel (1957) gave upper percentage points of the multivariate t distribution when 0=0.5 Krishnaiah et al. (1969a) computed the values of ct for p = 2, a--1.0(0.1)5.5, O=0, +0.1, ___0.2..... +0.9 and n=5(1)35. For reviews of the literature on the multivariate normal and multivariate t distributions, the reader is referred to Gupta (1963) and Johnson and Kotz (1972).

3. Distributions of the studentized largest and smallest chi-square distributions


Let s o, 2 s2 I ..... s;2 be distributed independently as chi-square variables with n,m 1..... mp degrees of freedom respectively. Also, let F~=ns2/mi s2 for i = 1,2 ..... p. When m I . . . . . mp= m, the distributions of m a x ( F 1..... Fp) and min(F 1..... Fp) are known as the studentized largest chi-square and the studentized smallest chi-square distributions respectively. We will first discuss the evaluation of the probability integrals associated with the joint distribution of F t ..... Fp. Consider the problem of evaluating ct for given b 1. . . . . bp, cl,..., cp where

e[bi<Fi<ci; i--1 . . . . . p]----(1-- a).


The left-side of Eq. (3.1) can be written as
p

(3.1)

f0 exp(-- x)x ("/2)-1 H (I((eimixln); mi)- I((bimixln); mi) ) dx r(n) ,=,

(3.:)
where

I(a; q ) = r ( q )

f0 a

e x p ( -- y)y(q/2)-i d y .

(3.3)

We first discuss the evaluation of l(a; q). If a is large, l(a; q) can be


*All tables are placed in the Appendix at the end of this paper.

750

P. R. Krishnaiah

approximated by using the following asymptotic expansion:

i(a;q)=l_[ exp(-a) F(lq)

((q-2) a(, -2)/2 1 + 2a

(q-2)(q-4) 4a 2

-~ (q-2)(q-4)(q--6) + . . . 1 ] .
8a 3

)j

(3.4)

If a is small, we use

l(a;

q) = e x p ( -

a)j~= a(q/2)+J (q+J)!

(3.5)

When (3.4) is used, we use the first few terms of the asymptotic expansion with the assurance that these terms give good approximation. The series on the right side of (3.5) can be approximated by taking the firstp terms only where p is chosen such that (S~(q)- Sp(q))/Sp(q) is less than a prescribed quantity and
P a j + (q/2)

Sp(q) = e x p ( - a ) j ~ _ o

((q+ 2j))!

"

(3.6)

But, it is known (see Wilk et al. (1962)) that

Soo(q) - Sp(q) < aP2p+1 q(q+ 2)(q+4). . . (q+ 2p-2)(q+ 2p-2a) Sp(q)
(3.7) provided that p > (a - q). Using the above inequality, we can estimate the number of terms required to get the desired accuracy. In actual practice, we can determine as to whether p terms are sufficient by adding more terms and examining as to whether this makes any difference. The above method can be used whether q is even or odd. But, when q is even, it is better to use the following known expression for the evaluation of the incomplete gamma integral:

I(a;

q)= 1- e x p ( -

a)(q/f ai j=o-~ J---('"

(3.8)

Computations of some multivariate distributions

751

Next, consider the evaluation of the integral fo exp( where


x(m/2)- 1 P

x)~p(x) dx

(3.9)

44x)-

r(m) i=,II(I(cimix/n);

mi}-I((bimlx/n); mi).

Using Gauss-Laguerre quadrature formula, the integral (3.9) can be approximated with E~-=,A)Otp(x)O) where

Aj(.t

Lt(x ) = ( _

[L,+,(xj)] 2'

l)%X dt (e_Xxt) dx t

and the xfs are the roots of Lt(x ). The number of points t to be chosen depends upon the degree of the accuracy desired. Now consider the problem of computing c for given values of a or vice versa where

P[u<c;

i--1 ..... p ] = ( 1 - - a),

(3.10)

u = m a x ( F 1..... Fp), and mi=m for i = 1,2 .... ,p. The left side of Eq. (3.10) is equivalent to (3.2) when ci = c and bi = 0. So, the method described in this section for the evaluation of a in Eq. (3.1) can be used to evaluate a for given values of c in Eq. (3.10). Using the above method, Armitage and Krishnaiah (1964) computed the values of c for (i) a = 0 . 1 0 , 0.05, 0.025, m=1(1)19, n=5(1)45, p = 1 ( 1 ) 1 2 and (ii) a = 0 . 0 1 , m = l ( 1 ) 1 9 , n=6(1)45 and p = 1(1)12. In computing these tables, they used 32 point Gauss-Laguerre quadrature formula to compute the values of a for different values of c with an increment of 0.25. Then, they computed the values of c for different values of a, p, m and n by using the cubic interpolation. Table 2 gives the values of c for a =0.05 for the values of p, m and n mentioned above. The entries in this table are taken from the tables of Armitage and Krishnaiah (1964). Some very special cases of the tables of the studentized largest chi-square distribution were given by Finney (1941), Nair (1948), and Ramachandran 0956). Gupta (1962) gave the reciprocals of the upper 25%, 10%, 5% and 1% points of the above distribution when m = n = 2(2)50.

752

P. R~Krishnaiah

Next, consider the problem of computing b for given values of a or vice versa where

P[F~>~b; i = 1..... p ] =(1 - a )

(3.11)

and mi=m. The left side of Eq. (3.11) is equivalent to P[v>~b] where v = m i n ( F 1..... Fp). The distribution of v is known to be the studentized smallest chi-square distribution. The left side of Eq. (3.11) is equivalent to (3.2) when c/= m and bi = b for i = 1,... ,p. So, the method discussed for the evaluation of (3.1) can be used to evaluate a for given values of b in Eq. (3.11). Using the above method, Krishnaiah and Armitage (1964) computed the values of b for a = 0 . 1 0 , 0.05, 0.025, 0.01, m = 1(1)20, n=5(1)45 and p = l ( 1 ) 1 2 . In applying the above method, they used 32 point Gauss-Laguerre quadrature formula to compute the values of a for given values of b, m, n and p. Then, the values of b were computed for given values of a, p, m and n by using cubic interpolation. Table 3 gives the values of b for a=0.05, m - 1 ( 2 ) 1 9 , n=5(1)10(2)22, 25(5)45 and different values of p. The entries in this table are taken from the tables of Krishnaiah and Armitage (1964). Ramachandran (1958) constructed the lower 5% points of the studentized smallest chi-square distribution for selected values of p, m and n whereas Gupta and Sobel (1962) computed the lower 25%, 10%, 5% and 1% points of the studentized smallest chi-square distribution for m = n -- 2(2)50 a n d p = 1(1)10.
4. Distribution of the range and studenfized range

Let x 1..... x n be distributed independently and normally with mean 0 and variance o 2. Also, let x(l ) and x(n) respectively denote the maximum and minimum of x 1..... x n. Then R =(xO)-x(n))/o is the (standardized) range of the sample from the normal population with mean 0 and variance 1. Next, let T = R / 6 where p62/02 is distributed independent of R as chi-square distribution with v degrees of freedom. Then, the distribution of T is known to be the studentized range distribution. Table 4 gives the values of c where

P[R<c]=(1-a)

(4.1)

for a =0.05, 0.01 and n =2(1)20(5)40(10)100. Table 5 gives the values of d for o~=0.05, 0.01, n = 2(1)20(5)40(10)100, and v = 1(l)20, 24, 30, 40, 60, 120, where P[ T<d] = (1-- a). (4.2)

Computations of some multivariate distributions

753

These tables are taken from Harter (1960). For a discussion of the computation of the percentage points of the distributions of the range and the studentized range, the reader is referred to Harter (1960).

5.

Multivariate chi-square and multivariate F distributions

Let S = (so) be a central (noncentral) Wishart matrix with m degrees of freedom and E ( S ) - m E where Z=(oij). Then, the joint distribution of s~l.... ,Spp is known to be the central (noncentral) multivariate chi-square distribution with m degrees of freedom and with E as the covariance matrix of the "accompanying" multivariate normal. The joint distribution of s~(2, 0/2 is known to be the multivariate chi distribution with m ~pp degrees of freedom and with E as the covariance matrix of the accompanying multivariate normal. When m - - 2 , the joint distribution of (1~ ~1/~..... t~OppJ t ~ ~1/~ is a multivariate Weibull distribution. When B = 1, ~11~ the above multivariate Weibull distribution is a multivariate exponential distribution. The bivariate chi distribution was expressed by Bose (1935) as an infinite series such that each term in the series involves a product of densities of two chi variables. Some properties of this distribution were studied by Krishnaiah et al. (1963). Krishnamoorthy and Parthasarathy (1951) expressed the density of the multivariate chi-square distribution as an infinite series involving Laguerre polynomials. In the bivariate case, the above expression is equivalent to the expression obtained by Kibble (1941). Some properties of the multivariate chi-square distribution were studied by Krishnaiah and Rao (1961). Moran and Vere-Jones (1969) showed that the multivariate chi-square distribution is infinitely divisible when p/j =O (i ~ j ) and Oij= %/{o~%.} ~/z. They also showed that the multivariate chi-square distribution is infinitely divisible when p = 3 and Po = P I~-A. Further work on the problem of the infinite divisibility of the multivariate chi-square was done by Griffiths (1970). Now, let y~ = siJ % for i = 1,2 ..... p. Then, the joint density of Yl and Y2 is known (see Bose (1935)) to be
.,

f(y,,y2)=( 1

2 "~m/2

F(m+ i)p2~

i!r( m)
(5.1)

(YJ2)("/2)+i-'exp[-(Y'+Y2)/2(1-P~z)] [2 `m/2, + iF(ira --[-i)(1-- P:a)((m/2)+i)/2]z

754

P. R. Krishnaiah

So, P[cj <yj -<<d/;j: 1,2] : ( 1

-P,2)2 ,,,/2
i=0

F( m + i)pl2i2JliJ2i

i,F(m)
(5.2)

where

JJi= r(m+/)

fc;*exp(_y)y(m/2)+i_ldy,

(5.3)

~* = c)/2(1 - 0 2 ) and dj* = dJ2(1-072). The terms Jj, can be evaluated by using a program for computation of the incomplete gamma integrals. If we approximate the right side of (5.2) with the first t terms, then a bound on Rt, the error of truncation, is as given below:
Rt<~ l

(1 -P'~) 2 xm/2

r( m)

Et F ( l m ' C J ) +o0

Pl2" 2+

(5.4)

We can choose t such that the error of truncation is as small as possible. Using the above method, P. R. Krishnaiah and F. J. Schuurmann (unpublished) computed the values of d when a=0.05, 0.01, 0=0.1(0.1)0.9, and m = 5(1)15(5)45 where

e[ y, ~ a, y2 < a] = (l - , 0 .

(5.5)

Some of these values are given in Table 6. Next, consider the problem of evaluating d for given values of a when

P[Yi ~d;

i = 1..... p ] = ( 1 - - a),

(5.6)

and Pij =to for i:/:j. Let d* be given by the equation

Then d* is a lower bound on d. P. R. Krishnaiah and F. J. Schuurmann (unpublished) gave the values of d* for a=0.05, 0.01, m=5(1)15(5)45, p = 0.1 (0.2)0.7 and p = 3(1) 10. The values a* for the above values of d* are also given by the above authors where a* is given by the equation

pe[y, >a*]= . * .

(5.8)

Computations of some multivariate distributions

755

Some of the above values of d* and 1 - a* are given in Table 7. Next, let (x~ ..... xp) be distributed as a multivariate normal with mean vector 0 and covariance matrix ozf~ where f~= (Pij), Oij =P (i v~j) and Oil = 1. Also, let z i = x 2 / o 2 for i = 1,2 ..... p. Using Eq. (2.1), we obtain the following: P[ zi <c; i = l,2 ..... p ] = ~ f V2~ exp(-1 2 t -~Yo i=, ~ ~iCY)dY (5.9) where 8 ' Lexp( x/ff 1 2

-~

x)dx

, 10)

X=(-V~cc + ~ 0 y o ) / V r l - p and 6 = ( ~ + V p p y o ) / ~ v / 1 - p . Using Eq. (5.9), Krishnaiah and Armitage (1965c) computed the values of c for a=0.10, 0.05, 0.025, 0.01 and p=0(0.125)0.9 where

P[Zi~C ; / = 1

.....

p] = ( l - - a ) .

(5.11)

Some of the above values of c are given in Table 8. For extensive tables of the values of a for p = 0(0.125)0.9 and c = 0.1(0.1)11.5, the reader is referred to a technical report by Krishnaiah and Armitage (1965b). We will now discuss the evaluation of the probability integral of the multivariate F distribution considered by Krishnaiah (1965). Let F,. nsiioZ/gglS2Oii for i = 1..... p where sll ..... spp are jointly distributed as central (noncentral) multivariate chi-square distribution as defined earlier. Also, l e t $2/O 2 be distributed independent of Sll . . . . . Spp as central chi-square distribution with n degrees of freedom. Then, the joint distribution of F l ..... Fp is known to be a multivariate F distribution with (m, n) degrees of freedom and with f2= (Pij) as the correlation matrix of the accompanying multivariate normal. This distribution was introduced by Krishnaiah (1965). We will first discuss the evaluation of the probability integral of the bivariate F distribution. The joint distribution of F 1 and F 2 is given by
=

f(F1,F2) = nn/2(1 --/)12] 2 ~(m+n)/2

r(
X ff i=0

m)r(ln)
p2-~Y[m + n +2i]m'+Ei(FiF=) 'm/2)+i-I

i!r[

m + i][ n(1 -- p22) + re(F, + F2)]m+(n/2,+2i

(5.12)

756

P. R. Krishnaiah

Now consider the problem of evaluating a for given values of c t, c 2, d I and d 2 where

P[ei<Fi <di; i= 1,2] = ( 1 - a ) .


We know that
d, d2 (1 - - p 2 ) m/2

(5.13)

I'(m +j)p2j
j=0 J! Bj* (5.14)

fc I fc 2 "f(FI'F2) dFld~2-~" ~(1--m)


where

Bj, = fo~ eXp( - z )z(n/2)- '

2./2r(.)
1

(IvI )d
u(m/z)+j-ldu

(5.15)

and

Io=2(,,,/2)+jF(im+j)[fo4~exp(--lu)

The integrals 1o can be computed by using the method discussed in Section 3. The outer integral in Bj* can be computed by using GaussLaguerre quadrature formula. Now, let R t denote the error of truncation when we approximate the right side of Eq. (5.14) with the sum of the first t terms. Then

R,~<1

(1--02)m/' '--' r( m+S)

r(m)

j=0E

j!

p2j.

(5.16)

Using the above method, Schuurmann et al. (1975) computed the values of d for a = 0.05, 0.01, p-- 0.1(0.2)0.7, m -- 2(2) 10, n = 5(1) 15(5)40 where

P[Fs<d; i = 1,2] = ( l - a ) .

(5.17)

The above values of d are given in Table 9. In computing the tables, they truncated the series in Eq. (5.14) such that the error of truncation (using (5.16)) is less than 10 - 4 . A l s o , the outer integral was computed by using 32 point Gauss-Laguerre quadrature formula.

Computations of some multivariatedistributions

757

When p > 2, the following bounds based on Poincare's formula may be used to compute lower and upper bounds on the percentage points: p 1 - ~] P [ F , . > d ] < P [ P ] - < d ; i = 1 , 2 ..... p ] i=1 p -<<1 - P [ F / > d ] + P[Fi>d, F j > d ]. (5.18)
i=1 i<j

Schuurmann et al. (1975b) were interested in computing the values of d* for certain values of a l, n, m and O where

1-pP[Fl>d*]+(2)P[fl>d*,F2]>d*]=(1-Otl)

(5.19)

and P=Oij (ivaJ =1 ..... p). They were also interested in computing the values of (1 - %) for the above values of d* where

pP[ F l > d * ] = a 2.

(5.20)

Due to oversight, they computed the values of d*l (instead of d*) where

1-pP[ F, > d r ] + (~)P[ F, >d~'(l- p2),F2 >dr(l-p2)] = (I-a,).

(5.21)
They have also computed the values of 1 - a~ for the above values of d]~ where

pP[ F, >d~' ] = a~'.

(5.22)

F. J. Schuurmann and P. R. Krishnaiah (unpublished) have constructed the values of d* for al=0.01, 0.05, k = 3 , 4, 5 and 0=0.1,0.3,0.5. These values are given for a =0.05 in Table 10. Next, consider the evaluation of the values of d for given values of when m = 1, X=o2f~, f] =(p,j), pii= 1, Pij=P (i--/=j) and P[ F / < d ; i = 1,2 ..... p ] = (1 -- a). We know that P[ F,. < d; i = 1 , 2 ..... p ] = (5.23)

"o

""Jo

h(z, ..... zj,)iL[ ' dz i dx

(5.24)

where g(x) is the density of the chi-square distribution with n degrees of freedom and h(z 1..... Zp) is the density of the multivariate chi-square distribution with one degree of freedom and with ~2 as the covariance

758

P. P~ Krishnaiah

matrix of the accompanying multivariate normal. The inner integral can be evaluated by using the same method as used in the evaluation of the right side of Eq. (5.9). The outer integral can be evaluated by using GaussLaguerre quadrature formula. Using the above method, Krishnaiah and Armitage (1970) computed the values of d for a = 0.05, 0.01, p = 0.1 (0.1)0.9 and n =5(1)35; more extensive tables were given in a technical report by Krishnaiah and Armitage (1965d). Table 9a gives the values of d for a=0.05, 0.01, and p=0.1(0.2)0.9. We now discuss the distribution of the ratio Fo=yl/y 2 where the joint density of Yl and Y2 is given by Eq. (5.1). Then, the distribution of F 0 is known (see Bose (1935)) to be 2m(1--p2)m/2F[ (m+ l)]/;o"-1(1 + F02)
f(/~o) =

V~ r(lm)[l +(1 + Fg)z-4o2Fff] ('~+')/z

Finney (1938) showed that

P[Fo<.t2]=l-Ix -~m; ~m
where
x=51 . .

{(t+t ')2-402}'/2

1 foty(p/2)_l(1._y)(q/2)_ld): l~(p;q)= fl(p,q)


Krishnaiah et al. (1965) obtained an alternative expression for P[Fo <t].

6.

Distributions of quadratic forms

In another chapter of this volume, Khatri gave a review of the distributions of single quadratic forms. In this section, we review the joint distributions of correlated quadratic forms. We will first briefly discuss the distribution of a single quadratic form to fix the ideas. Let x' : 1 x n be distributed as a multivariate normal with mean vector 0 and covariance matrix E. Then, it is of interest to derive the distribution of x'Ax where A is a symmetric positive semidefinite matrix of rank q. The characteristic function of x'Ax is given by
q

q~(t)= ]7[ (1-2itXj) -'/2


j=l

(6.1)

Computations of some multivariate distributions

759

where ~1..... )tq are the eigenvalues of EA. Robbins and Pitman (1948) give the following expressions for the density of x'Ax: f ( y ) = ~ ajgq+2j(Y), j=o (6.2)

1 q) f ( y ) = XcjLj( x; -~
where exp(-y)y(r-2)/2
gr(Y)
----"

(6.3)
(6.4)

2r/2r(r)

Lj(x; ~ 1q ) =

(-1)J dj e x p ( - - 7 ) T q / 2 > ' dx j

{exp(_x)x(q/2)+j-l).

(6.5)

The constants aj and cj are given by the equations


q

{ j + ( 1 - j)(1-- 2it)- l ) - ' / 2 =


j=l

(6.6)

= ~,, a r ( 1 - 2 i t ) - " ~
r=O
j=l

1+

(1

it)

= E 5
j=0

(it)

j (6.7)

We can use Eqs. (6.2) and (6.3) to compute exact percentage points of the distribution of x ' A x when q > 2 whereas Marsaglia (1960) and Solomon (1960) computed the percentage points for q=2,3. Johnson and Kotz (1968) computed the percentage points for q--4, 5; these authors also gave a computer program for computing the percentage points of the quadratic form. Jensen and Solomon (1972) approximated the distribution of certain power of the quadratic form with normal distribution. Now, let z = alw I + + aqWq where w 1..... Wq are distributed independently as exp( Wi)Wia- 1 h(w,) = (6.8)
-

Then, the characteristic function of z is given by


q

qh(t) = ]-[ (1-afit) -~.


j=l

(6.9)

When aj.=2~ and a---, the distribution of x ' A x is the same as the distribution of z. Now, let a * = a(a I + ... + aq) and a**= a(al2 + . . . + %2). Krishnaiah (1977) obtained the following expression for the distribution of

760

P. R. Krishnaiah

( z - a*)/V~d *~ when a** is large:

fl(y)=@(y)

E1

1 3~/a ~

~q t~jajH3(Y ) j=l

+0(~**-~/2)]
where ~ = ~ 2 / ~ . 2 , V ~ ~(y)=exp(-

(6.10)

ly2), and

~(x)~(x) = ( - ly ~

~(x)

We will now review the results on correlated quadratic forms given in Krishnaiah (1977). Let x -' ( x ~ ' ..... Xp) be distributed as a multivariate normal with mean vector 0 and covariance matrix E where

I ZII ~12 " " " Zlp X-- ~"21 ~22 " " " ~']2p

(6.11)

and Yij=E(xixj) is of order qi X ~. Also, let yi=x~Aixi where A i is symmetric and nonnegative definite. Then, the joint characteristic function of yl ..... yp is given by

q,2(tl . . . . . tp) = I I - 2iTA Nl- 1/2

(6.12)

where A =diag(A 1..... A ) and T = d i a g ( t l I ..... t I ). We now generate the above distribution b ; taking sums of correlated ~lai-square variables as described below. Let u I..... uq be distributed as a multivariate chi-square distribution (defined in Section 5) with one degree of freedom and with f~ as the

Corr~outations of some multivariate distributions

761

covariance matrix of the "accompanying" multivariate normal. Also, let


'/31 = Ul + ' ' "

+UqL ,

V2= Uql+l "~ . . o --~ Uql+q2,

Vp=Uq,~ .... +q,_~+l + " " + Uq

where q = q l + " " + q p . It is known that the characteristic function of u 1. . . . . Uq is given by

q,3 (t,

.....

tq) = t i - 2iZaa I - , / 2

(6.13)

where T 1=diag(t I..... tq). Then, the characteristic function of v x..... Vp is given by

q,4( t'( .....

tp ) = 11 - 2iT~{al-1/2

(6.14)

where T ~ = d i a g ( t ~ I q , , . . . . t~,Iqp). So, q,2(tl ..... tp) is of the same form as q~4(t~..... t;). Next, let ~9'-(Xlj' . . . . ,Xpj); ( j = 1,2,...,n), be a sample drawn from a multivariate normal with mean vector 0 and covariance matrix x where E is given by Eq. (6.11). Now, let
Yi = -~ E xijAixij

j=l

for i = 1,2 ..... p

where A 1. . . . . A : are as defined earlier. Also, let z i = (Yi -- E ( y i ) ) / V ' n where E ( y i ) = n t r A i X , / 2 and let etr(B) denote the exponential of the trace of B. Then, the joint characteristic function of z~ . . . . . Zp is given by ~5(tl ..... 9) =

i-

iT*v~ n

22"1-"/2 :

etr~- il Vn T'Z*

(6.15)

where T* = d i a g ( t l l q l , . . . , tplq,) and 22* = AN. Krishnaiah (1977) obtained the following asymptotic expression for the joint distribution of z l . . . . . Zp: 1 - - 1 E Cjlj2j3I-IjlJ2J3(X) "11" O ( F/ "-1)] 6V-n-

f ( z 1. . . . . Z p ) = ~ ( z ; D ) [ l +

(6.16)

762

P. R, Krishnaiah

where z ' = ( z , ..... zp), D=(dy,), dj~ = tr Zj,,~Z,,j/2, ~b(Z; O ) = {1/2~rlDI)l/2exp{ -- -~

1 z'D-lz),

(6.17) (6.18)

Ov tp(z; D)Hj ..%(Z)= ( - l) ~ azjt.. " Ozjo~b(Z; D). Also, the constraints ~,jd, are given by the relation

tr(T*X*)3= ~ %j~tj,thS~
where the summation is overjl,j2 and j3 varying from 0 top. The first term on the right side of Eq. (6.16) can be evaluated using known techniques for the evaluation of the probability integrals of the multivariate normal. Khatri et al. (1977) obtained an asymptotic expression for the joint density of z l ..... Zp in the noncentral case. Krishnaiah and Waikar (1973) gave expressions for the distribution of a linear combination of correlated quadratic forms. Next, let QI= V A I V ' and Q 2 = V * A 2 V*" where V=[VI[--. ]Vk], V*= [Vk+ 11"'" ] Vk+r] and the columns of [V I V*] are distributed independently as multivariate normal with mean vector 0. Also, the covariance matrices of the columns of Vi: p ni, (p <n i) are given by ~; and the matrices A 1 and A 2 are given by

Al=

/~.1/,, 0 l
"..
o

A2=

''"

~+r~+,

1"

k S _ k+r t Then QI=Y~/=I~ i and Qz--~i=k+lXiSi where Si= ViV,. Gupta et al. (1975) derived asymptotic expressions for the distributions of - V ~ log] Q1/v[, - X/~v log IQ1Q2-1] and - ~ log IQI(Q1 + Q2)-11 where k+r v1= 2Y~= l niXi and v = 23~i= l ni~. These expressions are linear combinations of normal density and Hermite polynomials. The above authors have also obtained an asymptotic expression for the joint density of

( - V'N log IS 2 S l ' I..... - V ~ log] Sks,-ll) w h e r e N = n 1 + . . - + n k.

Computations of some multivariate distributions

763

7,

Distributions of the maximum of correlated Hotelling's T 2 statistics

Let Y'=(Y'I ..... Y~v) be distributed as a multivariate normal with m e a n vector 0 and covariance matrix B Z where denotes Kronecker product, Z is of order p p, Yi is of order p 1 and B is of order N N. Also, the diagonal elements of B are assumed to be equal to /3 and the off-diagonal elements of B are assumed to be equal to 6. In addition, let T/2=y/'S -lyip (i= 1..... N) where S is distributed independent of Yi as the Wishart distribution with v degrees of freedom and E ( S ) = u Z . In this section, we discuss approximate values of c where

(7.1)
and T ~ = m a x ( T ? ..... T#). Using Poincare's formula, it is known that
q

P [ T i 2 < c ; i = l ..... N ] < I -

~] P [ T / 2 > c ] +
i=1

X P[Ti2>c, TjZ>c].
i~-.j

(7.2) Siotani obtained an asymptotic expression for the distribution

of

P[Ti2>

e, Tj2>c]. Using this expression and Eq. (7.2), he computed tables of


approximate percentage points of max T/2 for some special cases. A description of these special cases is given below. Let x'~ = (x 1. . . . . . xw) (a = O, 1..... n), be n + 1 r a n d o m vectors distributed independently as multivariate normal with m e a n vector m ' and covariance matrix E. Also, let nX" = y n= ix ~ and let vL be distributed independent of x 1..... x n as the Wishart matrix with v degrees of freedom and E ( L ) = E . In addition, let
T2Max.0 ---- m a x ( x a - - m ) ' t - l ( x r ~ - - t n ) , Z2ax.D = maax ( x a -- x . ) ' Z - l ( x a - - x . ) , (7.3) (7.4)

T Max.C 2 = m a x ( x a - xo) t L -- 1( x , - x o ) , R 2 = max (x, - xB)'L - l(x, - xl~).


a<B

(7.5) (7.6)

In Eqs. (7.3)-(7.5), a takes the values from 1. . . . . n whereas a <13= 1..... n in Eq. (7.6). The distributions of T~x. o , 2 T~,x.D ,2 T~,x.c2 and R 2 are special

764

P. R. Krishnaiah

cases of the distribution of T~a x with/3, 6 and N given in the following table.
Statistic T~lax.O T~tax.D
T~ax. c R z

/3 1
(n2 1)/n

6 0
-- 1 / n 1

N n
n n

0or 1

~ n ( n - 1)

Table l l a gives the values of c for a = 0 . 0 5 , 0.01, n=2(1)12, 14, 16, p = 2 and different values of p where

e[

=(1-

(7.7)

Table l l b gives the values of c for a = 0 . 0 5 , 0.01, n=2(1)12, 14, 16, p = 2 and different values of v where

e[

=(1-

(7.8)

Tables 1 la, 1 lb and 1 lc are reproduced f r o m Siotani (1959b, 1960). Table 1 lc gives the values of c for a = 0.05, 0.01, n = 3(1)12,14, p = 2 and different values of v where P [ T2M.~.D< c ] = ( 1 - a). (7.9)

8.

Distributions of the individual roots of a class of random matrices

In this section, we discuss some aspects of the computations of the central distributions of the individual roots of a class of real r a n d o m matrices. This class includes such important matrices like the Wishart matrix, multivariate F matrix a n d multivariate beta matrix. Tables for the individual roots as well as the joint distributions of the extreme roots of these r a n d o m matrices are also given. We will first define the above r a n d o m matrices. Let the columns of X: p m be distributed as a multivariate normal with mean vector p and covariance matrix E~. Then, the distribution of S~ = X X ' is known to be a central or noncentral Wishart distribution with m degrees of freedom according as p = 0 or p~a0. Next, let the columns of Y: p n be distributed independently and identically as a multivariate normal with m e a n vector 0 and covariance matrix Y'2- Also, let $2= Y Y ' and

Computations of some multivariatedistributions

765

F = S1S2 -1. Then, the random matrix nF/m is known to be a central or noncentral multivariate F matrix accordingly a s / t = 0 or/~ va0. Similarly, the random matrix B = SI(S 1+ $2) -1 is known to be a central or noncentral multivariate beta matrix accordingly as ~ = 0 or ~:/=0. Next, consider the random matrix A =(aij): p p whose elements are distributed independently and normally with zero means. Also, let the variances of the diagonal elements be equal to 2 whereas the variances of the off-diagonal elements are equal to 1. Then A is known to be a central Gaussian matrix. Let w I > / . . - /> Wp be the eigenvalues of the Wishart matrix S~ with m degrees of freedom. Also, letp < m , / ~ - - 0 and E = Ip. Then, it is well-known (see Nanda (1948b)) that the joint density of the eigenvalues w1..... wp is given by

i<d 0<Wp<... <w~<~ where r = (m --p - 1) and

II (w,- '5),
(8.1)

C'=~rP/2(2) rap~2~

,Oi

[F(l(m+l-

i))r(z(p+

1-i))J.

Next, let 01 ) ' ' " )Op denote the eigenvalues of the multivariate beta matrix B. Also, we assume that p<(m,n), ~ - - 0 and E x = Z 2. Then, the joint density of the eigenvalues 0 x..... Opis given by

f2(01..... Op)=C2 II {o[(1--oy} ~I (Oi--Oj),


i=1 i<~

0.< o, < . . . <

.< 1,

(8.2)

where r-- (m - p - 1) and s = (n - p - 1), ~P2/2Fp(r+ s + p + 1)

C2= {Fp(m)rp(n)Fp(p)}
and
P

I'p(a) = 7rP(P-1)/4 I I F ( a - 1 ( i - 1)).


i=l

If m < p <n, then the joint density of the eigenvatues of

St(S l + $2) i is

766

P. R. Krishnaiah

obtained from (8.2) by interchangingp and m, and changing n to m + n - p w h e n / L = 0 and Z~ = Z 2. Next, let 61/> /> 6p be the eigenvalues of the matrix F when # = 0 and Z l = Z 2. Then the joint density of 61..... 6p is obtained from (8.2) by making the transformations 6i = (OJ 1 -Oi) for i = 1..... p w h e n p < m, n. The joint density in this case is
P i=1 i<j

(8.3)
Next, let a ~ / > . . . ~>ap denote the eigenvalues of the central Gaussian matrix A. Then, the joint density of a~ ..... ap is given by

f4(a~ ..... ap)= C 3 ]~ exp - ~ai


i~l

I121
1

]~ (ai-aj.)
i<j

(8.4)

where
C 3 -~

2p/2 ( P

The joint density of the eigenvalues of B when ~ = 0 derived independently by Roy (1939), Fisher (1939) and joint density of the eigenvalues of A was derived by Hsu The joint densities of the eigenvalues of a wide class of are of the following form:
P

and 1 =~]2 was Hsu (1939). The (1941). random matrices

f ( l 1..... /p)= C ~[ q'(li) ]-[ (l~-/j),


i=1 i<j

a < / p < . - . <ll<<.b.

(8.s)

We will now give a few special cases of the above expression. Case (i) C = C l, a--0, b = m and ~ P ( l i ) = l [ e x p ( - 1 /i); in this case, f ( l I..... lp) = fl(ll ..... /p). Case (ii) C = C 2, a = 0 , b = 1, and ~ ( l i ) = l r ( l - l i ) S ; in this case, f ( l l , . . . , l p ) = f 2 ( 1 ~. . . . . /p). Case (iii) C = C 2, a = 0 , b = o o , and '/'(/i) = 1 7 ( l + l i ) - r - s - P - l ; in this case f(11 ..... lp)=f3(l I ..... /e)" Case (iv) C = C 3 , a-- - 00, b = co and q'(/i) = e x p ( - /2); in this case f(/~ ..... /p) =f4(/~ ..... /p).

Computations of some multivariate distributions

767

8.1. Evaluation of some integrals


We now discuss the evaluation of an integral which arises in computation of the probability integrals of the eigenvalues of several random matrices. Let
q q
i>]

I(q',q,c,d)--

f "" f

I I {~P(xi)} I I

(xi-xj) Ii dxi.
i=l

C < x I ~ ' ' ' <Xq'~d i = l

(8.6) Here we note that


1
X1

1
X2

--.
"""

1
Xq 2 ~q

~(~-~)=
in

x ~,
xq--1

x~
xq--1

..

...

Xqq--'

and the above determinant is known to be the Vandermond determinant. Eq. (8.6) can be written as follows:
xIt(Xl) c f XlXI~(Xl)
. . .

x~r(Xq) XqXIl(Xq)

"'"

l(q';

q,r,c,d) <f~'.i]<xq<d i
X q-- lxIt(Xl)

"'.
...
Xqq--lx~t(Xq)

dxi.
i~l

(8.7) The integral in (8.7) is of the same form as the integral

J(g; q,c,d)=

f ... f
C<Xl~ "-" ~ x q ~ d

[B[dx,"""

dxq,

(8.8)

where B = (bo) and bij = gi(xi). We will now discuss the evaluation of the integral in Eq. (8.8). Case (i): Let q=2m. In this case,

IBI =

+b,i~b2,2"'" b2m,izm

(8.9)

768

P.R. KrBhnaiah

where (i 1..... i2m) is a permutation of (1,2 ..... 2m) and the sign is positive or negative accordingly as the permutation is even or odd. The summation is over all permutations (il ..... i2m) of (1,2 ..... 2m). We integrate out x~, x 3..... X2m_ j holding x2, x4..... x2m fixed. Then
m

J(g;q,c,d)=

...

E +bl*'lb2*=

2re,is,. ~I dxzj"
j= 1

< X 2 <X4< " " " '~X2m < d

(8.1o)
Now, let F , ( O ) = f g , ( x ) d x . In Eq. (8.8),

g,(x,,)

if it is an even integer,
if
it =

bt~=~fax g'(x')dx= F'(x2) [ fx,~7'gt(xi,)dxi,- Ft(xi,+l)- Ft(xi,-l)


So

1,

if it=3,5 .... , 2 m - - 1.

J(g; q,c,d)=

f"" f
c " ( x 2 < X 4 "( <X2m < d

[B*Idx2 "'" dx2m

where B*=(b~). Hence


J(g; q,c,d)--

f ''" f
C<X2<X4 ~ ... <X2m<b

dx2dx4...dx2m

F1(x2)

gl(x2)
g2(x2)
g2m(X2)

FI(X4)- FI (x2)
F2(x.)- F2(xO
F2m(X4)-F2m(X2)

F2(x2)
F2m(X2)

gl(x4) g2(x4)

""

F,(x2m ) - F,(X2m_ 2)
F2( X2m) - r2(x2m_ 2)

"'"
""

gl(x:~) g2(x2~)

g2m(x4)

Fzm(X2m ) - F2m(x2m_2)

(8.11)

Computations of some multivariate distributions

769

The above integral c a n be written as

J( g; q , e , d ) =

(8.12)

Case (ii): Let q = 2 m +

1. I n this case (8.13)

where the s u m m a t i o n is over all permutations (i I . . . . . izm+ 1) of (1 . . . . . 2m + 1) and the sign is positive or negative according as the permutation is even or odd. If we integrate out o d d variables, we o b t a i n

(8.14) where if it is an even integer, if it= 1, if i t = 2 m + l, if it = 3,5 ..... 2m - 1. So

(8.15)

770

P. R. Krishnaiah

where B* =(b;~.). We can write (8.15) as follows:


1 d d

J(g; q,c,d)=-m-~, fc "'" fc dx2dx4""dx2m

FI(X 2)

gl(X2)
gz(x2)

/71(X4)
F2(x4)

Fz(x2) X

F2m(X2) g2m(X2) F2m(X4) F2m+l(X2) g2m+l(X2) F2m+l(X4)


gl(X4) g2(x4)
g2m(X4) "'" "'" Fl(X2m ) F2(X2m ) r2m(X2m ) gl(X2m) g2(X2m) g2rn(X2m)

F,(b) r2(b) F2m(b) r2m+l(b) (8.16)

".,
"'"

g2m+l(X4)

"'"

~'2m+l(X2m)

g2m+l(X2m)

Now, let D~ be the d e t e r m i n a n t o b t a i n e d b y deleting the last c o l u m n and ith row in the d e t e r m i n a n t on the right side of (8.16). Then
1

J(g; q,c,d) = ~..

2m+l

i=1

Y' (-1)i+'Fi(d) f a . . . fcaDidx2dx4...dx2,,,.


(8.17)

W e now express the integrals on the right sides of Eqs. (8.10) and (8.15) as Pfaffians by following the s a m e lines as in A p p e n d i x A.7 of M e h t a (1967). A matrix A = (aij) of order p X p is said to be skew-symmetric if a/j = - ajg. It can be shown, by induction, that ]A [ -- 0 when p is a n o d d n u m b e r a n d [A[ is a perfect square w h e n p is an even n u m b e r . W h e n p is an even n u m b e r , the Pfaffian of A is defined to be Pf(A) __ - -

__ aiti2ai3i4

aip_~

where the s u m m a t i o n is over all p e r m u t a t i o n s of i 1..... /e of 1..... p subject to the restrictions ij <i2,i3<i 4..... ip_ 1< ~ and the sign is positive or negative accordingly as the p e r m u t a t i o n is even or odd. Here, we note that

Computations of some multivariate distributions

771

(Pf(A))2= IAb N o w

consider the integral

l*(+,iqz;m,c,d):

fed'"

feddyldY2""d2

xIt,(Y,) xIt2(Y1)

q~l(Y2) q~2(Y2)

l+2m(y,) ~I2m(Y]) dP2m(Y2) "Iq(y2) %(Y2) %~(y9


"'" "'".
"'"

~bl(Ym) ~l(Ym) ~2(Ym) xIr2(Ym)..


+2m(Ym)

'I'2,(Ym) l (8.18)

Suppose the d e t e r m i n a n t in the a b o v e integral is denoted b y [C[ where C = (ci i). T h e n ci~ = t'i(Yj/2) w h e n j is even. W h e n j is o d d cij =eOi(Yo + 0/2). Also

IC[ = E

+~r~elilC2iz"" " C2m,i2m

where the s u m m a t i o n is over all p e r m u t a t i o n s il...i2,,, of 1,2 ..... 2m and the sign is positive or negative accordingly as the p e r m u t a t i o n is even or odd. Now, let

aij = fea[<~i(x)qdj(x) - q>j(x)'Pi(x) ] dx.


T h e n aij = - ~ . So

I*(q~,q~; m , c , d ) = ~_, + - f~ d " " f~ g ClilC2i2"''Czm, i2= dy, .. .dy2m.


W e observe the following: (i) I*(~, 't'; m, c, d) is a sum of certain terms a n d each t e r m is a p r o d u c t of rn n u m b e r s aij. (ii) T h e indices of various aij occurring in each term are all different. These indices range f r o m 1 to 2m. (iii) W e restrict i to be less than j ; if i>j, we replace a;j with -aji.

772

P. R. Krishnaiah

(iv) Coefficient of a typical term ai,ifl6~" .. a i..... 6, is + 1 or - l accord.ingly as the permutation (i I..... i2m) is even or odd. So
6,, = [A[l/2m!" I*(~,q*; m , c , d ) = ~ + a6i ~. . . a ......

(8.19)

Using Eq. (8.19), we can write the right sides of Eqs. (8.12) and (8.17) in terms of Pfaffians. Here, we note that
q q

q,c,d)= f ... f
f ''" f

I1
q

II (x,-x,) 1I dxi
i>j
i=1

C<Xl< *'' < x q < d i = 1

II ~(xi) II (x,-x,)
i<j

1-[ dxi
i=1

(8.20)

< X q ' ~ ' ' " ~.Xl<d i = 1

and so l(q'; q,c,d) can be written in terms of Pfaffians. The results given in Section 8.1 were shown in the literature (see Krishnaiah and Chang (1971 a)) by using the techilique of integration over alternate variables due to Gaudin and Mehta (1960). Mehta (1960) evaluated I(q~; q , c , d ) when q'(x)= x % x p ( - x k) and q = 2 m .
8.2. Exact distributions of the extreme roots of the Wishart and multivariate beta matrices

If we make the transformations & = w i / w p ( i = 1..... p - 1 ) , w v = w p in (8.1) and integrate out gl,..-,gp-1, we obtain the following expression for the density of Wp: k(we) = C, exp( - Wp)w~ +(e- ,)o + (p/2))x
p--I p-1

f-'"
l~gp_~.-.

f
"~gl~o~

exp - w e 2
i~l

~gi
i 1

[g[(&-l)]

5[
i>j=t

(&--N) H d&
i=l

= C z e x p ( - ~Wp)Wpl "~ rP+~-l)(l+(p/2))l(Xi*l; p-- i, 1)

(8.21)

where ~I'l(x ) = exp(- w p x ) ( x - 1)x r. Making the transformations w i = hiw 1 (i---2 ..... p), and w l = w 1 and integrating out h 2..... hp, we obtain the following expression for the density of Wl: f6(w1)
=

C, wT+~'-'~o+(p/2)>exp(-~2w,)I(~2;p - 1,0, 1) (8.22)

where q ' 2 ( x ) = e x p ( - w l x ) ( 1 - x ) x r. By using the same method as above,

Computationsof somemultivariatedistributions

773

we obtain the following expression for the density of the largest root of the multivariate beta matrix B:

f7(01) =

C20T (p- 1)(P+2)/2(1 -01)~I(qt3; p - 1, O, l)

(8.23)

for 0 < 01 < 1 and ~t'3(x ) = ( 1 - - X w l ) S ( l -- X ) X r. The expressions in (8.21)-(8.23) were given in Krishnaiah and Chang (1971a). We know that

P[ c <Ip <l I <d]=CI(gt; p,c,d), P[ l I < d ] = CI(~I'; p,a,d), e[ lp <.<c]= l - CI(~t,; p,c,b)

(8.24) (8.25) (8.26)

when the joint density of the roots l 1 > />/p was given by (8.5) and I(~; q, c, d) was defined by Eq. (8.20). We now consider the problem of evaluating d for given values of a where 1 <wp <w I <d] = ( 1 - 2 a ) P[-~ (8.27)

and wp < . . . <w I are the roots of the central Wishart matrix S 1 defined earlier. From (8.24), we know that

P[ l <.wp <~ Wl <dl = ClI( ~t; P, l , d ),

(8.28)

where ~t'(x) = e x p ( - X ) X (re-t) -- 1 ) / 2 . Using (8.28), and the results in Section 8.1, Clemm et al. (1973b) computed the values of d for a =0.05, 0.025, 0.01, 0.005, p = 2(1) 10(2)20, and m = (p + 1)(1)20(2)30(5)50. These values are given in Table 12. Hanumara and Thompson (1968) gave tables for L and U for p = 2(1)10 and various values of r and a where

P [ L < w v < w , < U] = ( 1 - 2 a )


and

(8.29) (8.30)

e[w.

774

P. R. Krishnaiah

For p =2, the values computed by them are exact. For p >3, they computed the values of L and U by using the approximations

P[wp>iLl ~ ( 1 - a ) ,
P[ w, < U] --~(1 - a).

(8.31)
(8.32)

They used an approximation in computing the left sides of (8.31) and (8.32). We now discuss the problem of evaluating the probability integrals of w l and wp. We are interested in computing c and d for given values of a where

P[w, < d ] = ( 1 - ~ ) , P[wp >c] = ( 1 - a ) .


Using (8.25) and (8.26), we obtain

(8.33) (8.34)

P[w, < d ] = C,I(~; p,O,d), P[ wp >>-c ] = ClI(~ff ; p,c, oo),

(8.35)
(8.36)

where ~P(x)--exp(-i x)x r. Using Eq. (8.35) and the results in Section 8.1, Clemm et al. (1973b) computed the values of d 1 for a--0.01, 0.025, 0.05, 0.10, p = 2(1)10(2)20 and m = (p + 1)(1)20(2)30(5)50 where

e[w,>a,]=(1-.).

(8.37)

The computer program developed by the above authors is used to compute the values of c and d for certain values of a, p and m where c and d are given by Eq. (8.34) and Eq. (8.33) respectively. The values of d and c are given in Tables 13 and 14 respectively. Here we note that Pillai and Chang (1970) computed approximate values of d for a =0.05, 0.10,p =2(1)20 and different values of v. Also upper 10%, 5%, 2.5% and 1% values of the distribution of Wp were given by Clemm et al. (1973a). Next, consider the problem of computing the values of a for given values of A where
P [ 1 - A < 8p ~<0, < A ] = (1 -- a) (8.38)

and 8p < - - - <01 are the roots of SI(SI+ $2) -1 defined earlier. From Eq.

Computations o f some multivariate distributions

775

(8.24), we observe that


P[ 1 - A <07 <01 <<.A ]=CzI(q~; p , I - A , A )
(8.39)

where xtC(X)=xr(l--x) s, r = l ( m - - p - 1 ) and s = ( n - p - 1 ) . Using Eq. (8.39) and the results of Section 8.1, Schuurmann et al. (1973a) computed the values of A for a=0.100, 0.050, 0.025, 0.010, p=2(1)10, r = 0(1)5,7, 10, 15 and s=5(1)10(2)20(5)50. Some of these values are given in Table 16. Using the computer program developed by Schuurmann et al., values of A* are computed for certain values of a, p, r and s where

P[OI <<.A*J=(1-a ).

(8.40)

These values are given in Table 17. Heck (1960) gave charts of the values of A* for a=0.05,0.025,0.01, p=2(1)5, r = - , 0(1)10 and s>~5. These charts are given in the Appendix. Pillai (1960, 1964, 1965, 1967) computed extensive tables for the upper percentage points of 01 by using approximations. The approximations used by Pillai are good for several practical purposes. Chang (1974) computed the values of A* for p=2(1)5, a = 0.05,0.01, r=0(1)10,15, s=2(1)10, 15,20 by using the exact expression based on the results given in Section 8.1.

8.3. Exact distribution of the intermediate roots of a class of random matrices


Let the joint density of the eigenvalues/p < . . . <l 1 of a random matrix be of the form (8.5). The cumulative distribution function (c.d.f.) of l~+ 1 (1 <r < p - 2 ) is given by

e[lr+l < x ] = e [ l , < x ] + e[lp<<.... <lr+, <X<<.lr<<... <ll]


(8.41) where

e[~ <--. <z,+l <x<tr <

<ll]
P P

= f---f
R1

i= l

II "~(o II (~:- ~) II d~,


i <j i~ l

(8.42)
and RI: a < / p < . . - < < . l r + l < x < . l r < ' ' "

<ll<b. We can expand the

776

P. R. Krishnaiah

Vandermonde determinant (using Laplace expansion) by the first r columns and obtain the following:
1 1 -.. 1

=El(-

l)'(' +~)/:+~:-'~"v(z~ ..... 5; kl ..... I,,)

X V(lr+ 1. . . . ,/p; t, . . . . . t~_,) ( 8 . 4 3 )


where

v ( x , . . . . . :,q;

kl . . . . .

~,q)=
.~q X; q "'* xlkq

(8.44)

In Eq. (8.43), Y'l denotes the summation over all ( \P / /) possible choices of k i . . . k r. A l s o k l . . . k r is a subset of the integers {0..... p - l ) , t 1 < - . . < tp_ r is the subset complementary to k 1< - - . < k r. So, Eq. (8.42) may be written as
e[~ < ..o

~+,+1~ < l , < - . -

<tl]=

~l(--1) r(r+3)/2+]~:-'k' f ''+ f f I x~c(Zi)U(ll,...,Zr; kl ..... kr)dll'*dlr x<l~<... ,~ll<b i=l


X (-- 1)P(P-1)/2
P

a<lp~ ''' <lr+l<X


H
i=r+l

f *o f

'~(li)V(lr+l,...,lv; Ii . . . . . / p - , ) d l ~ + l ' " d / e

(8.45)

The integrals on the right side of Eq. (8.45) are of the same form as the integral in Eq. (8.8) and so they can be evaluated by using the results in Section 8.1. The above method of evaluating the c.d.f, of the intermediate roots was given by Krishnaiah and Waikar (1971a). Now, let wp < < w 1 be the roots of the central Wishart matrix S 1 and let 0p < . . . -<<01be the roots of S I ( S 1 + S z ) -1 as defined earlier. Using the expressions given above, Clemm et al. (1973a) computed the values of a for

Computations of s o m e multivariate distributions

777

a = 0.10, 0.05, 0.025, 0.01, p = 2(1) 10, m = (p + 1)(1)20(2)30(5)50 where

P[ wi < a ] = (1 - a)

(8.46)

and i = 2, 3..... p - 1. Some values of a are given in Table 15. 'l-he expressions given in this section were also used by Krishnaiah et al. (1973) to compute the values of b for a = 0 . 0 5 , 0.01, p = 4 , 5 , 6 , 7 , r - 0 ( 1 ) 5 , 7 , 1 0 , 1 5 , s = 5(1)10(2)20(5)50 where

P[ 0i < b ] = (1 - a)

(8.47)

and i = 2, 3 ..... p - I. When p = 8, the above authors also constructed upper 5% and 1% points of the distributions of 02 and 07 for the same values of r and s. Some of the values of b computed by Krishnaiah et al. (1973) are given in Table 18.

9. Distributions of the ratios of the extreme roots and ratios of the individual roots to the sum of the roots

The joint density of the roots of the Wishart matrix S, is given by Eq. (8.1). Davis (1972a) showed that

( (1 + w) (2,p +p2 +p. 4)/2fp0)r(1/( 1 + w)) } =

= 2r(

p(2r + p + 1)) eSs(2-p(2r +p + l)/2)g~)~(2s)


(9.1)

where E(h(w)) denotes the Laplace transformation of h(w), g(J) ~p,r is the density of wj, f y ) is the density of uj and uj= wJE~=lW i. Krishnaiah and Chang (1971a) gave exact expressions for the densities of w 1 and wp whereas Krishnaiah and Waikar (1971a) obtained exact expressions for the densities of wj ( 2 < j < p - 1); these expressions are given in Section 8 of this chapter. The above expressions may be written in the following form when r is an integer:
N i=1

(9.2)

778

P. R. Krishnaiah

In the above equations, it is quite tedious to obtain algebraic expressions for the constants d~j, rnij and ~ii" But, they can be computed using a computer when p is not large. Using Eq. (9.2) in Eq. (9.1) and inverting Schuurmann et al. (1973b) obtained expressions for the densities of u 1 and up whereas Krishnaiah and Schuurmann (1974) obtained expressions for uj (2 < j < p - 1). These expressions are of the following form:
~p',)r(Z) = C l Sp(2r+p +

o/2)r( (p(2r

+p + 1) -- 2))

U (1 -- (~lij + 1)Z)~'J -I E d?
i=1

z~J-l(mij--1)!

z(p(2"+p+')-4)I2

(9.3)

where (x)+ is equal to x or 0 according as x > 0. When r is not an integer, fp,r)(z) is of the same form as Eq. (9.3) with N replaced by oo. Using Eq. (9.3), Schuurmann et al. (t973a) computed values of c and d for a =0.01, 0.05, p =3(1)6, and different values of r where P[ u, < c] =- (1 - a), (9.4)

P[ Up~>d] = (1 -

a).

(9.5)

More extensive tables are given in a technical report by Schuurmann et al. (1973b). Percentage points of u 1 and Up are given in Table 19. Krishnaiah and Schuurmann (1974a) computed the values of a and b by using Eq. (9.3) for a =0.05, 0.01, p--3(1)6 and different values of r where

P[ u 2 <a] = (1 - a),

(9.6)
(9.7)

P[u,_,

More extensive tables of the percentage points of u 2 and up_ 1 are given in a technical report by Krishnaiah and Schuurmann (1974b). Percentage points of u 2 and Up_ 1 are given in Table 19. We now discuss the computation of the ratio of the extreme roots of the Wishart matrix. We know that

P[fPl<'c]=l-Cif=(f

''' fD

hi(f21 ..... fPl'W')i=21~[ dfji}dwi


(9.8)

where F~, =

wi/Wl, hi(f21 . . . . . D I ' w,) is the joint density of f2, ..... fp. and w 1

Computations of some multivariate distributions

779

and D : c < fpl < " " " < f21 < 1. Using Eq. (9.8) and the results in Section 8.1, Krishnaiah and Schuurmann (1974a) obtained the following expression for the distribution function of fp~ when r is an integer:

N* P[ fp, ~c] "~"1-C1 E d i c r i r ( n P - l ) i ) / ( c s i -~ ti dr. 1)(ha/2)-v/ i=l


(9.9) where the constants d i, r~, s t, tg and v can be computed explicitly by using a computer when p is not large. Table 20 gives the values of c for a = 0.05, 0.01, p = 3, 4 and different values of n where P[ ~ ,

<c] = (1 -

a).

(9.10)

Next, let 01 ~ - . . >~Op be the roots of S I ( S I + $2) -1 where S 1 and S 2 are distributed independently as central Wishart matrices with m and n degrees of freedom respectively and E ( S 1 / m ) = E ( S 2 / n ) . Then, the joint density of the roots of 0l,...,0p is given by Eq. (8.2). Krishnaiah and Schuurmann (1974b) computed the values of a for a =0.990, 0.975, 0.950, 0.900, 0.750, 0.500, 0.250, 0.100, 0.050, 0.025, 0 . 0 1 0 , p = 2 ( 1 ) 7 where

0:
Table 21 gives the values of a for a =0.95, 0.990 a n d p =2(1)7.

10. Distributions of the traces of multivariate beta and multivariate F matrices In this section, we discuss the evaluation of the probability integrals associated with the distributions of T~ and T2 where T I = t r S I ( S 1 + $2) -1 and T 2 -- tr S 1S~- 1; here Sj and S 2 are independently distributed Wishart matrices defined in the preceding section. We refer to T 1 and T 2 as the traces of the multivariate beta matrix and multivariate F matrix respectively. N a n d a (1950) derived the exact distribution of T 1 f o r p = 2 , 3 , 4 and r = 0 where r = (m - p - 1). The paper of N a n d a is the first significant paper on the distribution of T I. Pillai and Jayachandran (1967) computed the exact percentage points of the distribution of T 1 when p = 2, whereas Pillai and Jayachandran (1970) computed the exact percentage points of T~ f o r p = 3,

780

P. R. Krishnaiah

r = 1,2, 3, s = 5(5)25 and p = 4, r = 0, 1, s = 5(5)25. The density of T l satisfies a differential equation (see Davis (1970c)). Pillai (1960) computed approximate upper 5% and 1% points of the distribution of T~ for p =2(1)8. We now discuss a method of the evaluation of the probability integral of T 1 When p = 2q, the Laplace transform of T~ is

C2 L ( T I ' t ) = -q-(. Z +- b,,i2(t)b,3i,(t)" " bi2, ,i~q(t)


whereas

(10.1)

t ( r l , t):--~.

c2 2q

Z (-- l)UFr-l-u([,1) ~ -t-bili2([)'" "bi2q_qi2q(~) u=o

(10.2) when p = 2q + 1. In the above equations,

bij(t)-J+r-ltt~-ji+r-I~ ,,
fv~(t) =

Fr(t, 1)= f o l e X p ( - t x ) x ~ ( 1 - x ) S d x , - t ( x + 0))(1 - x)'(1 -O)S(O"xV-x"OV)dxdO,

f0'f0exp(

and s= ( n - p - 1 ) . In Eq. (10.1), Y. denotes the summation over all permutations i l,...,i2q of 1,2,...,2q subject to the restrictions i I < @ i 3 < i4..... i2q - 1 < izq and the summation is positive or negative according as the permutation is even or odd. In Eq. (10.2), ~ , denotes the summation over all permutations i I..... i2q of 1,2 ..... u, u + 2,..., 2q + 1 subject to the restriction i 1 <i 2. . . . . i2q_ 1<i2q and the summation is positive or negative according as the sign is positive or negative. Also, if(t)= ~ (a m ) 2R ( t ; i + a , j + a )

a~O

+2(
al <a2

m al )( am2 )(--1)a)+ a2[ R ( t ; i +

a,,j + a2)+ R ( t ; i + a2, j + al)],

(lO.3)
R(t; a,a + k ) = f0' f0exp { - t(x + O) }(XaO a + k - Xa+kO a) d x dO
(10.4) and k is a nonnegative integer. The right-hand side of Eq. (10.4) can be written as a linear combination of the incomplete gamma integrals. Now,

Computations of some multivariate distributions

781

the distribution of T~ can be obtained by taking the inverse Laplace transformation of Eqs. (10.1) and (10.2). The above results on the derivation of the distribution of T~ were given by Krishnaiah and Chang (1972). When r and s are integers, the density of T 1 can be expressed as a finite series as follows:

f(T')=-~.v ~/d~" (mi-l)!

( T, - k , ) ? - '

'

O<~TI<~P

(10.5)

where d/, k i and m i are certain constants; here (x)+ is equal to x or 0 according as x/> 0 or otherwise. The constants d,., k i a n d m i can be computed by using a computer when p is not large. Using Eq. (10.5), Schuurmann et al. (1975) computed the values of c for a = 0.01, 0.025, 0.05, 0.100, p = 2, 3, 4, 5 and different values of r and s where

fo~f(T,)dT,=(1-a).

(10.6)

These percentage points are given in Table 22. We will now discuss the distribution of T2. The distribution of T 2 was derived by Hotelling (1951) for p = 2 . The distribution of T 2 can be approximated with a suitable Pearson type curve by using the first four moments. Using this approximation, Pillai and his colleagues (see Pillai (1960)) computed upper 5% and 1% points of the distribution of T2 for p = 2 ( 1 ) 8 . When r is an integer, Pillai and Young (1971) expressed the distribution of T 2 as a linear combination of the inverse Laplace transformations of certain pseudo-determinants. The above authors gave explicit expressions for the above inverse Laplace transformations for p = 3, r = 0(1)5, and p = 4, r = 0, 1,2. Exact percentage points of the distribution of T 2 were given by Davis (1970b), G r u b b s (1954) and PiUai and Young (1971) for p = 2, 3, 4 and some values of r. Davis (1968) showed that the density of T 2 satisfies a differential equation. Krishnaiah and Chang (1972) expressed the exact density of T 2 as a linear combination of terms of the form h(Tz; al,fll),h(T2; a2,f12),... ,h(T2; O/q,q) where denotes convolution, q is the integral part of (p + 1), and h(x; a, r ) -- fl ~'/(fl + x) '~ for 0 < x < ~ . Table 23 gives the approximate percentage points of the distribution of T 2 given in Pillai (1960) whereas Table 24 gives the exact percentage points of T 2 computed by Davis (1970a, b).

Appendix
A brief explanation of various tables and a chart given in this Appendix is given below for the convenience of the users of the tables.

782

P. R. Krishnaiah

Table 1: Percentage points of the multivariate t distribution Let x'-~(x~ .... ,xp) be distributed as a multivariate normal with m e a n vector 0 and covariance matrix oz~2 where f~= (Pij), Pti = 1 ( i = 1,2 . . . . . p) and Oij = 0 (iv~J = 1, 2 ..... p). Also, let s 2 / a 2 be distributed independent of x as chi-square with n degrees of freedom. In addition, let ti = x i V ~ n / s for i = 1,2 .... ,p. The entries in Table 1 give the values of a for different values of a, p, and n where

i=1,2 .... , p ] = ( 1 - . ) .

(A.1)

The entries in this table are reproduced from Krishnaiah and Armitage (1966) with the kind permission of the Indian Statistical Institute. Table 2: Percentage points of the studentized largest chi-square distribution Let F / = ns2/ms0Z ( i = l,2 . . . . . p ) where So,S~,S 2 2 2. . . . . s~ 2 are distributed independently as chi-square variables with n , m , m . . . . . m degrees of freedom respectively. Also, let u = m a x ( F l , . . . ,Fp). The entries in Table 2 gives the values of c for different values of a, m, n and p where

e[u.<c]

(A.2)

The entries in this table are reproduced from Armitage a n d Krishnaiah (1964). Table 3: Percentage points of the studentized smallest chi-square distribution Let v =min(F~ ..... Fp) where F1,...,Fp are defined in the description of Table 2. Table 3 gives the values of b for different values of a, m, n and p where

(A.3)
The entries in this table are reproduced from Krishnaiah and Armitage (1964). Table 4: Percentage points of the distribution of the range Let R = ( x 0 ) - x ( n ) ) / o where x0) and X(n) respectively denote the maxim u m and minimum of n r a n d o m variables which are distributed independently and normally with m e a n 0 and variance o 2. Table 4 gives the values of c for different values of a and n where P[R<c]=(I-~). (A.4)

Computations of some multivariatedistributions

783

Table 5: Percentage points of the distribution of the studentized range Let T = R / 6 where R was defined in the description of Table 4 and v62/02 is distributed independent of R as chi-square with v degrees of
freedom. The entries in Table 5 give the values of d for different values of a, v and n where

P[ T<.d] = (1 - a).

(A.5)

The entries in Tables 4 and 5 are reproduced from Harter (1960) with the kind permission of the author and the Institute of Mathematical Statistics.

Table 6: Percentage points of the bivariate Phi-square distribution Let S=(sij): p p be a central Wishart matrix with m degrees of freedom and E ( S / m ) = Z where Z = ( o i j ) . Also, let yi=sii/Ou ( i = 1..... p ) and p = o i j / ~ for i@j= 1..... p. Table 6 gives the values of d for
different values of a, m and p where

P [ y l ~ d , y2<~d]=(1-a).

(A.6)

Table 7: Bounds on percentage points of the multivariate chi-square distribution Let Yl ..... yp be as defined in the description of Table 6. The entries in
Table 7 are the values of d* given by the equation

1-pP[yl>~d*]+(2)P[y,>~d*,y2>~d']=(1-a )

(A.7)

for different values of a, m and p where p=oij/~OuO; (i4=j = 1..... p). The values of 1 - a * for the above values of d* are given in parenthesis in Table 7 where a* is given by the equation

pe[ y,

Tables 6 and 7 are computed by P. R. Krishnaiah and F. J. Schuurmann (unpublished).

Table 8: Exact percentage points of the multivariate chi-square distribution with one degree of freedom
Let x 1..... xp be as defined in the description of Table 1 and let

zi = xi2/O 2 ( i = 1,2 . . . . . p). Table 8 gives the values of c for different values

784 of a, p and 0 where

P. R. Krishnaiah

P[z i <c; i = 1,2 . . . . . p ] = ( l - a ) .

(A.8)

The entries in Table 8 are reproduced from Krishnaiah and Armitage (1965c) with the kind permission of the editor of Trabojos de Estadistica.

Table 9: Percentage points of the bivariate F distribution Let S = ( s u ) be as defined in the description of Table 6. Also, let s 2 / o 2 be distributed independent of sll . . . . . See as chi-square with n degrees of freedom. In addition, let Fi=siio2/s2oii ( i = 1,...,p). Table 9 gives the values of d for given values of c~, m, n and p where P[ F l <.d,F2 < d ] = (1 - a).
(A.9)

The entries in this table are reproduced from Schuurmann et al. (1975) with the kind permission of the Indian Statistical Institute.

Table 9a: Exact percentage points of the multivariate F distribution Table 9a gives the values of d for m = 1 and different values of a, n,p and 0 where
P[ F / < d ; i = 1,2 . . . . . p ] = ( l - a ) and F l ..... Fp were defined in the description of Table 9. The entries in this table are reproduced from Krishnaiah and Armitage (1970) with the kind permission of the University of North Carolina Press.

Table 10." Bounds on the probability integrals of the multivariate F distribution Let Fl,...,Fp be as defined in the description of Table 9. Table 10 gives the values of d* for different values of a l, p, m, n and p where

The values in parenthesis of the table are the values of 0L 2 for the above values of d* where

pP[ F, > d* ] = a z.
Table 10 is computed by P. R. I ~ i s h n a i a h and F. J. Schuurmann (unpublished).

Tables l l a, 11 b, l l c." Approximate pereentage points of T~a x statistic Let Ti2=y~.S- lyiu, ( i = 1,2 . . . . . N), where Y'=(Y'I ..... Y~v) is distributed as a multivariate normal with m e a n vector 0 and covariance matrix B Z where denotes Kronecker product, E is of o r d e r p P,Yi is of o r d e r p 1

Computationsof some multivariate distributions

785

and B is of order N N. Also, the diagonal elements of B are equal to/3 and the off-diagonal elements are equal to 6. In addition, S is distributed independent of y as a central Wishart matrix with v degrees of freedom and E ( S ) = rE. Tables 1 la, 1 lb, 1 lc give approximate values of c f o r p = 2 and different values of N, v, /3, 8 and a. The entries in these tables are reproduced from Siotani (1959b, 1960)with the kind permission of the editor of the Annals of the Institute of Statistical Mathematics. Table 12: Percentage points of the joint distribution of the extreme roots of the Wishart matrix Let $1: p p be a central Wishart matrix with m degrees of freedom and E ( S I / m ) = I . Also, let w I >1... >~w e be the eigenvalues of S 1. Table 12 gives the values of d for different values of c~, p and m where P[ 1/d<<.wp <w, < d ] = (1 - a). (A.12)

The entries in this table are reproduced from Clemm et al. (1973b) with the kind permission of G o r d o n and Breach Company. Table 13." Percentage points of the distribution of the largest root of the Wishart matrix The entries in this table give the values of d for different values of a, p and m where P[ w, < d I = (1 - a) (A. 13)

and the notations a, p, m, d and w~ are defined in the description of Table 12. The entries in this table are computed by using the computer program developed by Clemm, Krishnaiah and Waikar. Table 14: Percentage points of the smallest root of the Wishart matrix Table 14 gives the values of c for different values of a , p , m where P[ w, >/c] : (1 - a) (A.14)

and the notations Wp,p, and m are defined in the description of Table 12. The entries in this table are computed by using the computer program developed by Clemm, Krishnaiah and Waikar. Table 15: Percentage points of the individual roots of the Wishart matrix Table 15 gives the values of a for different values of ~, m, p and i (1 < i < p - 1) where

P[wp_i+ 1 < a ] =(1 - a)

(1.15)

and m, p and wi are defined in the description of Table 12. The entries in

786

P.R. Krishnaiah

this table are reproduced from Clemm et al. (1973a) with the kind permission of Gordon and Breach Company.

Table 16: Percentage points of the joint distribution of the extreme roots of the multivariate beta matrix Let 01~> . . . ~>0p be the roots of S~(S~+S2) -~ where St: p p and $2:
p p are distributed independently as central Wishart matrices with m and n degrees of freedom respectively and E ( S 1 / m ) = E ( S 2 / n ) = I . Table 16 gives the values of A for different values of a, p, r and s where P[ 1 - A

<07 <~0, ~<A] = ( 1 -

a),

(A.16)

r=(m-p-1)

and s = ( n - p - 1 ) . The entries in this table are reproduced from Schuurmann et al. (1975b) with the kind permission of G o r d o n and Breach Company.

Table 17." Percentage points of the largest root of the multivariate beta matrix
Table 17 gives the values of A * for different values of a, p, r and s where

P[Ol <~A*]=(l--a )

(A.17)

wherep, r, s and 01 are defined in the description of Table 16. The entries in this table are computed by using the computer program developed by Schuurmann, Waikar and Krishnaiah. Chart 1: Chart for the largest root of multivariate beta matrix This chart gives the values of A* for different values of a, p, r and s where

and the notations p, r and s are defined in the description of Table 16. This chart is reproduced from Heck (1960) with the kind permission of the Institute of Mathematical Statistics. On each page, the curves give the values of A * for particular values of p, r, s and a. The curves in the lower section are continuation of the curves in the upper section with an overlap of the values of A* from 0.525 to 0.550. The twelve curves in each page correspond to the values of r = - , 0(1)10 with the following exception; for p = 2 , a = 0 . 0 2 5 , 0.05, the lower curves give the values for r--0(1)10. The upper scale for A* corresponds to the upper set of curves whereas the lower scale corresponds to the lower set of curves. The lowest curve in each case (with the exception of the cases p = 2, a = 0.025, 0.05) corresponds to r = - , the next lowest to r = 0, the next to r = 1, etc. The scale for s is logarithmic and it is on the left margin of the page.

Computations of some multivariate distributions

787

Table 18: Percentage points of the intermediate roots of the multivariate beta matrix Table 18 gives the values of b for different values of a, p, r, s and i (2 ~<i ~<p - 1) where e[op_i+, < ~ b ] = ( l - a )
(A.18)

a n d p , r, s and 0i are defined in the description of Table 16. The entries in this table are reproduced from Krishnaiah et al. (1973).

Table 19: Percentage points of the ratios of the individual roots to the trace of the Wishart matrix Table 19 gives the values of c for different values of a, p, r a n d j where
P[ uj ~<c] -- (1 - a) (A. 19)

where u : = w J t r S l and the notations wj, S~, p, and r are defined in the description of Table 12. Upper 5% and 1% points of u~ and lower 5% and 1% points of Up are reproduced from Schuurmann et al. (1973b) whereas upper 5% and 1% of u 2 and lower 5% and 1% of Up_ 1 are reproduced from Krishnaiah and Schuurmann (1974a) with the kind permission of Academic Press.

Table 20." Percentage points of the ratio of the extreme roots of the Wishart matrix Table 20 gives the values of c for a, p and m where
P[ fpl "<< e] = (1 - a), (A.20)

fpl = Wp/Wl and the notations p, m, w 1 and Wp are defined in tile description of Table 12. U p p e r 10%, 5%, 2.5% and 1% points of the distribution of fpl are reproduced f r o m Krishnaiah and Schuurmann (1974b). Table 21: Percentage points of the ratio of the extreme roots of the multivariate beta matrix Table 21 gives the values of a for different values of a, p, r and s where

Op<~a] =(l-a)

(A.21)

where 01, Op, p, r and s are defined in the description of Table 16. The entries in this table are reproduced from Krishnaiah and Schuurmann (1974b).

788

P. R. Krishnaiah

Table 22." Percentage points of the trace of the multivariate beta matrix Table 22 gives the values of b for different values of a, p, r and s where

P[ T 1 <b]

=(I-c 0

(A.22)

where rl=trSl(Sl-I-S2) -1, a n d S1, $2, p, r and s are defined in the description of Table 16. The entries in this table are reproduced f r o m Schuurmann et al. (1975) witli the kind permission of G o r d o n and Breach Company. Table 23: Approximate percentage points of the trace of the multivariate beta matrix Table 23 gives approximate values of b for different values of a, p, r and s where

P[Tl <.b]=(l-~).
The notations used in this table are the same as used in the description of Table 22. The entries in this table are reproduced from Pillai (1960) with the kind permission of the author. Table 24." Percentage points of the trace of the multivariate F matrix Table 24 gives the values of c for different values of a, p, m and n where

,In mr <e ]
T 2 = t r S I S 2 ! and S 1, S2, p, m and n are defined in the description of Table 16. T h e entries in this table are reproduced from Davis (1970a, b) with the kind permission of Biometrika Trust and the editor of the Annals of the Institute of Statistical Mathematics.

Computations of some multivariate distributions


Table 1 Percentage Points of the Multivariate t Distribution a = 0.05, p = 0.0

789

np
5 6 7 8 9 10 tl 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 3t 32 33 34 35

1
2.01 1.94 1.89 1.86 1.83 1.81 1.79 1.78 1.77 1.76 1.75 1.75 1.74 1.73 1.73 1.72 1.72 1.72 1.71 1,71 1.71 1.71 1.70 1.70 1.70 1.70 1.70 1.69 1..69 1.69 1.69

2
2,53 2.42 2.34 2.28 2.24 2.21 2.18 2.16 2.15 2.13 2.12 2.11 2,10 2.09 2.08 2.08 2,07 2.06 2.06 2.05 2.05 2.05 2.04 2.04 2.04 2.03 2.03 2.03 2.03 2.02 2.02

3
2.84 2.70 2,60 2.53 2.48 2.44 2.41 2.38 2.36 2.34 2.32 2.31 2.30 2.29 2.28 2.27 2.26 2.26 2.25 2,24 2.24 2.23 2.23 2.22 2,22 2,22 2.21 2.21 2.21 2.21 2.20

4
3.06 2.89 2.78 2.70 2.64 2.60 2.56 2.53 2.51 2.48 2.47 2.45 2.44 2.42 2.41 2.40 2.39 2.39 2,38 2.37 2.37 2.36 2.36 2.35 2.35 2.34 2.34 2.34 2.33 2.33 2.33

5
3.23 3.05 2.92 2.84 2.77 2.72 2.68 2.65 2,62 2.59 2.57 2.56 2.54 2.53 2.52 2.51 2.50 2.49 2.48 2.47 2.47 2.46 2.46 2,45 2.44 2.44 2,44 2.43 2,43 2.42 2.42

6
3.38 3,17 3.04 2.95 2.87 2.82 2.77 2.74 2.71 2.68 2.66 2.64 2.63 2.61 2.60 2.59 2.58 2.57 2.56 2.55 2.55 2.54 2.53 2.53 2.52 2.52 2.51 2.51 2.51 2.50 2.50

7
3.50 3.28 3.14 3.04 2,96 2.90 2.86 2.82 2.79 2.76 2.74 2.72 2.70 2.68 2.67 2.66 2.65 2.64 2.63 2,62 2.61 2.60 2,60 2.59 2.59 2.58 2.58 2.57 2.57 2.56 2.56

9 3,69 3.45 3.30 3.19 3.10 3.04 2.99 2.94 2.91 2.88 2.86 2.83 2.81 2.80 2.78 2.77 2.76 2.75 2.74 2.73 2.72 2,71 2.70 2.70 2.69 2.69 2.68 2.68 2,67 2.67 2.66

10 3.77 3.53 3.36 3.25 3.16 3.10 3,04 3.00 2.96 2.93 2.91 2.88 2.86 2.84 2.83 2.81 2.80 2.79 2.78 2.77 2.76 2.75 2.75 2.74 2.73 2.73 2.72 2.72 2.71 2.71 2.71

3.60 3.37 3.22 3.12 3.04 2.97 2.93 2.89 2.85 2.82 2.80 2.78 2.76 2,74 2.73 2.72 2.71 2.70 2.69 2.68 2.67 2.66 2.65 2.,65 2.64 2.64 2.63 2.63 2.62 2.62 2.61

790

P.R. Kr~hnaiah

Table 1 (Continued) a =0.05, p=0.2 n p 1 2 3 4 5 6 7 8 9 10

5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35

2.01 1.94 1.89 1.86 1.83 1.8t 1.79 1.78 1.77 1.76 1.75 1.75 1.74 1.73 1.73 1.72 1.72 1.72 1.71 1.71 1.71 1.71 1.70 1.79 1.70 1.70 1.70 1.69 1.69 1.69 1.69

2.51 2.39 2.32 2.27 2.23 2.19 2.17 2.15 2.13 2.12 2.11 2.09 2.09 2.08 2.07 2.06 2.06 2.05 2.05 2.04 2.04 2.03 2.03 2.03 2.02 2,02 2,02 2,02 2.01 2.01 2.01

2.79 2.66 2.56 2.50 2.45 2.41 2.38 2.35 2.33 2.32 2.30 2.29 2.28 2.27 2.26 2.25 2.24 2.24 2.23 2.22 2.22 2.21 2.21 2.21 2.20 2.20 2.20 2.19 2.19 2.19 2.19

3.00 2.84 2.73 2.66 2.60 2.56 2.53 2.50 2.47 2.45 2.44 2.42 2.41 2.40 2.39 2.38 2.37 2.36 2.36 2.35 2.34 2.34 2.33 2.33 2.33 2.32 2.32 2.32 2.31 2.31 2.31

3.15 2.98 2.87 2.78 2.72 2.68 2.64 2.61 2.58 2.56 2.54 2.52 2.51 2.50 2.49 2.48 2.47 2.46 2.45 2.45 2.44 2.43 2.43 2.42 2.42 2.41 2.41 2.41 2.40 2.40 2.40

3.28 3.10 2.97 2.89 2.82 2.77 2.73 2.70 2.67 2.64 2.62 2.61 2.59 2.58 2.57 2.55 2.55 2.54 2.53 2.52 2.52 2.51 2.50 2.50 2.49 2.49 2.49 2.48 2.48 2.47 2.47

3.39 3.19 3.06 2.97 2.90 2.85 2.80 2.77 2.74 2.71 2.69 2.67 2.66 2.64 2.63 2.62 2.61 2.60 2.59 2.58 2.58 2.57 2.57 2.56 2.55 2.55 2.55 2.54 2.54 2.53 2.53

3.48 3.28 3.14 3.04 2.97 2.91 2.87 2.83 2.80 2.77 2.75 2.73 2.72 2.70 2.69 2.68 2.67 2.66 2.65 2.64 2.63 2.63 2.62 2.61 2.61 2.60 2.60 2.59 2.59 2.59 2.58

3.56 3.35 3.21 3.11 3.03 2.97 2.92 2.89 2.85 2.83 2.80 2.78 2.7'1 2.75 2.74 2.73 2.71 2.71 2.70 2.69 2.68 2.67 2.67 2.66 2.65 2.65 2.64 2.64 2.64 2.63 2.63

3.63 3.41 3.27 3.16 3.08 3,02 2.97 2.94 2.90 2.87 2.85 2.83 2.81 2.80 2.78 2.77 2.76 2.75 2.74 2.73 2.72 2.71 2.71 2.70 2.70 2.69 2.69 2.68 2.68 2.67 2.67

Computations of some multivariate distributions


Table 1 (Continued) a =0.05, p=0.4 n p 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 1 2.01 1.94 1.89 1.86 1.83 1.81 1.79 1.78 1.77 1.76 1.75 1.75 1.74 1.73 1.73 1.72 1.72 1.72 1.71 1.71 1.71 1.71 1.70 1.70 1.70 1.70 1.70 1.69 1.69 1.69 1.69 2 2.47 2.36 2.29 2.24 2.20 2.17 2.14 2.12 2.11 2.09 2.08 2.07 2.06 2.06 2.05 2.04 2.04 2.03 2.03 2.02 2.02 2.01 2.01 2.01 2.00 2.00 2.00 2.00 1.99 1.99 1.99 3 2.72 2.60 2.51 2.45 2.40 2.37 2.34 2.32 2.30 2.28 2.26 2.25 2.24 2.23 2.22 2.22 2.21 2.20 2.20 2.19 2.19 2.18 2.18 2.18 2.17 2.17 2.17 2.16 2.16 2.16 2.16 4 2.91 2.76 2.66 2.60 2.54 2.50 2.47 2.45 2.42 2.40 2.39 2.37 2.36 2.35 2.34 2.33 2.33 2.32 2.31 2.31 2.30 2.30 2.30 2.29 2.29 2.28 2.28 2.28 2.27 2.27 2.27 5 3.04 2.89 2.78 2.71 2.65 2.61 2.57 2.54 2.52 2.50 2.48 2.47 2.46 2.44 2.43 2.42 2.42 2.41 2.40 2.40 2.39 2.38 2.38 2.38 2.37 2.37 2.36 2.36 2.36 2.35 2.35 6 3.16 2.99 2.88 2.80 2.74 2.69 2.65 2.62 2.60 2.58 2.56 2.54 2.53 2.52 2.51 2.50 2.49 2.48 2.~7 2.47 2.46 2.45 2.45 2.44 2.44 2.44 2.43 2.43 2.43 2.42 9 42 7 3.25 3.07 2.96 2.87 2.81 2.76 2.72 2.69 2.66 2.64 2.62 2.60 2.59 2.58 2.57 2.56 2.55 2.54 2.53 2.52 2.52 2.51 2.51 2.50 2.50 2.49 2.49 2.49 2.48 2.48 2.48 8 3.33 3.15 3.02 2.94 2.87 2.82 2.78 2.75 2.72 2.70 2.68 2.66 2.64 2.63 2.62 2.61 2.60 2.59 2.58 2.57 2.57 2.56 2.56 2.55 2.55 2.54 2.54 2.53 2.53 2.53 2.52 9 3.40 3.21 3.08 2.99 2.92 2.87 2.83 2.80 2.77 2.74 2.72 2.71 2.69 2.68 2.66 2.65 2.64 2.63 2.62 2.62 2.61 2.60 2.60 2.59 2.59 2.58 2.58 2.57 2.57 2.57 2.56

791

10 3.46 3.27 3.13 3.04 2.97 2.92 2.87 2.84 2.81 2.78 2.76 2.74 2.73 2.72 2.70 2.69 2.68 2.67 2.66 2.66 2.65 2.64 2.64 2.63 2.63 2.62 2.62 2.61 2.61 2.60 2.60

792

P. R. Krishnaiah

Table 1 (Continued) ~ =0.05, 0=0.5 n p 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 1 2.01 1.94 1.89 1,86 1.83 1.81 1.79 1.78 1.77 1.76 1.75 1.75 1.74 1.73 1.73 1.72 1.72 1.72 1.71 1.71 1,71 1.71 1,70 1.70 1.70 1.70 1.70 1.69 1.69 1.69 1.69 2 2.44 2.34 2.27 2.22 2.18 2.15 2.13 2.11 2.09 2,08 2.97 2.06 2.05 2.04 2.03 2.03 2.02 2.02 2.01 2.01 2.00 2.00 2.00 1.99 1.99 1.99 1.99 1.98 1.98 1.98 1.98 3 2.68 2.56 2.48 2.42 2.37 2.34 2.31 2.29 2.27 2.25 2.24 2.23 2.22 2.21 2.20 2.19 2.18 2.18 2.17 2.17 2.16 2.16 2.16 2.15 2.15 2.15 2.14 2.14 2.14 2,14 2.13 4 2.85 2.71 2.62 2.55 2.50 2.47 2.43 2.41 2.39 2.37 2.36 2.34 2.33 2.32 2.31 2.30 2.30 2.29 2.28 2.28 2.27 2.27 2.26 2.26 2.26 2.25 2.25 2.25 2.24 2.24 2.24 5 2.98 2.83 2.73 2.66 2.60 2.56 2.53 2.50 2.48 2.46 2.44 2.43 2.42 2.41 2.40 2.39 2.38 2.37 2.37 2.36 2.36 2,35 2.35 2.34 2.34 2.33 2.33 2.33 2.32 2.32 2.32 6 3.08 2.92 2.81 2.74 2.68 2.64 2.60 2.58 2.55 2.53 2.51 2.50 2.49 2.48 2.46 2.46 2.45 2.44 2.43 2.43 .2.42 .2.42 2.41 2.41 2.40 2.40 2.39 2.39 2,39 2.38 2.38 7 3.16 3,00 2.89 2.81 2.75 2.70 2.67 2.64 2.61 2.59 2.57 2.56 2.54 2.53 2,52 2.51 2.50 2.50 2.49 2.48 2.48 2.47 2.46 2.46 2.46 2.45 2.45 2.44 2.44 2.44 2.43 8 3.24 3.06 2.95 2.87 2.81 2.76 2.72 2.69 2.66 2.64 2.62 2.61 2.59 2.58 2.57 2.56 2.55 2.54 2.53 2.53 .2.52 2.52 2.51 2.51 2.50 2.50 2.49 2.49 2.49 2.48 2.48 9 3.30 3.12 3.00 2.92 2.86 2.81 2.77 2.73 2.71 2,69 2.67 2.65 2.63 2.62 2.6] 2.60 2.59 2.58 2.57 2.57 2.56 2.56 2.55 '2.54 2.54 2.54 2.53 2.53 2.52 2.52 2,52 10 3.36 3.17 3.05 2.96 2.90 2.85 2.81 2.77 2.75 2.72 2.71 2.69 2,67 2.66 2.65 2.64 2.63 2.62 2.61 2.60 ,2.60 2.59 2.58 2.58 2.57 ~.57 ~.57 2.50 ,2.56 2-.56 2.55

Table 1 (Continued)  α = 0.05, ρ = 0.7
Rows: n = 5(1)35; the ten columns p = 1, …, 10 follow in order, one per line.

p = 1: 2.01 1.94 1.89 1.86 1.83 1.81 1.79 1.78 1.77 1.76 1.75 1.75 1.74 1.73 1.73 1.72 1.72 1.72 1.71 1.71 1.71 1.71 1.70 1.70 1.70 1.70 1.70 1.69 1.69 1.69 1.69

p = 2: 2.37 2.27 2.21 2.16 2.12 2.10 2.08 2.06 2.04 2.03 2.02 2.01 2.00 1.99 1.99 1.98 1.98 1.97 1.97 1.96 1.96 1.96 1.95 1.95 1.95 1.94 1.94 1.94 1.94 1.94 1.93

p = 3: 2.56 2.45 2.38 2.32 2.28 2.25 2.23 2.21 2.19 2.17 2.16 2.15 2.14 2.13 2.13 2.12 2.11 2.11 2.10 2.10 2.09 2.09 2.09 2.08 2.08 2.08 2.08 2.07 2.07 2.07 2.07

p = 4: 2.70 2.57 2.49 2.43 2.39 2.36 2.33 2.31 2.29 2.27 2.26 2.25 2.24 2.23 2.22 2.21 2.20 2.20 2.19 2.19 2.18 2.18 2.18 2.17 2.17 2.17 2.16 2.16 2.16 2.16 2.15

p = 5: 2.80 2.67 2.58 2.52 2.47 2.43 2.40 2.38 2.36 2.34 2.33 2.32 2.31 2.30 2.29 2.28 2.27 2.27 2.26 2.26 2.25 2.25 2.24 2.24 2.24 2.23 2.23 2.23 2.22 2.22 2.22

p = 6: 2.88 2.74 2.65 2.58 2.53 2.50 2.47 2.44 2.42 2.40 2.39 2.37 2.36 2.35 2.34 2.34 2.33 2.32 2.32 2.31 2.31 2.30 2.30 2.29 2.29 2.29 2.28 2.28 2.28 2.27 2.27

p = 7: 2.94 2.80 2.71 2.64 2.59 2.55 2.52 2.49 2.47 2.45 2.43 2.42 2.41 2.40 2.39 2.38 2.37 2.37 2.36 2.35 2.35 2.34 2.34 2.34 2.33 2.33 2.33 2.32 2.32 2.32 2.32

p = 8: 3.00 2.85 2.75 2.68 2.63 2.59 2.56 2.53 2.51 2.49 2.47 2.46 2.45 2.44 2.43 2.42 2.41 2.40 2.40 2.39 2.39 2.38 2.38 2.37 2.37 2.37 2.36 2.36 2.36 2.35 2.35

p = 9: 3.05 2.90 2.80 2.72 2.67 2.63 2.59 2.57 2.54 2.53 2.51 2.50 2.48 2.47 2.46 2.45 2.44 2.44 2.43 2.42 2.42 2.41 2.41 2.40 2.40 2.40 2.39 2.39 2.39 2.38 2.38

p = 10: 3.09 2.94 2.83 2.76 2.71 2.66 2.63 2.60 2.58 2.56 2.54 2.52 2.51 2.50 2.49 2.48 2.47 2.47 2.46 2.45 2.45 2.44 2.44 2.43 2.43 2.43 2.42 2.42 2.42 2.41 2.41

Table 1 (Continued)  α = 0.05, ρ = 0.9
Rows: n = 5(1)35; the ten columns p = 1, …, 10 follow in order, one per line.

p = 1: 2.01 1.94 1.89 1.86 1.83 1.81 1.79 1.78 1.77 1.76 1.75 1.75 1.74 1.73 1.73 1.72 1.72 1.72 1.71 1.71 1.71 1.71 1.70 1.70 1.70 1.70 1.70 1.69 1.69 1.69 1.69

p = 2: 2.24 2.15 2.09 2.05 2.02 2.00 1.98 1.96 1.95 1.93 1.93 1.92 1.91 1.90 1.90 1.89 1.89 1.88 1.88 1.88 1.87 1.87 1.87 1.86 1.86 1.86 1.86 1.85 1.85 1.85 1.85

p = 3: 2.36 2.26 2.20 2.15 2.12 2.09 2.07 2.05 2.04 2.03 2.01 2.01 2.00 1.99 1.98 1.98 1.97 1.97 1.96 1.96 1.96 1.95 1.95 1.95 1.95 1.94 1.94 1.94 1.94 1.93 1.93

p = 4: 2.44 2.34 2.27 2.22 2.18 2.16 2.13 2.12 2.10 2.09 2.08 2.07 2.06 2.05 2.04 2.04 2.03 2.03 2.02 2.02 2.01 2.01 2.01 2.00 2.00 2.00 2.00 1.99 1.99 1.99 1.99

p = 5: 2.50 2.39 2.32 2.27 2.23 2.20 2.18 2.16 2.14 2.13 2.12 2.11 2.10 2.09 2.09 2.08 2.07 2.07 2.06 2.06 2.06 2.05 2.05 2.05 2.04 2.04 2.04 2.04 2.03 2.03 2.03

p = 6: 2.54 2.43 2.36 2.31 2.27 2.24 2.22 2.20 2.18 2.17 2.15 2.14 2.13 2.13 2.12 2.11 2.11 2.10 2.10 2.09 2.09 2.09 2.08 2.08 2.08 2.07 2.07 2.07 2.07 2.06 2.06

p = 7: 2.58 2.47 2.40 2.34 2.30 2.27 2.25 2.23 2.21 2.19 2.18 2.17 2.16 2.15 2.15 2.14 2.13 2.13 2.12 2.12 2.12 2.11 2.11 2.11 2.10 2.10 2.10 2.09 2.09 2.09 2.09

p = 8: 2.61 2.50 2.42 2.37 2.33 2.30 2.27 2.25 2.23 2.22 2.21 2.19 2.19 2.18 2.17 2.16 2.16 2.15 2.15 2.14 2.14 2.13 2.13 2.13 2.12 2.12 2.12 2.12 2.11 2.11 2.11

p = 9: 2.64 2.53 2.45 2.39 2.35 2.32 2.29 2.27 2.25 2.24 2.23 2.22 2.21 2.20 2.19 2.18 2.18 2.17 2.17 2.16 2.16 2.15 2.15 2.15 2.14 2.14 2.14 2.14 2.13 2.13 2.13

p = 10: 2.67 2.55 2.47 2.41 2.37 2.34 2.31 2.29 2.27 2.26 2.25 2.23 2.22 2.22 2.21 2.20 2.19 2.19 2.18 2.18 2.17 2.17 2.17 2.16 2.16 2.16 2.15 2.15 2.15 2.15 2.15

Table 1 (Continued)  α = 0.01, ρ = 0.0
Rows: n = 5(1)35; columns: p = 1, …, 10.
p = 1: 3.36 3.14 3.00 2.90 2.82 2.76 2.72 2.68 2.65 2.62 2.60 2.58 2.57 2.55 2.54 2.53 2.52 2.51 2.50 2.49 2.48 2.48 2.47 2.47 2.46 2.46 2.45 2.45 2.44 2.44 2.44
p = 2: 4.00 3.68 3.48 3.34 3.24 3.16 3.10 3.05 3.00 2.97 2.94 2.92 2.89 2.87 2.86 2.84 2.83 2.81 2.80 2.79 2.78 2.77 2.77 2.76 2.75 2.75 2.74 2.73 2.73 2.72 2.72
p = 3: 4.39 4.01 3.77 3.61 3.49 3.39 3.32 3.26 3.21 3.17 3.14 3.11 3.08 3.06 3.04 3.02 3.00 2.99 2.97 2.96 2.95 2.94 2.93 2.93 2.92 2.91 2.90 2.90 2.89 2.89 2.88
p = 4: 4.67 4.25 3.98 3.80 3.66 3.56 3.48 3.41 3.36 3.31 3.28 3.24 3.21 3.19 3.16 3.14 3.13 3.11 3.10 3.08 3.07 3.06 3.05 3.04 3.03 3.02 3.02 3.01 3.00 2.99 2.99
p = 5: 4.90 4.44 4.15 3.95 3.80 3.69 3.60 3.53 3.47 3.42 3.38 3.35 3.32 3.29 3.26 3.24 3.22 3.20 3.19 3.17 3.16 3.15 3.14 3.13 3.12 3.11 3.10 3.10 3.09 3.08 3.08
p = 6: 5.08 4.59 4.28 4.07 3.91 3.79 3.70 3.63 3.56 3.51 3.47 3.43 3.40 3.37 3.34 3.32 3.30 3.28 3.27 3.25 3.24 3.22 3.21 3.20 3.19 3.18 3.17 3.16 3.16 3.15 3.14
p = 7: 5.24 4.73 4.40 4.17 4.01 3.89 3.79 3.71 3.64 3.59 3.54 3.50 3.47 3.44 3.41 3.38 3.36 3.34 3.33 3.31 3.30 3.29 3.27 3.26 3.25 3.24 3.23 3.22 3.21 3.21 3.20
p = 8: 5.38 4.84 4.50 4.26 4.09 3.96 3.86 3.78 3.71 3.65 3.60 3.56 3.53 3.50 3.47 3.44 3.42 3.40 3.38 3.36 3.35 3.34 3.32 3.31 3.30 3.29 3.28 3.27 3.27 3.26 3.25
p = 9: 5.50 4.94 4.59 4.34 4.17 4.03 3.93 3.84 3.77 3.71 3.66 3.62 3.58 3.55 3.52 3.49 3.47 3.45 3.43 3.41 3.40 3.38 3.37 3.36 3.35 3.34 3.33 3.32 3.31 3.30 3.30
p = 10: 5.61 5.03 4.67 4.42 4.23 4.09 3.98 3.90 3.82 3.76 3.71 3.67 3.63 3.59 3.56 3.54 3.51 3.49 3.47 3.46 3.44 3.42 3.41 3.40 3.39 3.37 3.36 3.36 3.35 3.34 3.33

Table 1 (Continued)  α = 0.01, ρ = 0.2
Rows: n = 5(1)35; columns: p = 1, …, 10.
p = 1: 3.36 3.14 3.00 2.90 2.82 2.76 2.72 2.68 2.65 2.62 2.60 2.58 2.57 2.55 2.54 2.53 2.52 2.51 2.50 2.49 2.48 2.48 2.47 2.47 2.46 2.46 2.45 2.45 2.44 2.44 2.44
p = 2: 3.97 3.66 3.47 3.33 3.23 3.15 3.09 3.04 3.00 2.96 2.93 2.91 2.89 2.87 2.85 2.83 2.82 2.81 2.80 2.79 2.78 2.77 2.76 2.75 2.75 2.74 2.74 2.73 2.73 2.72 2.72
p = 3: 4.34 3.97 3.74 3.58 3.46 3.37 3.30 3.24 3.20 3.16 3.12 3.10 3.07 3.05 3.03 3.01 2.99 2.98 2.97 2.95 2.94 2.93 2.93 2.92 2.91 2.90 2.90 2.89 2.88 2.88 2.87
p = 4: 4.60 4.20 3.94 3.76 3.63 3.53 3.45 3.39 3.34 3.30 3.26 3.22 3.20 3.17 3.15 3.13 3.12 3.10 3.09 3.07 3.06 3.05 3.04 3.03 3.02 3.01 3.01 3.00 2.99 2.99 2.98
p = 5: 4.80 4.37 4.10 3.90 3.76 3.66 3.57 3.50 3.45 3.40 3.36 3.33 3.30 3.27 3.25 3.22 3.21 3.19 3.17 3.16 3.15 3.14 3.13 3.12 3.11 3.10 3.09 3.09 3.08 3.07 3.06
p = 6: 4.97 4.51 4.22 4.02 3.87 3.76 3.67 3.59 3.54 3.49 3.44 3.41 3.37 3.35 3.32 3.30 3.28 3.26 3.25 3.23 3.22 3.21 3.20 3.19 3.18 3.17 3.16 3.15 3.14 3.14 3.13
p = 7: 5.12 4.63 4.33 4.11 3.96 3.84 3.75 3.67 3.61 3.56 3.51 3.48 3.44 3.41 3.39 3.36 3.34 3.33 3.31 3.30 3.28 3.27 3.26 3.24 3.23 3.22 3.22 3.21 3.20 3.19 3.19
p = 8: 5.24 4.74 4.42 4.20 4.04 3.91 3.82 3.74 3.67 3.62 3.57 3.53 3.50 3.47 3.44 3.42 3.40 3.38 3.36 3.35 3.33 3.32 3.31 3.30 3.29 3.28 3.27 3.26 3.25 3.24 3.23
p = 9: 5.35 4.83 4.50 4.27 4.11 3.98 3.88 3.80 3.73 3.68 3.63 3.59 3.55 3.52 3.49 3.47 3.45 3.43 3.41 3.39 3.38 3.36 3.35 3.34 3.33 3.32 3.31 3.30 3.29 3.29 3.28
p = 10: 5.44 4.91 4.57 4.34 4.17 4.04 3.93 3.85 3.78 3.72 3.68 3.63 3.60 3.56 3.54 3.51 3.49 3.47 3.45 3.43 3.42 3.40 3.39 3.38 3.37 3.36 3.35 3.34 3.33 3.32 3.32

Table 1 (Continued)  α = 0.01, ρ = 0.4
Rows: n = 5(1)35; columns: p = 1, …, 10.
p = 1: 3.36 3.14 3.00 2.90 2.82 2.76 2.72 2.68 2.65 2.62 2.60 2.58 2.57 2.55 2.54 2.53 2.52 2.51 2.50 2.49 2.48 2.48 2.47 2.47 2.46 2.46 2.45 2.45 2.44 2.44 2.44
p = 2: 3.93 3.63 3.44 3.30 3.20 3.13 3.07 3.02 2.98 2.95 2.92 2.89 2.87 2.85 2.84 2.82 2.81 2.80 2.78 2.77 2.77 2.76 2.75 2.74 2.74 2.73 2.72 2.72 2.72 2.71 2.71
p = 3: 4.26 3.92 3.69 3.54 3.43 3.34 3.27 3.21 3.17 3.13 3.10 3.07 3.05 3.02 3.00 2.99 2.97 2.96 2.95 2.94 2.93 2.92 2.91 2.90 2.89 2.89 2.88 2.87 2.87 2.86 2.86
p = 4: 4.50 4.12 3.87 3.71 3.58 3.49 3.41 3.35 3.30 3.26 3.22 3.19 3.17 3.14 3.12 3.10 3.09 3.07 3.06 3.05 3.03 3.02 3.01 3.01 3.00 2.99 2.98 2.98 2.97 2.96 2.96
p = 5: 4.68 4.27 4.01 3.83 3.70 3.60 3.52 3.46 3.40 3.36 3.32 3.29 3.26 3.23 3.21 3.19 3.17 3.16 3.14 3.13 3.12 3.11 3.10 3.09 3.08 3.07 3.06 3.06 3.05 3.04 3.04
p = 6: 4.83 4.40 4.13 3.94 3.80 3.69 3.61 3.54 3.48 3.44 3.40 3.36 3.33 3.31 3.28 3.26 3.24 3.23 3.21 3.20 3.18 3.17 3.16 3.15 3.14 3.14 3.13 3.12 3.11 3.11 3.10
p = 7: 4.95 4.51 4.22 4.02 3.88 3.77 3.68 3.61 3.55 3.50 3.46 3.43 3.39 3.37 3.34 3.32 3.30 3.29 3.27 3.26 3.24 3.23 3.22 3.21 3.20 3.19 3.18 3.17 3.17 3.16 3.15
p = 8: 5.06 4.60 4.30 4.10 3.95 3.83 3.74 3.67 3.61 3.56 3.52 3.48 3.45 3.42 3.39 3.37 3.35 3.33 3.32 3.30 3.29 3.28 3.27 3.26 3.25 3.24 3.23 3.22 3.21 3.20 3.20
p = 9: 5.15 4.68 4.37 4.16 4.01 3.89 3.80 3.72 3.66 3.61 3.56 3.53 3.50 3.47 3.44 3.42 3.40 3.38 3.36 3.35 3.33 3.32 3.31 3.30 3.29 3.28 3.27 3.26 3.25 3.25 3.24
p = 10: 5.23 4.75 4.44 4.22 4.06 3.94 3.85 3.77 3.71 3.65 3.61 3.57 3.54 3.51 3.48 3.46 3.44 3.42 3.40 3.38 3.37 3.36 3.34 3.33 3.32 3.31 3.31 3.30 3.29 3.28 3.28

Table 1 (Continued)  α = 0.01, ρ = 0.5
Rows: n = 5(1)35; columns: p = 1, …, 10.
p = 1: 3.36 3.14 3.00 2.90 2.82 2.76 2.72 2.68 2.65 2.62 2.60 2.58 2.57 2.55 2.54 2.53 2.52 2.51 2.50 2.49 2.48 2.48 2.47 2.47 2.46 2.46 2.45 2.45 2.44 2.44 2.44
p = 2: 3.90 3.61 3.42 3.29 3.19 3.11 3.06 3.01 2.97 2.93 2.91 2.88 2.86 2.84 2.83 2.81 2.80 2.79 2.77 2.77 2.76 2.75 2.74 2.73 2.73 2.72 2.72 2.71 2.71 2.70 2.70
p = 3: 4.21 3.88 3.66 3.51 3.40 3.31 3.25 3.19 3.15 3.11 3.08 3.05 3.03 3.01 2.99 2.97 2.96 2.94 2.93 2.92 2.91 2.90 2.89 2.88 2.88 2.87 2.86 2.86 2.85 2.85 2.84
p = 4: 4.43 4.06 3.83 3.66 3.54 3.45 3.38 3.32 3.27 3.23 3.20 3.17 3.14 3.12 3.10 3.08 3.07 3.05 3.04 3.02 3.01 3.00 2.99 2.99 2.98 2.97 2.96 2.96 2.95 2.95 2.94
p = 5: 4.60 4.21 3.96 3.78 3.66 3.56 3.48 3.42 3.37 3.32 3.29 3.26 3.23 3.20 3.18 3.16 3.15 3.13 3.12 3.11 3.10 3.08 3.07 3.06 3.06 3.05 3.04 3.03 3.03 3.02 3.01
p = 6: 4.73 4.32 4.06 3.88 3.75 3.64 3.56 3.50 3.44 3.40 3.36 3.33 3.30 3.27 3.25 3.23 3.21 3.20 3.18 3.17 3.16 3.15 3.14 3.13 3.12 3.11 3.10 3.10 3.09 3.08 3.08
p = 7: 4.85 4.42 4.15 3.96 3.82 3.72 3.63 3.56 3.51 3.46 3.42 3.39 3.36 3.33 3.31 3.29 3.27 3.25 3.24 3.22 3.21 3.20 3.19 3.18 3.17 3.16 3.15 3.15 3.14 3.13 3.13
p = 8: 4.94 4.51 4.22 4.03 3.89 3.78 3.69 3.62 3.56 3.51 3.47 3.44 3.41 3.38 3.36 3.34 3.32 3.30 3.28 3.27 3.26 3.24 3.23 3.22 3.21 3.20 3.20 3.19 3.18 3.17 3.17
p = 9: 5.03 4.58 4.29 4.09 3.94 3.83 3.74 3.67 3.61 3.56 3.52 3.48 3.45 3.42 3.40 3.38 3.36 3.34 3.32 3.31 3.30 3.29 3.27 3.26 3.25 3.24 3.23 3.23 3.22 3.21 3.21
p = 10: 5.11 4.64 4.35 4.14 3.99 3.88 3.79 3.71 3.65 3.60 3.56 3.52 3.49 3.46 3.44 3.41 3.39 3.38 3.36 3.34 3.33 3.32 3.31 3.30 3.29 3.28 3.27 3.26 3.25 3.25 3.24

Table 1 (Continued)  α = 0.01, ρ = 0.7
Rows: n = 5(1)35; columns: p = 1, …, 10.
p = 1: 3.36 3.14 3.00 2.90 2.82 2.76 2.72 2.68 2.65 2.62 2.60 2.58 2.57 2.55 2.54 2.53 2.52 2.51 2.50 2.49 2.48 2.48 2.47 2.47 2.46 2.46 2.45 2.45 2.44 2.44 2.44
p = 2: 3.82 3.54 3.36 3.23 3.14 3.07 3.01 2.96 2.93 2.90 2.87 2.84 2.82 2.81 2.79 2.78 2.76 2.75 2.74 2.73 2.72 2.72 2.71 2.70 2.70 2.69 2.69 2.68 2.68 2.67 2.67
p = 3: 4.07 3.76 3.56 3.42 3.31 3.23 3.17 3.12 3.08 3.04 3.01 2.99 2.97 2.95 2.93 2.91 2.90 2.89 2.88 2.87 2.86 2.85 2.84 2.83 2.82 2.82 2.81 2.81 2.80 2.79 2.79
p = 4: 4.25 3.91 3.70 3.54 3.43 3.35 3.28 3.23 3.18 3.15 3.11 3.09 3.06 3.04 3.02 3.01 2.99 2.98 2.97 2.95 2.94 2.94 2.93 2.92 2.91 2.91 2.90 2.89 2.89 2.88 2.88
p = 5: 4.38 4.03 3.80 3.64 3.53 3.44 3.37 3.31 3.26 3.22 3.19 3.16 3.14 3.11 3.10 3.08 3.06 3.05 3.03 3.02 3.01 3.00 2.99 2.98 2.98 2.97 2.96 2.96 2.95 2.94 2.94
p = 6: 4.48 4.12 3.88 3.72 3.60 3.51 3.43 3.37 3.33 3.29 3.25 3.22 3.19 3.17 3.15 3.13 3.12 3.10 3.09 3.08 3.07 3.06 3.05 3.04 3.03 3.02 3.01 3.01 3.00 3.00 2.99
p = 7: 4.57 4.19 3.95 3.78 3.66 3.56 3.49 3.43 3.38 3.34 3.30 3.27 3.24 3.22 3.20 3.18 3.16 3.15 3.13 3.12 3.11 3.10 3.09 3.08 3.07 3.07 3.06 3.05 3.04 3.04 3.03
p = 8: 4.65 4.26 4.01 3.84 3.71 3.61 3.54 3.47 3.42 3.38 3.34 3.31 3.28 3.26 3.24 3.22 3.20 3.18 3.17 3.16 3.15 3.14 3.13 3.12 3.11 3.10 3.10 3.09 3.08 3.08 3.07
p = 9: 4.71 4.32 4.06 3.89 3.75 3.66 3.58 3.51 3.46 3.42 3.38 3.35 3.32 3.29 3.27 3.25 3.23 3.22 3.20 3.19 3.18 3.17 3.16 3.15 3.14 3.13 3.13 3.12 3.11 3.11 3.10
p = 10: 4.77 4.37 4.11 3.93 3.79 3.69 3.61 3.55 3.50 3.45 3.41 3.38 3.35 3.33 3.30 3.28 3.27 3.25 3.23 3.22 3.21 3.20 3.19 3.18 3.17 3.16 3.15 3.15 3.14 3.13 3.13

Table 1 (Continued)  α = 0.01, ρ = 0.9
Rows: n = 5(1)35; columns: p = 1, …, 10.
p = 1: 3.36 3.14 3.00 2.90 2.82 2.76 2.72 2.68 2.65 2.62 2.60 2.58 2.57 2.55 2.54 2.53 2.52 2.51 2.50 2.49 2.48 2.48 2.47 2.47 2.46 2.46 2.45 2.45 2.44 2.44 2.44
p = 2: 3.66 3.40 3.23 3.12 3.03 2.97 2.91 2.87 2.84 2.81 2.78 2.76 2.74 2.72 2.71 2.70 2.69 2.67 2.67 2.66 2.65 2.64 2.63 2.63 2.62 2.62 2.61 2.61 2.60 2.60 2.59
p = 3: 3.82 3.54 3.36 3.24 3.14 3.07 3.02 2.97 2.94 2.90 2.88 2.85 2.83 2.82 2.80 2.79 2.77 2.76 2.75 2.74 2.73 2.73 2.72 2.71 2.71 2.70 2.70 2.69 2.69 2.68 2.68
p = 4: 3.92 3.63 3.44 3.32 3.22 3.15 3.09 3.04 3.00 2.97 2.94 2.92 2.90 2.88 2.86 2.85 2.83 2.82 2.81 2.80 2.79 2.78 2.78 2.77 2.76 2.76 2.75 2.75 2.74 2.74 2.73
p = 5: 4.00 3.70 3.51 3.37 3.28 3.20 3.14 3.09 3.05 3.02 2.99 2.96 2.94 2.92 2.91 2.89 2.88 2.87 2.85 2.84 2.83 2.83 2.82 2.81 2.80 2.80 2.79 2.79 2.78 2.78 2.77
p = 6: 4.06 3.75 3.56 3.42 3.32 3.24 3.18 3.13 3.09 3.05 3.02 3.00 2.98 2.96 2.94 2.93 2.91 2.90 2.89 2.88 2.87 2.86 2.85 2.84 2.84 2.83 2.83 2.82 2.81 2.81 2.80
p = 7: 4.11 3.80 3.60 3.46 3.35 3.28 3.21 3.16 3.12 3.09 3.06 3.03 3.01 2.99 2.97 2.95 2.94 2.93 2.92 2.91 2.90 2.89 2.88 2.87 2.87 2.86 2.85 2.85 2.84 2.84 2.83
p = 8: 4.16 3.83 3.63 3.49 3.39 3.31 3.24 3.19 3.15 3.11 3.08 3.06 3.03 3.01 2.99 2.98 2.96 2.95 2.94 2.93 2.92 2.91 2.90 2.90 2.89 2.88 2.88 2.87 2.86 2.86 2.85
p = 9: 4.19 3.87 3.66 3.52 3.41 3.33 3.27 3.21 3.17 3.14 3.10 3.08 3.05 3.03 3.02 3.00 2.98 2.97 2.96 2.95 2.94 2.93 2.92 2.92 2.91 2.90 2.90 2.89 2.89 2.88 2.88
p = 10: 4.23 3.89 3.69 3.55 3.43 3.35 3.29 3.24 3.19 3.15 3.12 3.10 3.07 3.05 3.03 3.02 3.00 2.99 2.98 2.97 2.96 2.95 2.94 2.93 2.93 2.92 2.91 2.91 2.90 2.90 2.89
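The layout of Table 1 suggests it gives upper α points of max(t_1, ..., t_p) for p equicorrelated Student t variables with a common denominator: for p = 1 the entries reproduce the one-sided Student t point (e.g. 2.01 is the 95% point of t with n = 5 degrees of freedom) independently of ρ. Assuming that reading, entries of this kind can be checked by simulation; the sketch below is illustrative only (the function name and the equicorrelation construction are ours, not the author's algorithm).

    import numpy as np

    def max_t_quantile(p, n, rho, alpha=0.05, reps=400_000, seed=0):
        """Monte Carlo upper-alpha point of max(t_1, ..., t_p), assuming the
        t_i share one chi-square denominator and their numerators are
        equicorrelated normals with correlation rho (rho >= 0)."""
        rng = np.random.default_rng(seed)
        z0 = rng.standard_normal(reps)                   # shared normal component
        z = rng.standard_normal((reps, p))               # independent components
        num = np.sqrt(rho) * z0[:, None] + np.sqrt(1.0 - rho) * z
        s = np.sqrt(rng.chisquare(n, reps) / n)          # common Studentizing factor
        return np.quantile((num / s[:, None]).max(axis=1), 1.0 - alpha)

    print(round(max_t_quantile(p=1, n=5, rho=0.4), 2))   # ~2.01, cf. the p = 1 column

Used this way the simulation error is a few units in the second decimal, so it serves as a sanity check on the tabulated values rather than a replacement for them.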

(Pages 801-820 of the original carry the intervening tables, presumably Tables 2 and 3, printed in rotated orientation; the scan of these pages is illegible and their values are not recoverable here.)

Table 4  Percentage Points of the Distribution of the Range
n = 2(1)20, 22(2)40, 50(10)100:
α = 0.05: 2.772 3.314 3.633 3.858 4.030 4.170 4.286 4.387 4.474 4.552 4.622 4.685 4.743 4.796 4.845 4.891 4.934 4.974 5.012 5.081 5.144 5.201 5.253 5.301 5.346 5.388 5.427 5.463 5.498 5.646 5.764 5.863 5.947 6.020 6.085
α = 0.01: 3.643 4.120 4.403 4.603 4.757 4.882 4.987 5.078 5.157 5.227 5.290 5.348 5.400 5.448 5.493 5.535 5.574 5.611 5.645 5.709 5.766 5.818 5.866 5.911 5.952 5.990 6.026 6.060 6.092 6.228 6.338 6.429 6.507 6.575 6.636
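The entries of Table 4 follow from the classical formula for the distribution of the range W of n independent N(0,1) observations, P(W <= w) = n * Integral phi(u) [Phi(u+w) - Phi(u)]^(n-1) du, which can be inverted numerically. The sketch below is our own illustration of that inversion, not the author's program:

    from scipy import stats, integrate, optimize

    def range_cdf(w, n):
        """P(W <= w) for the range W of n iid standard normal observations."""
        phi, Phi = stats.norm.pdf, stats.norm.cdf
        integrand = lambda u: n * phi(u) * (Phi(u + w) - Phi(u)) ** (n - 1)
        value, _ = integrate.quad(integrand, -8.0, 8.0)
        return value

    def range_point(alpha, n):
        """Upper-alpha percentage point of the range, as tabulated in Table 4."""
        return optimize.brentq(lambda w: range_cdf(w, n) - (1.0 - alpha), 0.1, 20.0)

    print(round(range_point(0.05, 3), 3))   # 3.314, the n = 3 entry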


Table 5  Percentage Points of the Distribution of the Studentized Range
(Tabulated for α = 0.05 and α = 0.01, with columns indexed by the number of means k, running from 2 up to 100, and rows ν = 1(1)20, 24, 30, 40, 60, 120, ∞. The rotated pages of this table scanned with the columns interleaved and garbled, so the individual entries are not reproduced here.)
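Percentage points of the studentized range are available directly in modern statistical software, so values of the kind tabulated here can be regenerated as needed; for example, with SciPy (version 1.7 or later):

    from scipy.stats import studentized_range

    # Upper 5% point q(k, nu) for k = 3 means and nu = 10 degrees of freedom;
    # classical tables give 3.88.
    print(round(studentized_range.ppf(0.95, k=3, df=10), 2))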


Table 6  Percentage Points of the Bivariate Chi-Square Distribution
Rows: n = 5(1)15, 20(5)40. The columns correspond to values of the correlation ρ, but most of the column labels were lost in scanning; the columns are reproduced below in scanned order, with the printed label where one survives. Columns beginning at n = 6 lost their n = 5 entry.

α = 0.05:
(ρ label lost, from n = 6): 7.054 7.831 8.588 9.328 10.051 10.768 11.472 12.165 12.856 13.535 16.859 20.083 23.237 26.337 29.390
ρ = .6: 6.166 6.967 7.740 8.492 9.230 9.953 10.662 11.364 12.059 12.744 13.423 16.736 19.946 23.092 26.182 29.232
ρ = .9: 6.026 6.819 7.584 8.329 9.000 9.770 10.485 11.179 11.867 12.550 13.224 16.512 19.704 22.638 25.914 28.946
(ρ label lost): 6.401 7.207 7.988 8.748 9.491 10.220 10.941 11.648 12.346 13.038 13.722 17.059 20.296 23.461 26.572 29.639
(ρ label lost): 6.391 7.202 7.983 8.742 9.485 10.213 10.933 11.640 12.338 13.029 13.715 17.051 20.289 23.453 26.563 29.632
(ρ label lost): 6.382 7.192 7.971 8.732 9.474 10.205 10.921 11.631 12.328 13.020 13.702 17.041 20.274 23.440 26.549 29.616
(ρ label lost, from n = 6): 7.173 7.954 8.713 9.457 10.186 10.905 11.610 12.310 13.000 13.685 17.019 20.255 23.419 26.526 29.591
(ρ label lost, from n = 6): 7.149 7.927 8.684 9.429 10.160 10.876 11.584 12.278 12.972 13.654 16.988 20.222 23.380 26.488 29.553
(ρ label lost): 6.307 7.113 7.889 8.649 9.387 10.115 10.832 11.537 12.233 12.924 13.607 16.938 20.166 23.324 26.427 29.488

α = 0.01 (the labels .7, .8 and .9 are printed near the last columns, but their assignment is not certain from the scan):
ρ = .9(?): 8.075 8.967 9.824 10.655 11.464 12.255 13.032 13.797 14.551 15.294 16.030 19.599 23.039 26.390 29.671 (n = 40 entry lost)
(ρ label lost): 8.371 9.271 10.135 10.976 11.793 12.592 13.382 14.152 14.913 15.666 16.411 19.995 23.462 26.835 30.140 33.392
(ρ label lost): 8.371 9.271 10.139 10.980 11.789 12.588 13.372 14.146 14.905 15.657 16.399 19.992 23.462 26.843 30.132 33.383
(ρ label lost): 8.368 9.264 10.131 10.968 11.788 12.592 13.367 14.141 14.902 15.654 16.395 19.992 23.453 26.838 30.131 33.373
(ρ label lost): 8.359 9.258 10.120 10.960 11.780 12.578 13.362 14.136 14.896 15.643 16.384 19.985 23.450 26.823 30.127 33.371
(ρ label lost): 8.343 9.244 10.109 10.944 11.762 12.560 13.347 14.117 14.877 15.628 16.368 19.962 23.432 26.801 30.101 33.349
The last three columns were printed interleaved; deinterleaved they read:
(first of the three): 8.318 9.220 10.082 10.919 11.736 12.536 13.322 14.091 14.849 15.599 16.341 19.936 23.397 26.768 30.068 33.312
(second): 9.175 10.040 10.878 11.691 12.488 13.271 14.042 14.801 15.551 16.290 19.883 23.343 26.709 30.007 33.247 (n = 5 entry lost)
(third): 9.102 9.964 10.799 11.616 12.412 13.190 13.959 14.719 15.464 16.203 19.788 23.244 26.604 29.895 33.133 (n = 5 entry lost)

Table 7  Bounds on Percentage Points of the Multivariate Chi-Square Distribution
Rows: n = 5(1)15, 20(5)40; each entry is followed by the parenthesized value printed with it in the original.

α = 0.05, p = 3
ρ = .1: 13.791 (.9490) 15.456 (.9490) 17.064 (.9490) 18.627 (.9490) 20.154 (.9490) 21.651 (.9490) 23.122 (.9490) 24.572 (.9490) 26.003 (.9490) 27.417 (.9490) 28.816 (.9491) 35.633 (.9491) 42.229 (.9491) 48.670 (.9491) 54.993 (.9491) 61.224 (.9491)
ρ = .3: 13.740 (.9479) 15.404 (.9480) 17.011 (.9480) 18.574 (.9481) 20.101 (.9481) 21.597 (.9481) 23.068 (.9481) 24.517 (.9481) 25.948 (.9482) 27.361 (.9482) 28.760 (.9482) 35.574 (.9482) 42.167 (.9483) 48.605 (.9483) 54.926 (.9483) 61.155 (.9483)
ρ = .5: 13.590 (.9447) 15.252 (.9448) 16.857 (.9450) 18.416 (.9451) 19.941 (.9452) 21.434 (.9452) 22.903 (.9453) 24.350 (.9453) 25.778 (.9454) 27.189 (.9454) 28.586 (.9455) 35.389 (.9456) 41.973 (.9457) 48.402 (.9458) 54.716 (.9459) 60.936 (.9459)
ρ = .7: 13.202 (.9353) 14.852 (.9357) 16.446 (.9360) 17.995 (.9362) 19.509 (.9364) 20.994 (.9366) 22.452 (.9367) 23.890 (.9368) 25.310 (.9370) 26.713 (.9371) 28.101 (.9372) 34.868 (.9376) 41.419 (.9378) 47.819 (.9380) 54.106 (.9382) 60.299 (.9383)
ρ = .8: 12.786 (.9236) 14.419 (.9241) 15.997 (.9246) 17.532 (.9249) 19.032 (.9252) 20.503 (.9255) 21.950 (.9257) 23.576 (.9259) 24.785 (.9261) 26.176 (.9262) 27.554 (.9264) 34.272 (.9270) 40.780 (.9274) 47.141 (.9277) 53.392 (.9279) 59.552 (.9281)

Table 7 (Continued)  α = 0.05, p = 4
ρ = .1: 14.490 (.9489) 16.188 (.9489) 17.827 (.9489) 19.418 (.9489) 20.972 (.9489) 22.495 (.9489) 23.990 (.9489) 25.464 (.9489) 26.917 (.9489) 28.352 (.9489) 29.772 (.9489) 36.683 (.9489) 43.362 (.9489) 49.878 (.9489) 56.271 (.9489) 62.567 (.9489)
ρ = .3: 14.422 (.9474) 16.120 (.9475) 17.759 (.9476) 19.350 (.9476) 20.903 (.9477) 22.425 (.9477) 23.920 (.9477) 25.393 (.9477) 26.846 (.9478) 28.281 (.9478) 29.699 (.9478) 36.608 (.9479) 43.284 (.9479) 49.797 (.9479) 56.187 (.9480) 62.480 (.9480)
ρ = .5: 14.210 (.9427) 15.905 (.9429) 17.541 (.9431) 19.129 (.9433) 20.680 (.9434) 22.199 (.9435) 23.692 (.9436) 25.161 (.9437) 26.611 (.9438) 28.042 (.9438) 29.459 (.9439) 36.355 (.9441) 43.019 (.9443) 49.521 (.9444) 55.902 (.9445) 62.184 (.9446)
ρ = .7: 13.574 (.9254) 15.255 (.9265) 16.876 (.9271) 18.451 (.9276) 19.988 (.9281) 21.494 (.9284) 22.973 (.9287) 24.430 (.9289) 25.868 (.9292) 27.289 (.9294) 28.694 (.9296) 35.539 (.9303) 42.156 (.9308) 48.616 (.9312) 54.980 (.9316) 61.203 (.9318)
ρ = .8: 12.725 (.8956) 14.380 (.8973) 15.976 (.8987) 17.528 (.8998) 19.043 (.9007) 20.528 (.9015) 21.988 (.9021) 23.426 (.9027) 24.846 (.9033) 26.247 (.9037) 27.636 (.9041) 34.397 (.9058) 40.942 (.9069) 47.336 (.9077) 53.618 (.9084) 59.802 (.9089)

Table 7 (Continued)  α = 0.05, p = 5
ρ = .1: 15.028 (.9488) 16.752 (.9488) 18.413 (.9488) 20.026 (.9488) 21.600 (.9488) 23.141 (.9488) 24.655 (.9488) 26.146 (.9488) 27.615 (.9488) 29.067 (.9488) 30.502 (.9488) 37.484 (.9488) 44.226 (.9488) 50.798 (.9488) 57.243 (.9488) 63.587 (.9489)
ρ = .3: 14.947 (.9471) 16.670 (.9471) 18.332 (.9472) 19.945 (.9473) 21.519 (.9473) 23.059 (.9473) 24.573 (.9474) 26.062 (.9474) 27.532 (.9474) 28.983 (.9475) 30.417 (.9475) 37.396 (.9476) 44.134 (.9476) 50.704 (.9477) 57.145 (.9477) 63.487 (.9477)
ρ = .5: 14.675 (.9408) 16.396 (.9411) 18.056 (.9414) 19.665 (.9416) 21.236 (.9418) 22.773 (.9419) 24.285 (.9421) 25.771 (.9422) 27.238 (.9423) 28.684 (.9424) 30.117 (.9425) 37.081 (.9428) 43.806 (.9430) 50.363 (.9432) 56.794 (.9433) 63.123 (.9434)
ρ = .7: 13.722 (.9126) 15.429 (.9141) 17.072 (.9153) 18.668 (.9163) 20.223 (.9171) 21.747 (.9178) 23.239 (.9182) 24.712 (.9188) 26.164 (.9192) 27.598 (.9196) 29.016 (.9200) 35.918 (.9214) 42.582 (.9223) 49.086 (.9230) 55.471 (.9236) 61.747 (.9240)
ρ = .8: 11.387 (.7789) 13.198 (.8000) 14.880 (.8121) 16.490 (.8206) 18.048 (.8269) 19.566 (.8318) 21.051 (.8358) 22.510 (.8391) 23.949 (.8420) 25.363 (.8443) 26.766 (.8465) 33.569 (.8540) 40.136 (.8588) 46.545 (.8623) 52.840 (.8650) 59.025 (.8668)

Table 7 (Continued)  α = 0.05, p = 6
ρ = .1: 15.466 (.9487) 17.209 (.9487) 18.889 (.9487) 20.518 (.9487) 22.108 (.9487) 23.665 (.9488) 25.193 (.9488) 26.698 (.9488) 28.181 (.9488) 29.645 (.9488) 31.092 (.9488) 38.130 (.9488) 44.922 (.9488) 51.539 (.9488) 58.025 (.9488) 64.408 (.9488)
ρ = .3: 15.373 (.9467) 17.116 (.9468) 18.796 (.9469) 20.426 (.9470) 22.016 (.9470) 23.572 (.9471) 25.100 (.9471) 26.604 (.9472) 28.086 (.9472) 29.550 (.9472) 30.996 (.9472) 38.032 (.9473) 44.820 (.9474) 51.434 (.9474) 57.917 (.9475) 64.297 (.9475)
ρ = .5: 15.043 (.9389) 16.784 (.9393) 18.463 (.9397) 20.089 (.9400) 21.678 (.9403) 23.230 (.9404) 24.756 (.9406) 26.255 (.9407) 27.736 (.9409) 29.195 (.9410) 30.640 (.9411) 37.659 (.9415) 44.433 (.9418) 51.033 (.9420) 57.505 (.9422) 63.870 (.9423)
ρ = .7: 13.631 (.8912) 15.371 (.8946) 17.041 (.8972) 18.659 (.8992) 20.234 (.9009) 21.774 (.9022) 23.280 (.9032) 24.766 (.9042) 26.231 (.9050) 27.677 (.9058) 29.106 (.9065) 36.058 (.9092) 42.760 (.9108) 49.298 (.9121) 55.717 (.9132) 62.016 (.9138)

Table 7 (Continued)  α = 0.05, p = 7
ρ = .1: 15.835 (.9487) 17.594 (.9487) 19.289 (.9487) 20.932 (.9487) 22.536 (.9487) 24.105 (.9487) 25.645 (.9487) 27.160 (.9487) 28.654 (.9487) 30.129 (.9487) 31.587 (.9487) 38.671 (.9487) 45.504 (.9487) 52.158 (.9487) 58.679 (.9487) 65.094 (.9487)
ρ = .3: 15.732 (.9464) 17.491 (.9465) 19.186 (.9466) 20.831 (.9467) 22.434 (.9468) 24.002 (.9468) 25.542 (.9469) 27.057 (.9469) 28.551 (.9470) 30.025 (.9470) 31.482 (.9470) 38.564 (.9471) 45.392 (.9472) 52.044 (.9473) 58.561 (.9473) 64.973 (.9473)
ρ = .5: 15.343 (.9371) 17.102 (.9376) 18.798 (.9381) 20.438 (.9384) 22.041 (.9388) 23.605 (.9390) 25.144 (.9392) 26.654 (.9394) 28.146 (.9396) 29.615 (.9397) 31.071 (.9398) 38.136 (.9403) 44.950 (.9407) 51.586 (.9409) 58.092 (.9412) 64.488 (.9413)
ρ = .7: 12.948 (.8329) 14.835 (.8489) 16.591 (.8584) 18.267 (.8648) 19.886 (.8696) 21.462 (.8733) 22.991 (.8759) 24.501 (.8784) 25.987 (.8805) 27.451 (.8823) 28.896 (.8839) 35.914 (.8898) 42.656 (.8932) 49.229 (.8957) 55.684 (.8980) 62.000 (.8991)

(Pages 832-835 of the original carry a further rotated table, presumably Table 8; the scan of these pages is illegible and its values are not recoverable here.)

Table 9  Percentage Points of the Bivariate F Distribution
Rows: m = 5(1)15, 20(5)40; columns: n = 2, 4, 6, 8, 10.

α = 0.05, ρ = ±0.1
n = 2: 7.87 6.89 6.28 5.85 5.55 5.32 5.14 4.99 4.87 4.77 4.69 4.41 4.25 4.15 4.07 4.02
n = 4: 6.69 5.78 5.20 4.81 4.53 4.31 4.14 4.01 3.90 3.80 3.72 3.46 3.31 3.22 3.15 3.10
n = 6: 6.19 5.31 4.76 4.38 4.11 3.90 3.73 3.60 3.50 3.40 3.33 3.07 2.93 2.83 2.77 2.72
n = 8: 5.90 5.05 4.51 4.13 3.87 3.66 3.50 3.37 3.27 3.18 3.11 2.86 2.71 2.62 2.55 2.51
n = 10: 5.71 4.88 4.34 3.98 3.71 3.51 3.35 3.23 3.12 3.04 2.96 2.71 2.57 2.47 2.41 2.36

α = 0.01, ρ = ±0.1
n = 2: 17.41 14.04 12.08 10.82 9.94 9.30 8.81 8.43 8.12 7.86 7.65 6.96 6.58 6.34 6.17 6.05
n = 4: 14.48 11.37 9.64 8.52 7.75 7.19 6.76 6.42 6.15 5.93 5.74 5.14 4.81 4.61 4.47 4.36
n = 6: 13.42 10.28 8.68 7.62 6.89 6.36 5.96 5.64 5.39 5.18 5.00 4.43 4.13 3.93 3.80 3.70
n = 8: 12.92 9.65 8.16 7.13 6.42 5.91 5.52 5.22 4.97 4.77 4.60 4.05 3.75 3.56 3.43 3.34
n = 10: 12.65 9.22 7.83 6.81 6.13 5.63 5.25 4.95 4.71 4.50 4.34 3.81 3.51 3.32 3.20 3.11

Table 9 (Continued)
α = 0.05, ρ = ±0.3
n = 2: 7.80 6.83 6.22 5.81 5.51 5.28 5.10 4.96 4.84 4.74 4.66 4.38 4.22 4.12 4.05 4.00
n = 4: 6.64 5.74 5.16 4.78 4.50 4.28 4.12 3.99 3.88 3.78 3.71 3.45 3.30 3.21 3.14 3.09
n = 6: 6.14 5.28 4.73 4.35 4.08 3.87 3.71 3.58 3.48 3.39 3.31 3.06 2.92 2.82 2.76 2.71
n = 8: 5.86 5.02 4.48 4.11 3.85 3.64 3.49 3.36 3.25 3.17 3.09 2.84 2.70 2.61 2.55 2.50
n = 10: 5.68 4.85 4.32 3.95 3.69 3.50 3.34 3.21 3.11 3.02 2.95 2.70 2.56 2.47 2.41 2.36

α = 0.01, ρ = ±0.3
n = 2: 17.30 13.97 12.02 10.77 9.91 9.27 8.78 8.40 8.09 7.84 7.63 6.94 6.56 6.33 6.17 6.05
n = 4: 14.38 11.30 9.59 8.48 7.72 7.16 6.73 6.40 6.13 5.91 5.73 5.13 4.81 4.60 4.46 4.36
n = 6: 13.31 10.22 8.63 7.58 6.86 6.34 5.94 5.62 5.37 5.16 4.99 4.42 4.12 3.93 3.79 3.70
n = 8: 12.80 9.60 8.11 7.09 6.39 5.89 5.50 5.20 4.95 4.75 4.58 4.04 3.74 3.56 3.42 3.33
n = 10: 12.52 9.17 7.78 6.77 6.09 5.60 5.22 4.92 4.68 4.49 4.32 3.79 3.50 3.32 3.19 3.10

Table 9 (Continued)
α = 0.05, ρ = ±0.5
n = 2: 7.62 6.69 6.10 5.70 5.41 5.19 5.02 4.88 4.76 4.67 4.59 4.32 4.17 4.07 4.00 3.95
n = 4: 6.51 5.63 5.05 4.70 4.43 4.22 4.06 3.93 3.82 3.74 3.66 3.41 3.26 3.17 3.11 3.06
n = 6: 6.05 5.19 4.66 4.29 4.03 3.82 3.67 3.54 3.44 3.35 3.28 3.03 2.89 2.80 2.74 2.69
n = 8: 5.77 4.94 4.41 4.06 3.79 3.60 3.45 3.32 3.22 3.13 3.06 2.82 2.68 2.59 2.53 2.48
n = 10: 5.60 4.78 4.26 3.91 3.65 3.45 3.30 3.18 3.07 2.99 2.92 2.68 2.54 2.45 2.39 2.34

α = 0.01, ρ = ±0.5
n = 2: 16.93 13.71 11.82 10.61 9.76 9.14 8.67 8.30 8.06 7.75 7.55 6.88 6.51 6.28 6.12 6.01
n = 4: 14.11 11.13 9.45 8.37 7.62 7.07 6.66 6.34 6.07 5.85 5.67 5.09 4.77 4.57 4.44 4.33
n = 6: 13.08 10.09 8.52 7.49 6.79 6.27 5.88 5.57 5.32 5.11 4.94 4.39 4.09 3.90 3.77 3.68
n = 8: 12.56 9.49 8.01 7.01 6.33 5.82 5.44 5.15 4.91 4.71 4.55 4.01 3.72 3.54 3.41 3.32
n = 10: 12.30 9.08 7.69 6.70 6.03 5.54 5.17 4.88 4.64 4.45 4.29 3.77 3.48 3.31 3.18 3.08

Table 9 (Continued)
α = 0.05, ρ = ±0.7
n = 2: 7.32 6.44 5.89 5.51 5.23 5.03 4.86 4.73 4.63 4.54 4.46 4.20 4.06 3.97 3.91 3.86
n = 4: 6.30 5.46 4.93 4.57 4.31 4.11 3.96 3.83 3.73 3.65 3.58 3.33 3.20 3.11 3.05 3.00
n = 6: 5.87 5.05 4.53 4.18 3.92 3.73 3.58 3.46 3.36 3.28 3.21 2.97 2.84 2.75 2.69 2.64
n = 8: 5.62 4.81 4.31 3.96 3.71 3.52 3.37 3.25 3.15 3.07 3.00 2.77 2.63 2.54 2.49 2.44
n = 10: 5.46 4.66 4.16 3.82 3.57 3.38 3.23 3.11 3.02 2.93 2.86 2.63 2.50 2.41 2.35 2.31

α = 0.01, ρ = ±0.7
n = 2: 16.36 13.29 11.49 10.33 9.52 8.93 8.48 8.12 7.83 7.60 7.40 6.76 6.41 6.18 6.03 5.92
n = 4: 13.67 10.83 9.21 8.17 7.45 6.93 6.53 6.21 5.96 5.75 5.58 5.01 4.70 4.51 4.38 4.28
n = 6: 12.66 9.84 8.32 7.33 6.64 6.15 5.76 5.47 5.23 5.03 4.87 4.33 4.04 3.86 3.73 3.64
n = 8: 12.18 9.28 7.83 6.87 6.20 5.72 5.35 5.06 4.83 4.64 4.48 3.96 3.67 3.50 3.37 3.28
n = 10: 11.93 8.90 7.52 6.57 5.92 5.44 5.08 4.80 4.57 4.38 4.23 3.72 3.44 3.26 3.15 3.06

Table 9a Exact Percentage Points of the Multivariate F Distribution a = 0 . 0 5 , r e = l , 0=0.1

n/p
5 6 7 8 9 10 1! 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35

1
6.61 5.99 5.59 5.32 5.12 4.96 4.84 4.75 4.67 4.611 4.54 4.49 4.45 4.41 4.38 4.35 4.32 4.:10 4.28 4.26 4.24 4.22 4.21 4.16 4,18 4.17 4.10 4.15 4.14 4.13 4,12

2
9.54 8.49 7 83 7.38 7.05 6,811 6.60 6.45 6.32 6.21 6.11 6,04 5.97 5.91 5.85 5,81 5.76 5..73 5.69 5.611 5.63 5 611 5.58 5.56 5.54 5.52 5.50 5.,i8 5.47 5.45 5.44

3
11.53 10.17 9.32 8.73 8.31 7.99 7.74 7.53 7..37 7.211 7.11 7.01 6,92 6.85 6,78 6.72 6.66 6.02 6.57 6.53 6.50 6.46 6.43 6.40 6.38 6.35 6.33 6.3! 6.29 6.27 6.25

4
1:1.05 11.45 10.44 9.75 9.20 8.88 8.58 8.35 8.15 7.911 7.85 7.73 7.63 7.54 7.411 7.:19 7.:13 7.27 7.22 7.17 7.13 7.09 7.116 7.112 6.99 6.96 6.94 6.91 6.89 6.87 6.85

5
14.30 12.49 11,36 1t).58 111.02 9.60 9.26 9.00 8.78 8.60 8.44 8.31 8.19 8.09 8.111 7,93 7.86 7.71) 7.7:1 7.68 7.6:1 7,59 7.55 7.51 7,48 7,45 7.42 7. :1!1 7.36 7.34 7.32

6
15,35 13.37 12.13 1|.28 10.66 10.20 9.84 9.55 9.31 9.11 8.94 8.79 8.67 8.56 8.46 8.37 8.30 8.2:1 8.16 8.11 8.05 8.01 7.96 7.92 7.88 7.85 7.82 7.79 7.76 7.73 7,71

7
16,27 14.13 12.80 11.88 il.22 10.72 10.33 10.02 9,76 9.54 9.30 9,21 9.07 8.95 8.85 8.711 8.67 8,60 8.53 8.47 8,41 8. :16 8.31 8.27 8.23 8.19 8.16 8.12 8.09 8.o7 8.114

8
17.08 14.81 13.39 12.42 11.71 11.18 10.77 10.43 10.10

9
17.81 15.42 13.92 12.89 12.15 11.59 11.16 10.80 10.52

10
18.46 15.97 14,40 13.33 12.55 11.97 11.51 11.14 10.84 10.59 10.37 10.19 10.03 I).89 9.77 9.66 9.56 0.48 9.40 9.:|3 9.26 9.20 9.14 9.09 9.05 9.00 8.96 8.92 8.89 8.85 8.82

1t,9"1 10.27 9.74 10.07 9.57 9.43 I).30 9.i9 9.09 9.00 8.92 H.85 8.79 8.73 8.67 8.62 H.58 8.53 8.49 8.46 8.42 8.39 8.36 8.33 9.90 9.74 O.fll 9.50 9.39 9,.30 9.21 9.14 9.07 9.01 8.95 8.90 8.85 8.80 8.76 8.72 8.68 8.65 8.62 8.59

Table 9a (Continued)  α = 0.05, m = 1, ρ = 0.3

[Entries for p = 1, …, 10 and n = 5, 6, …, 35; garbled in this reproduction and omitted. The p = 1 row is the same univariate F(1, n) row given above.]

Table 9a (Continued)  α = 0.05, m = 1, ρ = 0.5

[Entries garbled in this reproduction; omitted.]


Table 9a (Continued)  α = 0.05, m = 1, ρ = 0.7

[Entries garbled in this reproduction; omitted.]

Table 9a (Continued)  α = 0.05, m = 1, ρ = 0.9

[Entries garbled in this reproduction; omitted.]


Table 9a (Continued)  α = 0.01, m = 1, ρ = 0.1

Rows p = 1, …, 10; columns n = 5, 6, …, 35. The row for p = 1 coincides with the upper 1% point of F(1, n):

p = 1:  16.26 13.75 12.25 11.26 10.56 10.04 9.65 9.33 9.07 8.86 8.68 8.53 8.40 8.28 8.18 8.10 8.02 7.95 7.88 7.82 7.77 7.72 7.68 7.64 7.60 7.56 7.53 7.50 7.47 7.44 7.42

[The entries for p = 2, …, 10 are garbled in this reproduction and are omitted.]

Table 9a (Continued)  α = 0.01, m = 1, ρ = 0.3

[Entries garbled in this reproduction; omitted.]


Table 9a (Continued)  α = 0.01, m = 1, ρ = 0.5

[Entries garbled in this reproduction; omitted.]

Table 9a (Continued)  α = 0.01, m = 1, ρ = 0.7

[Entries garbled in this reproduction; omitted.]

Table 9a (Continued)  α = 0.01, m = 1, ρ = 0.9

[Entries garbled in this reproduction; omitted.]
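Percentage points of this kind can also be approximated by simulation when no tabulated entry applies. The sketch below illustrates the general technique only; it is not the computational method of this chapter, and it assumes a simple equicorrelation model in which each of the p ratios is F(1, n) with numerator normals of common correlation ρ and a shared chi-square denominator:

    # Monte Carlo sketch (illustrative model, not the chapter's algorithm):
    # upper-alpha point of max(F_1, ..., F_p), where F_i = z_i^2 / (w/n),
    # the z_i are standard normal with common correlation rho, and
    # w ~ chi-square(n) is shared by all p ratios.
    import numpy as np

    def max_f_point(p, n, rho, alpha=0.05, reps=200_000, seed=0):
        rng = np.random.default_rng(seed)
        g = rng.standard_normal((reps, 1))           # common factor
        e = rng.standard_normal((reps, p))           # idiosyncratic parts
        z = np.sqrt(rho) * g + np.sqrt(1.0 - rho) * e
        w = rng.chisquare(n, size=(reps, 1))         # shared denominator
        fmax = (z**2 / (w / n)).max(axis=1)
        return np.quantile(fmax, 1.0 - alpha)

    # Sanity check: max_f_point(1, 5, 0.1) is close to 6.61, the p = 1 entry.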

Table 10  Bounds on the Probability Integrals of the Multivariate F Distribution  α = 0.05, p = 3, ρ = 0.1

Each cell gives a critical value with, in parentheses, a bound on the corresponding probability integral; rows are n = 5(1)15, 20, 25, 30, 35, 40. For example, the first cell of this panel is 9.031 (.9343).

[The remaining entries are garbled in this reproduction and are omitted.]

Table 10 (Continued)  α = 0.05, p = 3, ρ = 0.3

[Entries garbled in this reproduction; omitted.]

Table 10 (Continued)  α = 0.05, p = 3, ρ = 0.5

[Entries garbled in this reproduction; omitted.]

Table 10 (Continued)  α = 0.05, p = 4, ρ = 0.1

[Entries garbled in this reproduction; omitted.]

Table 10 (Continued)  α = 0.05, p = 4, ρ = 0.3

[Entries garbled in this reproduction; omitted.]

Table 10 (Continued)  α = 0.05, p = 4, ρ = 0.5

[Entries garbled in this reproduction; omitted.]

Table 10 (Continued)  α = 0.05, p = 5, ρ = 0.1

[Entries garbled in this reproduction; omitted.]

Table 10 (Continued)  α = 0.05, p = 5, ρ = 0.3

[Entries garbled in this reproduction; omitted.]

Table 10 (Continued)  α = 0.05, p = 5, ρ = 0.5

[Entries garbled in this reproduction; omitted.]

Table 10 (Continued)  α = 0.05, p = 6, ρ = 0.1

[Entries garbled in this reproduction; omitted.]

Table 10 (Continued)  α = 0.05, p = 6, ρ = 0.3

[Entries garbled in this reproduction; omitted.]

Table 10 (Continued)  α = 0.05, p = 6, ρ = 0.5

[Entries garbled in this reproduction; omitted.]
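The parenthesized values in Table 10 are bounds on the joint probability attained at the tabulated critical value. One classical device for bounding such probability integrals is Kimball's product inequality, P{F1 ≤ c, …, Fp ≤ c} ≥ ∏ P{Fi ≤ c}, valid for F ratios sharing a common denominator. The sketch below evaluates the product side with SciPy, taking the column label as the numerator degrees of freedom m; it is our illustration of the idea, not a restatement of the bound actually used for these tables:

    # Product-type lower bound on a joint F probability (illustration only).
    from scipy.stats import f

    def product_bound(c, p, m, n):
        """Kimball-type lower bound: P{max_i F_i <= c} >= F_cdf(c; m, n)**p
        for p positively dependent F(m, n) ratios with a common denominator."""
        return f.cdf(c, m, n) ** p

    # Example: product_bound(9.031, 3, 2, 5) is close to the bound printed
    # beside the first entry of the p = 3, rho = 0.1 panel; the printed
    # bounds also reflect the correlation rho, so the two need not agree.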

Table 11a  Approximate Percentage Points of the T²max Statistic  α = 0.05, p = 2, β = 1, δ = 0

[Entries for n = 20(2)50, 55, 60, 65, 70, 80, 90, 100, 120, 150, 200 are garbled in this reproduction and are omitted.]

Table 11a (Continued)  α = 0.01, p = 2, β = 1, δ = 0

[Entries garbled in this reproduction; omitted.]

Table 11b  Approximate Percentage Points of the T²max Statistic  p = 2, β = (N − 1)/N, δ = −1/N

α = 0.05
[Entries, tabulated against N (from 20 up to 200) under the column headings 3, 4, …, 12, 14, are garbled in this reproduction and are omitted.]


Table 11b (Continued)  p = 2, β = (N − 1)/N, δ = −1/N

α = 0.025 and α = 0.01
[Entries garbled in this reproduction; omitted.]
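When a needed N falls between tabulated values, percentage points of this kind are conventionally interpolated linearly in 1/N rather than in N, since such points are very nearly linear in the reciprocal of the degrees of freedom. A small helper, ours for illustration:

    # Harmonic interpolation between tabulated percentage points (a common
    # convention for tables indexed by degrees of freedom; illustrative only).
    def interp_in_reciprocal(n, n_lo, v_lo, n_hi, v_hi):
        """Interpolate a table value at n from the bracketing entries
        (n_lo, v_lo) and (n_hi, v_hi), linearly in 1/n."""
        t = (1.0 / n - 1.0 / n_lo) / (1.0 / n_hi - 1.0 / n_lo)
        return v_lo + t * (v_hi - v_lo)

    # Example: a point at N = 75 from the tabulated entries at N = 70 and 80.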

Table 11c  Approximate Percentage Points of the T²max Statistic  α = 0.05, p = 2, β = 2, δ = 1

[Entries for n = 16(2)50, 55, 60, 65, 70, 80, 90, 100, 120, 150, 200 are garbled in this reproduction and are omitted.]

Table 11c (Continued)  α = 0.01, p = 2, β = 2, δ = 1

[Entries garbled in this reproduction; omitted.]


Computations of some multivariate distributions Table 13 Percentage Points of the Largest Root of the Wishart Matrix

875

p=2

0.10 3 5 7 9 :1` I 13 15 17 19 " : ~ 1 23 25 27 29 31 35 37 39 41 43 45 47 49 p=3 0.10 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 :1.3.26].6 :1.6 9 2 3 7 20.3014 23.5041 26~5852 29 5 7 5 5 32.4947 35,3560 38,1694 40,9419 43.6793 4 6 3 8 5 8 49.0651 51,7201 543532 56,9665 59,5617 62,1404 64,7038 67,2531 69,7893 72~3134 74.8260 77 3 2 8 0 9,0027 :I.2,5202 15o7229 18.7500 216613 24,4890 27,2525 29,9647 32.6348 35,2695 378738 40,45:1.8 43.0065 45.5407 48,0564 50,5554 53.0392 555092 57~9665 60o4121 6?8467 65o2713 67,6863 70,0926

0 ~ 05 10,7403 14,4887 17,8778 21,0659 24.1212 27,0800 29.9651 32,7910 35,5683 38,3047 41,0060 43.6768 46~3209 48,9411 51,5399 54. 1 1 9 4 56,6814 59,2273 61,7585 64~2761 667810 69,2743 71 ~ 7 5 6 6 74,2286

0,025 12,4160 16 ~ 3 6 0 4 J.9 ~ 9 0 9 1 23.2359 26.4153 29.~ 4 8 7 8 32o4781 35.4027 38.2731 41. 0 9 7 9 43~8836 46.6353 49,3571 52,0523 54~7235 573733 60,0034 62.6155 65~2112 67.7917 70.3580 72,9112 75,4522 77,9817

0,01 1.4o5681 18 ~ 7 3 4 8 22~4665 25~9528 29.2757 32.4797 35,5922 38~6313 41~6100 44 ~ 5 3 7 8 47,4218 50~2677 53~0801 55~8626 58,6184 61.3501 64~0597 66.7492 69,4202 72.074:1 74 ~ 7:i.22 773355 79o9451 82~5418

0.05 :1`5 2 4 0 7 19,0887 22,6240 25,9661 "..'.).9.1736 32 ~ 2 8 0 4 35~3081 38.2718 41,1819 44,0465 46.8720 49,6631 52,4239 55o1575 57,8667 60,5539 63,2208 65~8692 68,5006 71o1163 73,7173 76.3047 788793 81 4 4 2 0

0,025 17,1198 21o1270 24,7980 28~2607 31.5778 34 7 8 5 9 37,9082 40.9608 43.9554 46,9005 49.8030 52,6681 55~5001 58,3026 61.0785 63,8302 665599 69.2695 71,9605 74,6343 77,292]. 79.9350 82 ~ '.:';641 8~:;~1 8 0 0

0,01 19,5014 23,6908 27.5183 31.1206 34~5650 37. 8 9 0 8 4:1.. : 1 ` 233 44~2798 47.3730 50o4121 53~4046 56.3562 59~2716 62,1547 65.0086 67~8361 70~6394 73.4207 76,1816 78,9236 816481 84.3562 87~0491 89 7 2 7 6

876 Table 13 (Continued) p=4


0 10
17.4596 21,2019 24.6905 78.0116 31,2117 34,3:1.88 37, ~.~5:L8 40,3238 4Z.2445 46,1212 48.9597 51,7646 54~5396 57~2879 60,0121 62,711.42 65.3963 68,0600 70,7065 73,3374 75,9535 785559

P. R. Krishnaiah

005
19,6300 23,5305 27~1572 30,6028 33,9168 37,1300 40,2624 4Z o 3 2 8 4 46,3384 49,3004 52,2208 55,1044 .57,9554 60,7771 63~5724 66,3437 69,0930 71,8221 74,5325 77,2256 79,9027 825647

0,025
21,6721 257087 29.4550 33,0085 36,4219 39,7274 42,9467 46 0 9 4 8 49,1830 52o2197 55.2117 58~1643 61,0819 639680 66.8258 69,6577 72,4660 75.2525 78o0191 (30 7 6 6 9 83~4976 86,2120

0,01
24.2397 28,4331 32,3178 35,9967 39,5257 42,9390 46,2597 49,5038 .52,6835 55,8077 58.8838 61.9:!.73 64.9130 67,8747 70.8058 73,7090 76,5866 79,4408 82,2732 85,0855 87,8791 906552 93,4150

9
:L 3 15 17

19
2 :L 23 25 27 29 31 35 37 Z9 41 43

45
47 49

81.1456

85.2127

88o9113

p=5
0

10

0~05 23~9543 27 ~ 8 8 6 5 31,5764 35,0973 38,4917 41.7868 45,0012 48,1483 5;L 2 3 8 Z 54. 2 7 8 9 57,2763 60,2355 63+1606 66~ 0 5 4 9 68~9214 71,7624 74.5802 77.3765 80,1529 82,9110 85,6518 88,3766 91 0 8 6 4

().025 26,1349 30~ 1 8 7 2 33.9846 37,6040 41 ~ 0 8 9 6 44,4702 47,7654 50. 9 8 9 2 54o1524 57,2632 60~3280 63~3523 66,3402 69. 2 9 5 4 7 .......... .... ~ ..~o. Ll 75+1197 77,9934 80~8444 83,6742 86,4843 89,2762 92,0509 94,8096

0,01 28,8616 33 0 5 2 7 36o9751 40,709.1 44,3013 47,781('3 51,1715 54,485:[ 57~7342 60 9 2 7 3 64~0714 67,1721 70~2340 73 2 6 0 9 76;2[";6:[ 79,2224 82.1621 85.0774 87,9699 90,8415 93~6935 96 ~ 5 2 7 1 99 ~..~434

6 8 .1.0 12 14 .1,6 .I. 8 20 22 24 26 28 ~30 32 34 36 38 40 42 44 46 48 50

21 6 2 2 5 25 4 1 5 9 28 9 8 1 8 Z2 3 8 9 6 35 6 7 9 4 38 8 7 6 6 41 9 9 8 6 45 0 5 8 2 48 + 0 6 4 6 51. 0252 53+9458 56,8308 59,6843 62 5 0 9 2 65. 3 0 8 4 68 0 8 4 1 70 8 3 8 2 735724 76 2 8 8 2 78,9871 (31 6 7 0 0 84 3 3 8 1 86 9 9 2 3

Computations of some multivariate distributiorm

877

Table 13 (Continued)
p~6 0,10
7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 4','3 47 49

0,05 28.2351 32.1884 35,9245 39.5031 42,9608 46,3216 49.6026 52,8166 55.9729 59.0792 62,:[414 65.1645 68~1524 71.1086 74.0358 76.9367 79.8134 82,6676 85.5011 88.3152 91,1113 93.8905

0.025 30.5374 34.5987 38.4331 42,1025 45~6449 49,0857 52.4425 55~7286 58.9541 62,1268 65.2530 68.3379 71.3856 74.3998 77.3836 80.3394 83,2696 86,1760 89,0605 91,9246 94,7696 97.5968

0.01 33.4045 37,5912 41,5401 45~3155 48,9571 52,4914 55,9370 59~3079 62,6146 65.8653 69.0667 72,2243 75.3425 78,4251 81,4753 84.4958 87.4890 90,4572 93.4019 96.3249 99.2275 102.1112

25.7624 29.5910 33o2141 36.6886 40,0492 43.3186 46.5132 49.6448 52,7224 55.7531 58~7425 61.6952 64~6151 67.5052 70.3684 73.2069 76,0228 78.8177 81,5932 84,3507 87~0914 89,8163

p=7 0.10 8 I0 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 29,8857 33,7401 37.4074 40~9350 44,3531 47,6823 50,9374 54.1297 57.2678 60.3584 63.4070 66.4181 69.3955 72.3424 75,2614 78,1549 81,0249 83~8731 86,7012 89.5104 92.3021 95~0772 0.05 32,4846 36.4525 40.2240 43.8485 47,3573 50,7723 54.1091 57.3793 60,5921 63.7545 66.8725 69,9508 72.9932 76.0032 78,9836 81.9369 84,8652 87.7704 90,654:[ 93.5178 96,3628 99,1902 0,025 34.8960 38.9627 42.8250 46.5339 50.1223 53,6125 57~0208 60,3595 63.6378 66.8636 70,0425 73o1798 76.2794 79.3451 82,3795 85.3855 88.3653 91.3206 94.2534 97.1652 100.0573 102.9309 0,01 37~8897 42.0711 46.0394 49.8473 53,5287 57,1070 60,5994 64.0184 67,3741 70,6741 73.9249 77.1317 80.2987 83,4299 86,5282 89,5963 92,6367 95.6513 98.6421 101~6106 104.5583 107,4864

878 Table 13 (Continued)

P. 1~ Krishnaiah

p=8
0~I0 9 Ii 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 33.9966 37.8706 41,5731 45~1441 48,6100 51,9895 55.2962 58~5407 61,7310 64,8737 67.9740 71~0365 74~0645 77o0616 80~0300 82.9723 858905 88~7863 91.6612 94~5168 97.3540 0~05 36.7102 406888 44~4883 48~1499 51.7013 55~1619 58.5461 61,8647 65~1262 68.3376 71 ~ 5 0 4 2 74.6309 77~7214 80~7790 83,8065 86.8064 89~7807 92.7313 95,6599 98.5680 I01~4569 0025 39.2212 43.2909 47.1750 50~9159 54.5422 58~0740 61.5261 64 ~ 9 0 9 8 68~2340 71.5058 74 ~7 3 0 9 77o9142 81~0597 84~1707 87.2502 90~3008 93~3247 963238 99 ~ 2 9 9 8 102o2543 105,1886 0~01 4~. 3 3 0 5 46 ~5066 50.4895 54~3233 580375 61~6529 65o1848 68~6452 72,0431 75 ~3 8 6 0 78~ 6 8 0 0 81~9300 85;.1402 88,3143 91 ~4552 94.5651!1 97.6482 100~7045 103,7365 106~7458 109~7339

p=9 0 I0 0.05 0. 0 2 5 0. Ol

10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 5;0

38.0977 41.9872 45~7183 49~3254 52~8318 56.2544 59.6057 62~8957 66o1317 69~3203 72,4665 75~5744 78~6477 81 ~6894 I!14,7022 87~6884 90~6499 93,5886 96 ~ 5 0 5 9 99,4033 102,2820

40~9168 44.9035 48~7255 52.4180 560053 59.5050 629300 66.2907 69,5950 72.8494 76,0592 79.2?90 8?.3623 85.4624 88. 5 3 2 2 91,5739 94~5897 97.5814 100~5506 103o4989 106.4275

43o5196 47o5912 51~4926 55,2599 58~9182 62~4855 65.9754 69~3983 72~7626 76~0750 79.3409 82~5651 85~7514 88 ~ 9 0 3 0 92 ~ 0 2 3 0 95.1137 98~1774 i01~2160 104+2311 107~2243 110.1970

46~7359 50.9068 54~9010 58~7561 62.4978 66~ 1446 69.'7107 73~2069 76.6419 80.0226 83~3547 86~6431 89.8918 93o1043 96 ~ 2 8 3 6 99.4322 102.5525 105~6465 108.7158 111 ~ 7 6 2 2 114.7870

Computations of some multivariate distributions

879

Table 13 (Continued)

p=10
0 ~ I0 .1.1 .13 . 1 . 5; 17 19 21 23 27 29 31 42.1910 46 0928 49~8477 ':53 48~;4 ~7 0266 60.4865 63.8769 67~2068 70~4835 73 7129 76 8999 800487 83~1626 86.2448 892977 92~3236 95.3246 9 8 ~ ~3022 101.2'?;81 104~ 1938 0.05 4~;. 1 0 8 1 4 9 ~ .1.010 52.941~; ~56 ~ 6 6 0 1 60. 2782 63.8116 67.2724 70~6702 74~0123 7 7 ~ 30[:.10 8 0 ~ 5~';34 83~7617 86.9336 90.0721 93.;[801 9 6 2,~597 99.3132 0 ~0 2 5 47.7962 PJ.1.o 8 6 8 9 '5~. 7 8 4 3 ~i9 ~ ' , " . 7 = ;3 8 6 3 2~'594 66.8E~7E; 70~3803 73~8377 7 7 ~ 237,~5 8 0 [;860 83 o8884 87~ 1493 90,3722 9 3 ,~.";606 96~7171 99.8443 102.9441 106~0186 I 0 9 ~069-4 1.1.2 0 9 8 0 0 o01 51.1123 55 2781 59~2814 6 3 ~ 154:1 66~9192 70.5932 74. 1890 77 7168 81 ~ 1844 84~E~987 87~9649 91 ~2878 9 4 ~;711 97 8183 101 0 3 2 3 1 0 4 2 1 ~',~; 107 3703 1 1 0 498~; 113 6019 116~6822

33
3t] 37 39 41 43 .45~ 47 49

102.3423
1 0 5 . 348~:;

108,333~-;

p=ll
0,10 1? 14 16 18 ?0 '22 24 26
28

0,05 49.2864 53~2843 ~57~ 1 4 0 4 60~8812 6 4 , ',';2 ~9 68~0890 7 1 . E:;814 7~-~~ ()120 78 ~3879 8 1 71'50 84.9981 88.2413 91~4482 94~6216 97.7644 100~8787 103~9666 107,0299 110~0702 113.0890

0o 02~9 52~0547 56.1277 6() ~ 07~48 63~8632 6 7 ~:5724 71~1973 7 4 ~ 7,492 78~2377 81 6 6 8 7 8'5 049'5; 883848 91o6788 9 4 o 93~-12 9 8 1,~!;68 101~3467 1()4 ~ ',~.;071 107.6400 1 1 0 ~ 747~?; 113.8310 116~8922

0,01 55~4643 59~6255 63.63,~59 6 7 ~,~5237 71o3087 7~;~ 0 0 6 4 78~6284 8 2 ~ 184.1, 8'5; ~ 6 8 1 0 8 9 ~ 12~;3 92. ~223 9~5.8763 99~1911 102.4698 10~5. 7 1 5 4 108~9302 1 1 2 ~ 1:1.6~!!~ 11',"); 2762 118.4109 121 ~5222

30 3? 34 36 38 40 4? 44 46 48 ",50

46~2777 50~ 1897 '5~3~ 9 6 4 7 ~7~6286 6.1. ~ 1 9 9 9 646927 6 8 ~ 117'5; 71.4829 7 4 ~ 79~';9 78 ~0620 8 1 ~ 28~=;9 8 4 ~471~3 87~6223 90~7412 93~8306 96~8929 99.9298 102~9431 1()~ 9 3 4 ~ 1()8 90~';3

880 Table 13 (Continued)


p=12 0.10 0.05

P. R. Krishnaiah

() ().~.,., "=

0.01

:L5 :1.7 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49

50,3589 54.2793 58,0716 61,7581 65~3560 68~8777 72~3332 757305 79.0761 823753 85,6326 88.8518 92,0363 95.1888 98~3117 101,4072 104.4773 107,5237 110.5477

53.4539 57~4557 61 3 2 5 0 65 0 8 5 0 68 7531 72 3 4 2 3 75 13628 79 3 2 2 8 82 7 2 9 2 86 0 8 7 3 89 4 0 1 9 92 6 7 6 8 95.9156 99,1210 102.2957 10,':]. 4 4 1 9 108.56:L5 111.6564 114. 7281

56.2976 60,3707 64,3076 68,1322 71o8620 75,5105 79.0883 826037 86,0636 89.4737 92~8388 96.1630 99,4497 102,7019 105~9224 109.1133 112o2768 115.4146 118~5284

59,7957 63.9524 67,9687 71,8691 75,6717 79~3902 83,0355 86,6161 90. 1 3 9 3 93.6107 97,0356 100~4179 103.7613 :1.07 ~0689 110,3434 113,5872 116~8026 119~9912 123,1549

p=13 0.10 0.05

0,025 60,5273 64+6001 68.5453 72,3839 76,1319 79,8015 83,4024 86,9425 90,428:1. 93,8648 97o2571 100,6089 103,9234 107,2037 110o4523 113,6714 116,8631 120,0290 123,1709

0,.01 64.1092 68,2616 72,2827 76~ 1 9 4 2 80o0121 83.7491 87,4151 91,0182 94,5651 98,0612 101o5114 104.9196 108~2893 111.6234 114o9247 118,1953 121,4375 124.6530 127,8435

14 16 :1.8 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50

54,4352 58.3629 62,1700 65,8766 69,4978 73,0453 76.5284 79,9544 83,3296 86.6.589 899467 93,1967 96o4120 99,5952 102,7491 105,8753 108,9761 112,0529 115.1073

57,6120 61,6170 65,4976 69.2744 72,9631 76.5755 80.1212 83,6078 87,0417 90~4281 93,7713 97.0753 100.3434 103,5782 106,7823 109.9578 113,1068 116.2309 119,3317

Computations of some multivariate distributions


Table 13 (Continued) p = 14

881

15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49

58.5075 62.4414 66,2614 69~9856 736277 77.1984 80,7063 84o1584 87.5605 90,9173 94,2331 97.5113 100,7550 103,9668 107,1491 110.3040 1134332 116,5384

61.7619 65.7695 69~6599 734516 77o1585 80o7918 84,3602 878709 91,3298 94.7420 98o1116 1014423 104.7373 107,9992 111.2305 114,4333 117.6095 120.7609

64,7451 68~8175 72,7699 76,6209 80,3850 84,0735 87.6952 91.2575 94,7666 98.2276 101.6448 105.0219 108~3622 111.6684 114.9430 118~1883 121.4062 124.5983

68.4070 72.5555 76.5807 80,5016 84o3331 88,0865 91,7712 95.3946 98,9630 102~4817 105~9552 109,3871 112o7810 116.1396 119~4655 122.7611 126~0282 129~2687

p~15

0.10
16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50

0,05 65,9046 69,9144 73.8134 77.6182 81~3415 84,9936 88.5826 92o1152 95.5970 99.0329 1024267 105.7821 109,1020 112.3890 115,6456 1188738 122,0755 125,2522

0,025 68.9529 73,0247 76.9831 80,8451 84,6236 88.3289 91o9695 95~5522 990828 1{)2,5660 106,0061 109~4066 112,7707 116,1009 119.3997 122,6693 125~9116 129.1281

0,01 72,6909 76.8358 808643 84,7936 88,6371 92.4054 96o1069 99.7488 103o3369 106o8763 110,3711 113.8250 117.2412 120o6225 123.9714 127,2901 130,5805 133,8443

62~5762 66~5154 70~3469 740868 77,7475 81,3391 84,8696 88,3454 91.7721 951543 984958 101,8001 105.0701 108,3084 111.5172 114o6986 1178544 120,9861

882

P, R. Krishnaiah

Table 14 Percentage Points of the Smallest Root of the Wishart Matrix

[Table 14 tabulates, for p = 2 through p = 15, percentage points of the smallest root of a p x p Wishart matrix at the levels α = 0.01, 0.025, 0.05 and 0.10. For each p the rows are indexed by the degrees of freedom m = p + 1, p + 3, ..., up to 49 or 50. The numeric entries are not reproduced here: the table body is illegible in the source scan.]
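For readers who wish to spot-check an entry of Table 14, a percentage point of the smallest Wishart root can be approximated by direct Monte Carlo simulation. The sketch below is an illustration only, not the numerical method used to construct the tables; it assumes the central case with identity covariance, and the function name, replication count and seed are arbitrary choices.

    import numpy as np

    def smallest_root_point(p, m, alpha, reps=200_000, seed=0):
        # Lower 100*alpha percentage point of the smallest eigenvalue of a
        # p x p Wishart(m, I_p) matrix, estimated by simulation.
        # (Hypothetical helper; assumes the central identity-covariance case.)
        rng = np.random.default_rng(seed)
        roots = np.empty(reps)
        for k in range(reps):
            x = rng.standard_normal((m, p))
            w = x.T @ x                           # W ~ Wishart(m, I_p)
            roots[k] = np.linalg.eigvalsh(w)[0]   # eigenvalues in ascending order
        return np.quantile(roots, alpha)

    # e.g. smallest_root_point(2, 5, 0.05) should be comparable to the
    # corresponding entry of Table 14 for p = 2.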

Table 15 Percentage Points of the Individual Roots of the Wishart Matrix

[Table 15 tabulates, for p = 2 through p = 7, percentage points of the individual ordered roots of a p x p Wishart matrix at α = 0.01, 0.025, 0.05 and 0.10, with a separate panel for each root index i (i = 1, 2, ..., increasing with p). Rows are indexed by the degrees of freedom m, running from p + 1 to 20 in unit steps and then more coarsely to 50. The numeric entries are not reproduced here: the table body is illegible in the source scan.]
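The simulation sketch given after Table 14 extends immediately to the individual roots of Table 15: retain the whole ordered spectrum rather than only its minimum. Again this is an illustration under assumed conventions; ith_root_point is a hypothetical helper, and taking i to count 1-based from the smallest root is an assumption about the table's labelling.

    import numpy as np

    def ith_root_point(p, m, i, alpha, reps=200_000, seed=0):
        # Lower 100*alpha percentage point of the i-th smallest eigenvalue
        # (i = 1, ..., p) of a Wishart(m, I_p) matrix, by simulation.
        rng = np.random.default_rng(seed)
        roots = np.empty(reps)
        for k in range(reps):
            x = rng.standard_normal((m, p))
            roots[k] = np.linalg.eigvalsh(x.T @ x)[i - 1]
        return np.quantile(roots, alpha)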

Table 16 Percentage Points of the Joint Distribution of the Extreme Roots of the Multivariate Beta Matrix

[Table 16 tabulates, for p = 2 through p = 10 and for α = 0.100 and α = 0.050, percentage points associated with the joint distribution of the smallest and largest roots of the multivariate beta matrix. For each p and α the rows are indexed by s = 5, 6, ..., 10, 12, ..., 20, 25, ..., 50 and the columns by r = 0, 1, 2, 3, 4, 5, 7, 10, 15; all entries lie in (0, 1). The numeric entries, though partly legible, cannot be reliably realigned from the flattened scan and are not reproduced here.]
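Table 16 concerns the roots of the multivariate beta matrix, that is, the eigenvalues of (S1 + S2)^{-1} S1 for independent Wishart matrices S1 and S2. A joint tail probability for the extreme roots can be estimated by simulation, as sketched below. This is an illustration only: the mapping between the table's parameters r, s and the two Wishart degrees of freedom is defined earlier in the chapter, so the function takes the degrees of freedom n1, n2 directly, and extreme_root_prob is a hypothetical helper.

    import numpy as np
    from scipy.linalg import eigh

    def extreme_root_prob(p, n1, n2, a, b, reps=100_000, seed=0):
        # Monte Carlo estimate of P(a <= l_min and l_max <= b), where
        # l_min, l_max are the extreme roots of (S1 + S2)^{-1} S1 with
        # S1 ~ Wishart(n1, I_p) and S2 ~ Wishart(n2, I_p) independent.
        # Requires n1 + n2 >= p so that S1 + S2 is nonsingular.
        rng = np.random.default_rng(seed)
        hits = 0
        for k in range(reps):
            x1 = rng.standard_normal((n1, p))
            x2 = rng.standard_normal((n2, p))
            s1 = x1.T @ x1
            s2 = x2.T @ x2
            # The beta roots solve the generalized problem S1 v = l (S1 + S2) v.
            l = eigh(s1, s1 + s2, eigvals_only=True)
            if l[0] >= a and l[-1] <= b:
                hits += 1
        return hits / reps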

Table 17 Percentage Points of the Largest Root of the Multivariate Beta Matrix

[The portion of Table 17 reproduced here covers p = 6, p = 7 and the beginning of p = 8, each at α = 0.100, 0.050, 0.025 and 0.010. Rows are indexed by a degrees-of-freedom parameter running over 5, 6, ..., 10, 12, ..., 20, 25, ..., 50, 60, ..., 100; the columns are indexed by a second parameter whose labels, roughly 1 through 25, are only partly legible. All entries lie in (0, 1) and decrease down each column. The numeric entries are not reproduced here: the table body is illegible in the source scan.]

4()~1
3679

.4089
.3776
3507

.2683 ~2319 2 0 4 2 1824 .1647


.1502

3384 313:1.
.2724 .2410

,25;3.1. ,2234
1999

.I. 0 0

1809 1651

.2160 1958 1 789


8

.2902 .2574 .2311


.2097 .1919

3 0 6 9 0 "'" 0 - " . ~. 7,..7


0 ~:" -

. "?2 "2.9 . i:ii() 4 3

.4643 4270 3951 .3675 .3225 .2871 .2587 .2354 ,2159

4114 3833 3372 3008


2715 247Z 2271

p=

c~ = 9

1. 0 0

s ~

10

:i5

20

25

5 6 7 8 9 10 12 14 16 18

.9252 9 0 5 0 .8846 .8642 .8442 .8244 .7866


.75:L 0

.9301 .9:1.10 .8916 .8722 .8530


.8341

.7:i78
6869

.7975 .7631 7307

91944 9163 I!3978 8793 8609 8427 8 0 7 -4


7740 7424 -7" 0 , .i ~.. 8

9 3 8 2 .9209

9520 .9~'~82 9093 .8945 I"i798


.8507

.9033
8856

8679 8504 .8163

.7838
7 5 3 1 .7241 .6969 .6361
5841 .5395 .5010

20 25 30 35 40 45 50 60 70 80 90 1.00

.6582
.5950

.7005 .6722
.6098

8224 .7952 .7692

6851 6234
.571. 1 .5?64 .4880

7445
6878 6 3 8 2 .5947
,.~o64

,5422
.4976

.4595
.4267
.3981

,5572 ,5.125 .4742 ,44.1.0 .412:1.


.3642 .3261 .29',";2 .2695 .2480

.4546
4 ~ ,.J3

.35:1. 0 3138
.21!336

.3767 .3379 3063


2800

.467.4 .4~.W9 .3886


.3492 .3169

492..~ o.441.1. . ~'~ 99~'.~

3647
.3,.';55 .3106

.2587 .2378

? 5 7 9

2901 ~2674

.9608 39492 .9371 .92.47 .9121 .8995 .8741 o8 4 9 2 .8249 .8014 .7788 7263 6 7 9 3 6 3 7 4 6000 .5665 5 3 6 3 .4843 .4413 4 0 5 1 3 7 4 4 .3 4 7 9

.9668 ~ 9,569 .946','; 9357 .9247 .9135 .8911 8 6 8 9 .8469 .1325;6 8 0 4 9 .7561 o7118 ~6717 6 3 5 4 .6026 5 7 2 8 .5208 .4772 4402 .4084 .3 8 0 8

912

P. R. Krishnaiah

Table

1'7 ( C o n t i n u e d ) P .... I!1 o~ =

,050

\
5 6 7 8 9 I0 12 14 16
18

20 25 30 35 40 45 50 60 70 80 90 I 00

.8739 +8 4 2 5 ,8120 71326 .7546 ,7281 .6792 ~6355 .5966 .5619 .5307 .4655 ~4141 ~ 3"727
.3388

13898 .8612

,8330
o8057 ,7794 .7542 ,7072 ~6649 ~6267 ,5923 .5611 ,4953 4428 +4 0 0 1 ,3647 o3 3 5 1 .3098 .269.1. .2378 ,2130 1929 1762
p .... 8

,3104 21364 ,2480


,2186

1954
. .1.766

,9020 13757 ,8496 ,13241 +7993 ,7754 7304 6894 ,6'521 ,6181 ~ 58"7"~ +5212 4680 ,4244 ~ . 3880 +3573 3i?;I 0 ?885 2556 .2293
2 0 I!I 0

.9117 8875 ,8632 ,8392 .8158 ~7930 ,7500 .7103 ,6739 .6405 .6100 +'5442 .4906 .4463 ,4091 .3776 ,350.4
,3063 ,2720 ,2445

9197 8972 8744 8518 8296 8080 7667 7283 6929 6602 6302 5648
5110

4662 4284 3962 3 6 8 4 .3229 ,2873 2 5 8 7


2353

, .1.6:1.2

,1902

,2720 . ?034

2 1 5 7
=

9263 9 0 5 3 L3839 8 6 2 6 o8415 8 2 0 9 ,7813 7 4 4 2 .7097 ,6777 .641!11 5833 ,5;296 ,4845 .4463 .4135 ,3851 ,3384 .3017 ,2722 ,2478 ~2275
050

,9319 o9122 +8 9 2 0 ,8718 8 5 1 8 ,8321 ,7940 ,7582 ,7246 ,6933 ,6642 .6002 .5466 .5013 ,4628 ,4296 ,4007 ,3530 .3154 .21349 259111 .23137

. X

10

15

20

25

5 6

7 8 9 10 12 14 .1.6 18 20 25 30 35 40 45 50 60 70 80 90
100

.9367 ,9181 .8991 8 7 9 9


8608 , 8 4 ;L 9

+9 4 0 9 ,9233 ,9053 +8 8 7 0
+ I]I 6 8 8

o8 5 0 7
. I!I 154 7818

8054 .7706 7380 7 0 7 4 671!19 ,6157 ~5623


+ 5170 4782 ,4446 4:1.53

.3668 ,3283 ~2971 .27.1.2 ,7494

7501 770"2 69?? 6?98 5769 5316 4926 4587 4291 3799 3406 3086 282.1. 2597

9445 ,9279 .9107 +8933 .8758 ,8585 .8245 +7920 ,7610 ,7319 .7045 ,6429 ~5903 ,5452 ,5061 +4721 ,4422 ,3923 ~3524 ,3197 .2926 ,2696

.94713 .9319 .9156 .8989 ,8822 ,1!1655 .1!1327 ,8012 ,7711 +7426 +7.1.57 .6551 .6029 ,5579 .51138 .41347 ,4546 ~4041 ,3636 +3304 .3026 .2792

.9595 9 4 6 8 ,9336 .9199 9 0 6 0 8 9 2 1 ,8643 .8369 ,8105 7851 7 6 0 7 7047 .6552 ,6116 .5730 ,5387 .5081 .4560 ,4134 ,3779 .3479 .3223

9669 9564 9452


9337

92113 9099 8857 8617 8382 8153 7931 7414 6948 6530 6.1.54 5817 5513 4987 4549 4181 3867 3596

9 7 2 0 9630 95;34 9434 9330 9225 9012 8798 8587 8379 13177 7691!1 7260 61!161 6499 6169 5870 5346 4904 4528 4204 3923

Computations of some multivariate distributions


T a b l e 17 ( C o n t i n u e d )
p

913

= 8
m

c~ =

025

i
]

3
,,9243
,90:i 8
,9312 .9103 .8889 8675 ,877?

5 6 7 8 9 10 12 :1.4 16 1~ 3 20 25 30 35 40 45 50 60

,8916 ,, 8 6 2 2

8331 8050

7 7 7 8 7519

.7036
6602 ,6213 5862 ,5546 .4880 ,4352 ,3925 ,3573 ,3278 .3027 ,2625 ,2316 2073 ,1875 ,1712

,9054 ,8786 8 5 2 0 ,8258 8 0 0 4 ,7759 .7299 .6879 ,6498 ,6153 ,5839 ,5170

,9159 ,8915 ,8668 .8425 ,8:i86 .7955 .7515 ~7110 .6740 ,6400 ,6090

8789

8561

,9417 ,9235 9046 8855


,8664

,8336

. 8.1.1 7 ,7697 I 7 3 0 7 ,6946 ,6614 ,6309

8463 8254
78.".;3 7477 7126 6802 6501 '

.8571
8373 ,7988 ,7625 ~7285
.6968

8475

.8:1.07
.7756

,6673
,6024

,4633
,4195 ~3830 ,3523 ,3260 ,2837 ,2510 ,2250 ,2039 ~1864
p =

.5422 l 4881 l 4434


,4060

.5646 .5101
.4649 ,4268

5845
.5301

.374:3
3 4 7 1

~3 9 4 3 ] I ]
,3664

.4845 4458 .4128

.5481 .5024 .4634 .4298


.4007

,7426 ,7117 6828 .6187 ,5;647 ,5189 ,4796

3842

.4457 .4162
,3673 ,3285 o297:[ ,2710 ,2492

3030
2 6 8 7
2414

70
80 90 100

.3207 ~285:1. .:2566


...... ~ 3 o,. .2137

3373
3 0 0 5 .2 7 0 8 ,2465 2261

.3527 .3149
::?.843 ,2591 2 3 8 0

, ,2191

. ::_'005 8
J

. . . . 025
9 10 15

7 I
5 6 7 8 9 10 12 14 16 18 20 25 30 35 40 45 50 60 70 80 90 100

20

25

9458 9287 ,9109 ,8928 8 7 4 6 ,8565 ,8212


,7873

9494

952~i

9332

7 5 5 3 ,7251
,6968

,6336
,5800 ,5342 ,4948

,9163 ,8991 8 8 1 8 ,8645 ,8305 7978 7667 7372 7095 6473

.9372 .92:12 9048 .8882 .8716


8389 ,8072 ,7770 7483 ,7212 ,6599 6072 ,5617

.9553 9408 .9255

.9098 .8939
8 7 8 0 8 4 6 5 ,8:1.58 .7864 .7584 .7319 .6716 ,6194 ,'.:.;741

,4606

.4307
.3810

5941 5484 5089 4745 4.443


3939

.5222 .4876 .4572 .4062


.3,6'53

.5346
5000 ,4694
,4180

.3414
.3092 .2825

3536
3207 2934 2703

,3318

,3765 ,3424

,3039
2 8 0 2

,3139
,2898

,9654 ,9538 9 4 1 4 9286 ,9156 9 0 2 3 8756 8492 ,8235 7 9 8 6 7747 ~7193 6700 6 2 6 4 ,5876 5 5 3 1 ,5221 4 6 9 3 4 2 5 9 3 8 9 7 ,3591

,9717 ,9621 ,9517

.9.409
,9298 ,9184 ,8953 ,8722 ,8494 8 2 7 1 ~8 0 5 4 ~7544 ,7082 ,6665 ,6290 ,5951

,9761 9 6 7 8 ,9589 ,9.496 ,9399 ,9300 ,9096 8 8 9 1 ,8686

8484
8286 7816

7383
6986 6625 6295 5995 5;467 5021 4641 4312 4026

~5645
o5114 ,4671 ,4297 3977 ,3700

,2600

,3329

9t4

P. R, Krishnaiah

] ' a b l e 17 ( C o n t i n u e d )

p ::: 8

e~

=:

010

4
5
6 7

5
,9479 9303 ,9120 ,8931 8742
.8554 ~8:1.86

8
9 10 12 14
]. 6 ]. 8 20

.910 ;3 ,883:3 ,8 5 6 3 8296 ~8036 ,1717 85 7315;


6 I386
.6497

,9218 ,8974 ,8726 8480

,8238 .8003
,7556
7:1.44 .6766 ~. 6 4 2 0 o 6 ]. 0 4 ~'5427 4877 ~ 442,'3..i ,4048

,6145
51325

25 7;0
35

40 45 50 60
70

.51.47 ,4604 ~ 4:1. 6:1. ~3795 3487 +3 2 2 4 , :280 :L


,2.476

9306 9083 8855 8627 8400 8 :L 7 9 7754 7358 699,,1. 6654 6343 5;669

,9376 ,9171 ,8960 , f1747 85~;4 8325 7920 ,75;40


.7].85 .68,55

9432 9243 9047 8 8 4 7 8647


.134413

~8()63 7697 7353


7032

,6551 5 8 8 4

6 7 3 4 6075
5525 5060

o5118
,4659 ,4274 ,3946 ~3664 3204 ,2846 ,2560 232'.'.5 , 2 :L3 0

53,~?
,4870 ~4479 ,4:1.44
.3855

, J;729
,3455 ,3012 ~2669

3382

.3010

80 90 :I. 0 0

2!2:1. 7

.2008 :I. 8:.34

,2396 ,2173 , ].9 8 8


p

,2712 ~2467 ,2263

,4665 4326 4031 3546 ~3163 ~2855 ,2600


2388 c~ =

7834 7501 ,7188 ,6896 6247 5700 5235 4837 ~4493 ,4194 ,3699

1 1

~3307
.2989 2727
2506 . 010

,9519 ,9355 ~9182 9005 ,8825 8646 ,8294 7954 ,7632 7328 7042 6403 5860 5396 ,4996 4650 ,4346 384:.~ ,344~3 .3117 .2847 .2619

= 8
,,1.0

15

2 ~0

25

5
6

7 8 9
10 12 .1.4 16 18

20 25 30 35 40 45 50 60 70 80 90
]. 0 0

9554 9399 9236 90613 8898 8727 8389 8062 7750 ,7453 .7174 ~6546 6007 ~5544 5144 4795 4489 3979
357].

,9583 ,9438 ,9283 ,9124 ,8961 8798 8474 ~8159 7856 ,7567 ,7294 ~6676
.6'1'1.43 .5682

,9609 ,9471 ~9325 ,9173

9018
8862 8550 ,8246 ,7952 ,767]. 7404 ,6797 ,6269 ,, 5 8 :L 1
o5-411

~9632 ,9501 ,9362 ,9217 9069 ,8919

,97].5 9611
.9499

8619
8324 8 0 3 9 ,7766 ~7505 6908
.6387

9381 ,9259 913'."5 8882 ,8630 ,8382 8].40


o7907

,9767 ~9681 ~9 5 8 7 ,9488 9385 9279 9061 8840 8621 8405


8194

,9804 ,9730 ,9649 ,9563 ,9474

.9381
9189 8994 8798 8603 8412 7952 7525

,5282 ,4932 ~4623 ,4].07

.5060 .4750 ,42.29


.3809

3693
.3353 .3070 ~2830

3238 2961 ,2727

,3463
~3174 ,2930

,5932 .5533 ~'5181 4870 ,4345 ,3919 ,3569 ~3275 3025

7361 6873 6 4 3 7 ,6047 ,5699 5387


4851

.4409 4039 3725 3456

769'.-'; 7238 ,6824 6449 ,6109 ,5801 +5264 .4815 ~4434 .4108 3826

7133
6773 6444 ,6142 ,5611 5160 .4774 ,4441 ,4149

Chart 1
Chart for Largest Root of Multivariate Beta Matrix

[Chart 1 occupies pp. 915-926 of the original and consists of twelve panels, one for each combination of p = 2, 3, 4, 5 with α = 0.01, 0.025, 0.05. Each panel plots s on the vertical axis (running from about 5 to 1000, on a logarithmic scale) against λ on the horizontal axis (from 0 to 1.0). Only the panel titles and axis markings survive the scan; the curves themselves cannot be reproduced in text.]
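Since the curves of Chart 1 cannot be reproduced from the scan, the following is a minimal Monte Carlo sketch of how an upper percentage point of the largest root of a central multivariate beta matrix can be approximated numerically. It assumes the usual construction B = (W1 + W2)^{-1} W1 with independent central Wishart matrices W1 ~ W_p(n1, I) and W2 ~ W_p(n2, I); the mapping between (n1, n2) and the chart's own arguments is an assumption here, not something recoverable from the scan, and the function name is illustrative.

```python
# Monte Carlo sketch: upper 100*alpha % point of the largest root of the
# central multivariate beta matrix B = (W1 + W2)^{-1} W1.  Assumes numpy
# and scipy are available.
import numpy as np
from scipy.linalg import eigh

def largest_root_upper_point(p, n1, n2, alpha=0.05, reps=10000, seed=1):
    rng = np.random.default_rng(seed)
    roots = np.empty(reps)
    for i in range(reps):
        X1 = rng.standard_normal((n1, p))      # rows are N_p(0, I) vectors
        X2 = rng.standard_normal((n2, p))
        W1, W2 = X1.T @ X1, X2.T @ X2          # central Wishart matrices
        # The roots of B solve W1 v = theta (W1 + W2) v, a generalized
        # symmetric-definite eigenproblem; eigh returns them in ascending
        # order, so the last one is the largest root.
        roots[i] = eigh(W1, W1 + W2, eigvals_only=True)[-1]
    return np.quantile(roots, 1.0 - alpha)

print(largest_root_upper_point(p=2, n1=5, n2=20, alpha=0.05))
```

The Monte Carlo error shrinks like reps^(-1/2), so reps would have to be taken well beyond 10000 to match charted or tabulated values to several decimals.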

Table 18
Percentage Points of the Intermediate Roots of the Multivariate Beta Matrix

[Table 18 occupies pp. 927-935 of the original and tabulates percentage points for p = 4, 5, 6, with the intermediate roots i = 2, 3 for p = 4, i = 2, 3, 4 for p = 5, and i = 2, 3, 4, 5 for p = 6, each at α = .050 and α = .010. Within each block the columns are indexed by r = 5, 6, 7, 8, 9, 10, 12, 14, 16, 18, 20, 25, 30, 35, 40, 45, 50 and the rows by s = 0, 1, 2, 3, 4, 5, 7, 10, 15. The entries are present in the scan, but the column blocks are interleaved and many digits are corrupted beyond reliable realignment, so they are not reproduced here.]

Table 19
Percentage Points of the Ratios of the Individual Roots to the Trace of the Wishart Matrix

[Table 19 occupies pp. 936-937 of the original and tabulates percentage points of the ratio of an individual root of the Wishart matrix to its trace, for p = 3, 4, 5, 6 at α = 0.05, 0.01, 0.95 and 0.99. On p. 936 the rows are r = 0, 1, ..., 16, 18, 20, 22, 25 and the blocks are headed j = 1; on p. 937 the rows are r = 0, 1, ..., 10, 12, 15, 20, 25 and the corresponding heading is garbled. The tabulated values themselves are garbled in the source scan and are omitted here.]
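As a usage note, the statistic tabulated in Table 19 is formed from the ordered eigenvalues of a Wishart matrix. The sketch below is a hypothetical example with an identity population covariance; since the table's arguments r and j are not fully legible in the scan, it only shows how the ratio itself is computed, not how to enter the table.

```python
# Sketch: ratio of an individual root of a Wishart matrix to its trace,
# the quantity tabulated in Table 19.  Hypothetical data with Sigma = I.
import numpy as np

rng = np.random.default_rng(0)
p, n = 4, 25
X = rng.standard_normal((n, p))            # n independent N_p(0, I) rows
S = X.T @ X                                # S ~ Wishart_p(n, I)
l = np.sort(np.linalg.eigvalsh(S))[::-1]   # ordered roots l1 >= ... >= lp
ratios = l / l.sum()                       # l_j / trace(S), j = 1, ..., p
print(ratios)   # e.g. ratios[0] would be compared with a tabulated point
```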

Table 20
Percentage Points of the Ratio of the Extreme Roots of the Wishart Matrix

 p    m    α = 0.100   α = 0.050   α = 0.025   α = 0.010
 2    3      .5195       .6345       .7270       .8182
 2    5      .6306       .7254       .7983       .8677
 2    7      .6866       .7696       .8321       .8907
 2    9      .7223       .7972       .8528       .9046
 2   11      .7477       .8165       .8673       .9142
 2   13      .7669       .8311       .8781       .9214
 2   15      .7822       .8426       .8866       .9270
 2   17      .7947       .8519       .8936       .9315
 2   19      .8052       .8598       .8993       .9354
 2   21      .8143       .8665       .9043       .9386
 2   23      .8221       .8723       .9085       .9413
 2   25      .8290       .8774       .9122       .9438
 2   27      .8351       .8819       .9155       .9459
 2   29      .8406       .8860       .9185       .9478
 2   31      .8456       .8896       .9211       .9495
 2   33      .8501       .8929       .9235       .9511
 2   35      .8542       .8959       .9257       .9525
 2   37      .8580       .8987       .9277       .9538
 2   39      .8616       .9013       .9296       .9551
 2   41      .8648       .9037       .9313       .9562
 2   43      .8678       .9059       .9329       .9572
 2   45      .8707       .9079       .9344       .9581
 2   47      .8734       .9099       .9358       .9590
 2   49      .8759       .9117       .9371       .9599
 2   51      .8782       .9134       .9383       .9607
 2   53      .8804       .9150       .9395       .9614
 2   55      .8825       .9165       .9406       .9622
 2   57      .8845       .9179       .9416       .9628
 2   59      .8864       .9193       .9426       .9635
 2   61      .8882       .9206       .9435       .9641
 2   63      .8899       .9219       .9444       .9646
 2   65      .8915       .9230       .9453       .9652
 2   67      .8931       .9242       .9461       .9657
 2   69      .8946       .9252       .9469       .9662
 2   71      .8961       .9263       .9476       .9667
 2   73      .8974       .9273       .9484       .9671
 2   75      .8988       .9282       .9490       .9676
 2   77      .9000       .9291       .9497       .9680
 2   79      .9012       .9300       .9503       .9684
 2   81      .9024       .9309       .9509       .9688
 2   83      .9035       .9317       .9515       .9692
 2   85      .9047       .9325       .9521       .9695
 2   87      .9057       .9333       .9526       .9699
 2   89      .9068       .9340       .9531       .9702
 2   91      .9077       .9347       .9537       .9706
 2   93      .9087       .9354       .9542       .9709
 2   95      .9096       .9360       .9546       .9712
 2   97      .9105       .9367       .9551       .9715
 2   99      .9114       .9373       .9556       .9718
 2  101      .9122       .9380       .9560       .9721
 2  103      .9131       .9385       .9564       .9723
 2  105      .9139       .9391       .9568       .9726
 2  107      .9147       .9397       .9572       .9728
 2  109      .9154       .9402       .9576       .9731
 2  111      .9162       .9407       .9580       .9733
 2  113      .9169       .9413       .9584       .9736
 2  115      .9176       .9417       .9587       .9738
 2  117      .9183       .9423       .9591       .9740
 2  119      .9190       .9427       .9594       .9742
 3   24      .5945       .6454       .6889       .7378
 3   26      .6076       .6572       .6996       .7471
 3   28      .6193       .6679       .7092       .7556
 3   30      .6300       .6776       .7180       .7631
 3   32      .6398       .6864       .7259       .7700
 3   34      .6488       .6945       .7332       .7762
 3   36      .6570       .7020       .7399       .7821
 3   38      .6648       .7089       .7461       .7874
 3   40      .6719       .7153       .7519       .7923
 3   42      .6786       .7214       .7572       .7969
 3   44      .6849       .7269       .7623       .8012
 3   45*     .6909       .7323       .7670       .8053
 3   48      .6965       .7372       .7714       .8091
 3   50      .7017       .7419       .7755       .8126
 4   19      .3999       .4423       .4805       .5258
 4   21      .4203       .4623       .4997       .5441
 4   23      .4384       .4800       .5169       .5604
 4   25      .4548       .4958       .5321       .5749
 4   27      .4697       .5102       .5460       .5880
 5   14      .2338       .2656       .2950       .3313
 5   16      .2619       .2943       .3241       .3604
 5   18      .2867       .3194       .3493       .3856

*So in the scan; from the even spacing of the p = 3 block this value of m is presumably 46.
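For values of m not listed, a tabulated point can be interpolated. The helper below is an illustrative sketch, not part of the original chapter: the function name and the choice of linear interpolation in m are assumptions, and the sample rows are copied from the α = 0.050, p = 2 block of the table above. Whether the tabled values are upper or lower percentage points of the ratio is not stated in the portion of the chapter reproduced here, so the table's definition should be checked before using such a lookup in a test.

```python
# Hedged lookup helper for Table 20: linear interpolation in m of the
# tabulated percentage point of the ratio of the extreme roots of a
# Wishart matrix (p = 2, alpha = 0.050 rows copied from the table above).
import numpy as np

TABLE20_P2_A050 = np.array([
    (3, .6345), (5, .7254), (7, .7696), (9, .7972), (11, .8165),
    (13, .8311), (15, .8426), (17, .8519), (19, .8598), (21, .8665),
])

def extreme_root_point(m, table=TABLE20_P2_A050):
    """Interpolate the tabulated point at degrees of freedom m."""
    return float(np.interp(m, table[:, 0], table[:, 1]))

print(extreme_root_point(10))   # lies between the m = 9 and m = 11 entries
```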

[The tables on pp. 940-965 of the original are illegible in the source scan; nothing beyond the running heads and page numbers can be recovered, so they are omitted here.]

Acknowledgement

The author wishes to thank R. M. Boudreau for his valuable help in computing the Tables 12, 14 and 17.


970

P. R. Krishnaiah

Pillai, K. C. S. and Chang, T. C. (1970). An approximation to the cdf of the largest root of a covariance matrix. Ann. Inst. Statist. Math. Suppl. 6, 115-124. Pillai, K. C. S. and Dotson, C. O. (1969). Power comparisons of tests of two multivariate hypotheses based on individual characteristic roots. Ann. Inst. Statist. Math. 21, 49-66. Pillai, K. C. S. and Jayachandran, K. (1967). Power comparisons of tests of two multivariate hypotheses based on four criteria. Biometrika 54, 195-210. Pillai, K. C. S. and Jayachandran, K. (1970). On the exact distributions of Pillai's V (s) criterion. J. Am. Statist. Assoc. 65, 447-454. Pillai, K. C. S. and Young, D. L. (1971). On the exact distribution of Hotellmg's generalized To 2. J. Multivariate Anal. 1, 90-107. Ramachandran, K. V. (1956). On the simultaneous analysis of variance test. Ann. Math, Statist. 27, 521-528. Ramachandran, K. V. (1958). On the studentized smallest ehi-square. J. Am. Statist. Assoc. 53, 868-872. Robbins, H. and Pitman, E. J. G. (1948). Application of the method of mixtures to quadratic forms in normal variables. Ann. Math. Statist. 20, 552-560. Roy, S. N. (1939). p-statistics of some generalizations in analysis of variance appropriate to multivariate problems. Sankhya 4, 381-396. Roy, S. N. (1942). The sampling distribution of p-statistics and ceVmin allied statistics on the nonnull hypothesis. Sankhya 6, 15-34. Roy, S. N. (1945). The individual sampling distribution of the maximmn, the minimum and any intermediate of the p-statistics on the null hypothesis. Sankhya 7, 133-158. Roy, S. N. (1957). Some Aspects of Multivariate Analysis. Wiley, New York. Roy, S. N., Gnanadesikan, R. and Srivastava, J. N~ (1971). Analysis and Design of Certain Multiresponse Experiments. Pergamon, New York. Schuurmann, F. J. and Waikar, V. B. (1973). Tables for the power function of Roy's two-sided test for testing the hypothesis Z = I in the bivariate case. Comm. Statist. 1, 217-280. Schuurmann, F. J. and Waikar, V. B. (1974). Upper percentage points of the smallest root of the MANOVA matrix. Ann. Inst. Statist. Math. Suppl. 8, 79-84. Schuurmann, F. J., Krishnaiah, P. R. and Chattopadhyay, A. K. (1973a). On the distribution of the ratios of the extreme roots to the trace of the Wishart matrix. J. Multivariate Anal. 3, 445-453. Schuurmann, F. J., Krishnaiah, P. R. and Chattopadhyay, A. K. (1973b). Tables for the distribution of the ratios of the extreme roots to the trace of Wishart matrix. A R L 73-010. Wright-Patterson Air Force Base, OH. Schuurmann, F. J., Krishnaiah, P. R. and Chattopadhyay, A. K. (1975a). Exact percentage points of the distribution of the trace of a multivariate beta matrix. J. Statist. Comp. Simulation 3, 331-343. Schuurmann, F. J., Krishnaiah, P. R. and Chattopadhyay, A. K. (1975b). Tables for a multivariate F distribution. Sankhya, Ser. B 37, 308-331. Schuurmann, F. J., Waikar, V. B. and Krishnaiah, P. R. (1975c). Percentage points of the joint distribution of the extreme roots of the random matrix Sl(S1 + $2)- 1. j. Statist. Comp. Simulation 2, 17-38. Siotarti, M. (1959a). On the range in multivariate case. Proc. Inst. Statist. Math. 6, 155-156. Siotani, M. (1959b). The extreme value of the generalized distances of the individual points in the multivariate normal sample. Ann. Inst. Statist. Math. 10, 183-203. Siotani, M. (1960). Notes on multivariate confidence bounds. Ann. Inst. Statist. Math. 11, 167-t82. Solomon, H~ (1960). Distribution of quadratic forms--tables and applications. 
Applied Mathematics & Statistics Laboratory. Tech. Rept. No. 45, Stanford University.

Computations of some multivariate distributions

971

Sugiyama, T. (1967). Distribution of the latent root and the smallest latent root of the generalized B statistics and F statistics in multivariate analysis. Ann. Math. Statist. 38, 1152-1159. Sugiyama, T. (1970). Joint distribution of the extreme root of a covariance matrix. Ann. Math. Statist. 41, 655-65'7. Tiku, M. L. (1971). A note on the distribution of Hotelling's generalized T~. Biometrika 58, 237-241. Tong, Y. L. (1970). Some probability inequalities of multivariate normal and multivariate t. J. Am. Statist. Assoc. 65, 1243-1247. Waikar, V. B. and Schuurmann, F. J. (1973). Exact joint density of the largest and smallest roots of the Wishart and M A N O V A matrices. Utilitas Math. 4, 253-260. Wilk, M. B., Gnanadesikan, R. and Huyett, M. J. (1962). Probability plots for the gamma distribution. Technometries 2, 1-20.

P. R. Krishnaiah, ed., Handbook of Statistics, Vol. 1 North-Holland Publishing Company (1980) 973-994

~')

L. ,J

Inference on the Structure of Interaction in Two-way Classification Model


P. R Krishnaiah* and 34. G. Yochmowitz

1.

Introduction

Under the classical two-way classification model with one observation per cell, the hypotheses of no main effects are tested in practice by using the ratios of the mean squares associated with the main effects to the error mean square. But when the interaction between the main effects is present, these tests are no longer valid. So, there is quite a bit of interest in studying the structure of interaction term and the effect of interaction on the usual tests for main effects. In Section 2 of this chapter, we review Tukey's test for nonadditivity (see Tukey (1949)) and certain generalizations of this test by Scheff6 (1959, p. 144) and Milliken and Graybill (1970). Some other interesting early developments like the work of Fisher and Mackenzie (1923) and Williams (1952) are also discussed in this section. In Section 3, we discuss the model when the interaction matrix is decomposed by singular value decomposition of a matrix. The work of Gollob (1968), Mandel (1969) as well as the likelihood ratio tests (see Corsten and van Eijnsbergen (1972), Johnson and Graybill (1972), and Yochmowitz and Cornell (1978)) for testing the hypotheses on the structures of interaction term are also reviewed. Krishnaiah and Waikar (1971, 1972) proposed simultaneous test procedures for testing the equality of the eigenvalues of the covariance matrix against certain alternatives. Applications of the above procedures in studying the structure of interaction term are emphasized in Section 3. I n Section 4, we discuss the effect of the presence of interaction on the usual tests for the hypotheses of no main effects. Finally, the applications of certain tests for the hypotheses of no interaco tion are illustrated with some data on animal models.

*The work of this author is sponsored by the Air Force Office of Scientific Research, Air Force Systems command under Contract F49620-79-C-0161. 973

974

Inference on the structure of interaction in two-way claz'sification model

2.

Some early developments on tests for additivity Consider the model yu= l~t- ai + ~. + 70 + e~j

(2.1)

where y 0. (i = 1..... r; j = 1..... s) denotes the observation in i-th 'row a n d j - t h column and e0-'s are distributed independently as normal with mean 0 and variance o z. Also/,, % ~. and 7,7 respectively denote the general mean, i-th row effect, j-th column effect, and interaction between i-th row and j - t h column. In addition, let Z i oli = ~jfij = i70 = j 7 ~ / = 0. Tukey (1949) pro posed the following procedure for testing the hypothesis H : ~ / = 0 where 7 - (70)- The hypothesis H is accepted or rejected according as F, X b~ where (2.2)

eE <.F.lH]=(ls~{r- 1)(s- 1)--1) El= (4- 4)


'

(2.3) (2.4)

g= E E (yij-y,.-yj+y.) 2,
i j

(2.5)

sYi. = Yij,
i

r f j = ~,, Yij,
i

rs.. = Z Z Yij"
i j

When H is true, the statistic F 1 is distributed as the central F distribution with ( 1 , r s - r - s ) degrees of freedom. In examining the model (2.1) with 7u =)~aifla, Ward and Dick (1952) solved the normal equations and arrived at Sl 2 as the sum of squares associated with testing the hypothesis of no interaction. Ghosh and Sharma (1963) showed that the power of Tukey's test for H against the alternative hypothesis 70 = ?~ai~ is high. Tukey (1955) showed as to how his test can be extended to test for no interaction in the Latin Square. The model equation in this case is given by y0k = / , + a/A + ~ a + 'Yk C -Jr"~/jk "-}-e~k (2.6)

P. R. Krishnaiah and M. G. Yochmowitz

975

where ai a , fljP a n d y c ( i = 1,2 . . . . . r ; j =

1,2 . . . . . r ; k = 1 , 2 , . . . , r ) r e s p e c t i v e l y

denote the effects of i-th level of A, j-th level of B and k-th level of C. Also, ~//jk denotes the interaction of i-th level of A withj-th level of B and k-th level of C. In addition, the errors egk are distributed independently and normally with mean 0 and variance o 2. If we apply Tukey's test, we accept or reject the hypothesis H of no interaction under the model (2.6) when F 2 N F,~ (2.7) where

el ~ -<,~A/-/] = (1 - ,~),

(2.8)

s~-d
i j

4 =

E ei,~.,,k
j

~,

4 = E E e~k, 2
i j

s~= E ~,(%k-ffi..-uj.-ff..k + 2ff...) z,


Uok = ( ) ~ " ' + ) ~ ' + 3 7 " k - - 3 f i ' - - ) 2'
lYi" "~ Yi..' FUi.. "~- Ui.. , F y j . = y j., rff j. = blj.,

r237-.- = E Y o k , i j

I~"k ~'Y"k ; FU.. k = U.. k

e~/k=Yok --37i--- - f j . - 37--k+ 2)?-....

(2.9)

When H is true, the statistic F 2 is distributed as the central F distribution with (1, r 2 - 3r + 1) degrees of freedom. Thus interaction can be tested with only one cell replicate in the Latin Square. Mandel (1969) also considered the problem of testing the hypothesis of no interaction under the model (2.6) when %~ =~uivj where u; and vj are specified a priori and 2~ is an unknown constant. Mandel (1969) has identified many models as special cases of the Factor Analysis of Variance (FANOVA) model given by (3.1) in the next section.
Table 1 Special cases of the FANOVA model Structure of */e 0 ;kc~ifl j
Riftj

Type of the Model Additive Concurrent Bundle of lines--rows linear Bundle of lines--columns linear Combination of concurrent and bundle of lines First sweep of Tukey's vacuum cleaner

Cjai ~.flj + )kOtifl j


Riftj + a i Cj "t" )kOLi~ j

976

Inference on the structure of interaction in two-way classification model

These special cases are obtained by assuming very special structures of the interaction term ~0 in (2. l) and they are given in the Table 1. The additive model has no interaction. The concurrent model can be tested effectively by using Tukey's test for nonadditivity. Mandel (1961) proposed the bundle of lines model with one replication per cell in the fixed two-way layout. The test for no interaction under this model is described below. If we have ~j = R;flj, the total sum of squares (s.s.) is partitioned as shown in Table 2, where

y~(yj-y..)
hi= J ~
(2.10)

E (;j-;..)~
J
The hypothesis R~ = 0 is accepted or rejected according as

(2.11)
where

P[ te3 <F,,IHJ=(1- a),


and F 3= (s-2)s~

(2.12)
(2.13)

Also, F 3 has F distribution with r - 1 and ( r - 1 ) ( s - 2 ) degrees of freedom when H is true. When H is rejected, Mandel indicated that the data is represented by a bundle of non-parallel lines with scatter about the lines being measured by the residual m e a n square. In order to examine whether the multiplicative structure R;]?j. is an appropriate descriptor for ~Tu he partitioned the s.s. (slopes) as shown in Table 3.
Table 2 Source of variation

d.f.

s.s.

Total
Mean Rows Columns

r~
1 r- 1 s- 1

i j rsy2

2 2g

r "~, (Yl- - 37-)2 i s ~,, J

(y.j --2..) 2

Slopes
Residual (r-

r-I
1)(s - 2)

{
~ ~ {(Y/j i j

-Yi.)

- -

bi(Yj _ y . . ) } 2 = s 2

P. R. Krishnaiah and M. G. Yochmowitz

977

Table 3 Source Slopes Concurrence N on-concurrence df r- 1 1 r- 2 s.s. ~ (bl - 1)2'~.(yj-)7..) 2

[ Y, ~,, ()7i.--)7..)(ffj--~7..)y#] 2 2 2 E ()~i'--if--) E (ffj--ft.-)


Remainder

The s.s. for concurrence is identical to Tukey's s.s. for 1 df. I n the presence of interaction, significant c o n c u r r e n c e indicates that the multiplicative model R ; ~ will a c c o u n t for m o s t of the interaction. He tests this hypothesis by using F 4 as test statistic where

F4 = s.s. ( c o n c u r r e n c e ) ( r - 2 ) s.s. ( n o n - c o n c u r r e n c e )
W h e n there is no c o n c u r r e n c e F 4 has F distribution with 1 a n d r - 2 degrees of freedom. Testing for interaction in the bundle Of line models is thus a two step procedure. Step 1 involves testing for n o interaction. T h e second step is to test for the appropriate structure of the interaction if the interaction is present. W e can use simultaneous tests to test both h y p o t h e ses simultaneously. The c o m b i n a t i o n of the c o n c u r r e n t and b u n d l e of lines models can be reparametrized by expressing ~//j as T0"= (Xa,. + R/)~., and therefore b e c o m e s a F A N O V A model (see (3.1) below) with a single multiplicative c o m p o nent. T h e first sweep of T u k e y ' s v a c u u m cleaner can be reduced to a two c o m p o n e n t F A N O V A m o d e l by a similar reparametrization. Future sweeps of Tukey's v a c u u m cleaner differ f r o m the F A N O V A model in that new terms of the v a c u u m cleaner are functions of the residuals a n d the preceding sweep. I n the F A N O V A model, they are functions of the residuals only. Milliken and Graybill (1970) considered the m o d e l y--X/3+ Z/~+e where e : n 1 is a n d covariance Z(zu(Xfl)) :n x k 2t : k 1 a n d fl :p (2.14)

distributed as a multivariate n o r m a l with mean vector 0 matrix o2I,, X : n p is k n o w n matrix of rank q, is u n k n o w n but its elements are k n o w n functions of X ~ , 1 are u n k n o w n . If Z is known, the usual test statistic

978

Inference on the structure of interaction in two-way classification model

used for testing the hypothesis 3, = 0 is given by F where

F= Q , ( n - r) Qo(r- q) ' QI=y'[(I--XX-)Zt[(I-XX-)Z] Qo=Y'[ I - X X


] Y - Ql,

(2.15)

y,

(2.16) (2.17)

and r is the rank of [X,Z]. Also, q<r<n and A - denotes the M o o r e Penrose generalized inverse of A. Since Z is unknown, we replace Z with 2 in (2.16) where 2~ is obtained from Z by replacing XIg with X/~; here fl is the least square estimate of 13 under the model when )t--0. Now let, F* -- (n - r)Q 1

(r_q)O

(2.18)

where

Q.,=y'[(I-XX QofY'[ I - X x -

)ZI[(I-XX-)Z]
]y-Qv

Y,

(2.19) (2.20)

The hypothesis N = O is accepted or rejected according as

F*%F,~
where

(2.21)

elF* <<F.IA=01-(I- a).

(2.22)

When 1~= 0, the statistic F* is distributed as central F distribution with ( r - q) and (n - r) degrees of freedom. When )t=/=0, the distribution of F* is not known. The distribution theory given above is essentially contained in Scheff6 [(1959); problem 4.9] and the model (2.14) is a slight generalization of the model considered by Scheff& When k = 1, we obtain

O, =

Z'(I-XX-)Z

'

(2.23) (2.24) (2.25)

0o = Y ' ( I - XX - ) y - Q1, F * = Q.,(n-r)^

Qo(r - q)

P. R. Krishnaiah and M. G. Yochmowitz

979

Milliken and Graybill (1970) discussed some useful special cases of the model (2.1). One of the special cases discussed was the concurrent model (2.26) where 3, is unknown and other notations are the same as used in the model (2.1). The hypothesis )t = 0 can be tested by using the test statistic (2.25). In this special case, the test discussed in Graybill and Milliken (1970) is equivalent to Tukey's test for non-additivity. Fisher and Mackenzie (1923) considered the model when the expected effect is the product of the constants representing the effects of two factors. Williams (1952) considered the following model: y/j = Xaiva.+ ~ + eU (2.27)

where Ec~ =1~ ~ = 0 and ~ , a 2 = X v f = 1. He showed that the least square estimate of X is the largest root of the matrix T = ( % ) where tik = Y'J(Yo)Tj)(yik --)T k). Williams 0952) also considered the following model: (2.28) where Y~ai=Y,/3j=Zci---Zdj=O and E c ~ = Z ~ ? = l . He showed that the least square estimate of )t is the largest root of the matrix V=(vjk ) where

vjk = Y~i(Yij --Yi. --Y j)(Yik --Yi. --Y.k)"


3. Tests for the structure of interaction using eigenvalues of a random matrix

In the model (2.1), we assume that the rank of ~/=(%) is c. Using the singular value decomposition of a matrix, we know that
= OlUlV'l ' + ' ' " "1- OcUcVtc

(3.1)

where 0(/> . . . >i 0~ are the eigenvalues of ~/~', u i is the eigenvector of ~/' corresponding to 0/2 and vi is the eigenvector of ~/'~ corresponding to 0~. Gollob (1968) and Mandel (1969) considered the problem of testing the hypotheses H i where Hi:O i = 0. Their tests as well as the likelihood ratio tests for testing H i will be discussed in the later part of this section. We will first discuss as to how the simultaneous tests of Krishnaiah and Waikar (197l, 1972) for sphericity can be applied in the area of testing for the structure of interaction term ~ . Some discussions along these lines were made by Schuurmann, Krishnaiah and Chattopadhyay (1973b) and Krishnaiah and Schuurmann (1974).

980

inference on the structure o f interaction in two-way classification model

It is known (see Gollob (1968)) from a result of Eckert and Young (1936) that the least square estimates of 0g, u~, and vg are respectively/~, fi~ and ce i where ( ~ 2 > . . . >~2_1 are the nonzero roots of DD', fii is the eigenvector of D D ' corresponding to ~2 f~ is the eigenvector of D ' D corresponding to ~.2, D = (do.), and d~ =Yo-Y~.-Y:J +Y.." But

- =(,

+4

(3.2)

where I r is the r r identity matrix, and Jr is the r r matrix with ones as its elements. We can choose C r such that C/Cr=Ir_ 1 a n d / ~ - ~ J r = CrC i. So, it is easily seen (e.g., see Johnson and Graybill (1972a)) that the nonzero roots of D D ' are the same as the nonzero eigenvalues of W where W = C" YC, C" Y' Cr. But the columns of C i Y are distributed independently as ( r - l ) - v a r i a t e normal with mean vector C / M and covariance matrix C/,Cro 2. So, W is distributed as noncentral Wishart matrix with ( s - 1 ) degrees of freedom and noncentrality p a r a m e t e r ~2 where a = C} M C, C2M ' Cr, M = ( m / j ) and mij = lx + ai + tSj + rlij. Also, E( W / ( s - 1 ) ) = Z* where Y~*= o2I+(~2/(s - 1)). We can express fa as f~= k
k=l

02C"u c~ k"~r k u'k ~ r"

(3.3)

Let )t 1 > . - - >X~ I be the nonzero roots of Z*. Then Xi=o2+(Oi2/(s - 1)) ( i = 1,2 ..... c), Xc+l . . . . . Xr--1 = 2 ' It is of interest to test the hypothesis H:O 1. . . . . 0c = 0 and its subhypotheses simultaneously. The hypothesis H is equivalent to testing the hypothesis H* where H*:X~ . . . . . Xc+l. So, the proble m of testing the hypothesis of no interaction is equivalent to the problem of testing the equality of the eigenvalues of Z*. Motivated by this equivalence, we consider the following procedures for testing the hypothesis of no interaction and its subhypotheses in the spirit of the simultaneous tests of Krishnaiah and Waikar (1971). To fix the ideas, we will first consider the case when c = r - 2 . The hypothesis H* can be expressed as
r--2 r--1 r--2

H * = (=) Hi*r_,,
i=1

H*= 0
i=1

H*i,i+1,

H*=

~
i=1

H*

where H/~ : X , = ~ ( i < j ) , H * :)ti=X and ( r - l)X=)~l+ +)t~_ v Also, let


r--2 r--I
1'

r--2

A~{ = U A~,~
i=1

A~ "~- U a~*i+,
i=1

A ~ = U A7
i=l

P. R. Krishnaiah and M, G. Yochmowitz

981

where A * : X i > ) t j ( i < j ) , A * : ~ i > X , and ( r - 1 ) X = ( X l + ' - " +r--l)" T h e hypothesis H * when tested against A ~ is a c c e p t e d or rejected according as
--

l1

N c1~

(3.4)

where

[ lr_l

]=(]--Ol).

(3.5)

If H * is rejected, we accept or reject H* i,r 1 against A* t,r-- 1 according as

lr_-- X cir.

l,

(3.6)

Here we note that H * i,r-- I is equivalent to the hypothesis that 0~= 0. Next, consider the p r o b l e m of testing H * against A~. I n this case, we accept H * if

ti
' li+--~ <c2~ for i = 1,2 ..... r - 1 a n d reject it otherwise where (3.7)

P[ li/li+ 1 -<<c2~; i = 1,2 ..... r - 2IH* ] = (1 - c0.


If H * is rejected, we accept or reject H * i,i+ 1 according as

(3.8)

li+ l N c2~.
The hypothesis H * i,l+l is equivalent to the hypothesis that 0 i = 0i+ l" If we test H* against A~, we accept or reject H * according as Ii
/l + ' ' "

(3.9)

X c3,
+[r-I

(3.10)

where c3, is chosen such that ii P /1 + ' ' "

+lr 1

<c3~ H*]=(l-a).

(3.11)

982

Inference on the structure of interaction in two-way classification model

If H* is rejected, we accept or reject H* against A* according as _~ c3, (3.12)

l1 +''"

+[r-I

Here we note that H* is equivalent to the hypothesis that ( r - 1 ) 0 i 2=

07+.-. +0L2.
It is known that W is distributed as the central Wishart matrix with ( s - 1 ) degrees of freedom a n d E ( W / ( s - 1 ) ) = o 2 1 when H* is true. Schuurmann, Krishnaiah and Chattopadhyay (1973a, b) investigated the exact distribution of l l / ( l ~ + . . . + lr_ 0 whereas Krishnaiah and Schuurm a n n (1974) investigated the distribution of l l / l r _ 1. Percentage points of the above statistics are reproduced in Chapter 24 of Krishnaiah (1980) [this volume] for some values of the parameters. The exact distribution of m a x ( l i / 1 2 , 1 2 / l 3. . . . . l r _ 2 / l r _ l ) is not known. But we know that
lI

(3.13)

Using the inequality (3.13) and the results on the distribution of l l / 1 p, we can obtain upper bounds on the values of c2~ where c2<` is given by (3.8). Computer programs are also available for computing percentage points of various ratios like ll/lp, l l / ( l I + . . . + lp) and m a x i ( l i / l i + 1) by using M o n t e Carlo methods. We will now discuss simultaneous test procedures to test H * when c H* c<r-2. In this case, we can express H* as H* = f-)i=l i,c+r Motivated by this decomposition, we propose the following procedure. We accept or reject H* against I.J ~= 1(Xi > ~ + 1) according as
- -

11 lc+l

~> C4c ~

(3.14)

where

L ]c+l ~C4ct O*j

1 ~i-(l--~).

(3.15)

When H* is rejected, we accept or reject I-Ii*c+ 1 according as

l,
lc+1%C44"

(3.16)

P. R. Krishnaiah and M. G. Yochmowitz


It is quite complicated to evaluate c4a where

983

ll

~ C4~[H* ] = (1 -- 0~).

(3.17)

T h e above test for H * is equivalent to testing H ~ , c + p . . . , H c , c+ 1 simultaneously against appropriate alternatives and accepting H * if and only if all the subhypotheses Hi, c 1 (i = 1..... c) are accepted. T h e hypothesis Hi, c+ 1 is equivalent to the hypothesis that 0i--0. In p r o p o s i n g the test discussed above, l + l / ( s - 1) is used as an estimate of Xc+ 1. One m a y use any of the eigenvalues 1 , , + 2 / ( s - 1 ) , . . . , l r _ l / ( s - - 1 ) also as estimates of Xc+ 1. Alternatively, one m a y use (/c+ l + " " " + l r - 1 ) / ( r - c - 1 ) ( s - 1) as an estimate of ~c+1. So, procedures can be p r o p o s e d to test H * a n d Hi*c+ 1 (i---1 ..... e) simultaneously by replacing /c+l with /c+i ( i = l , 3 , . . . , r - 1 ) or (/c+l + . . . + 1r_ 1 ) / ( r - c - 1 ) . C o m p u t e r p r o g r a m s are available for c o m p u t i n g the percentage points of the test statistics lille+ i, (i = 1,2 . . . . . r - - c - 1), a n d Ill(It+l+''" + / r - i ) " Also,

[ II < C4alH* ]

[ II P]-~-c4~x[ n
<~c4alH* ) P

,].I'
II
l l - b . . " +lr__ 1

(3.18)
< c4~1 H *

;[

1c+1+''"

II
+lr_ 1

][

!
.

(3.19) W h e n H * is true, we can use inequalities (3.18) a n d (3.19) a n d the k n o w n results on the distributions of l l / l r _ ~ and l l / ( l 1 + . . " + l,_1) to obtain b o u n d s on the critical values associated with the procedures discussed a b o v e for testing H * a n d Hi*c+ 1 ( i = 1,2 . . . . . c). We will now consider the p r o b l e m of testing H * against the alternatives C . :~ [,.Ji 1 [ ~ > ~ + 1 ] - In this case, the hypothesis H is d e c o m p o s e d as H * = O i= 1Hi*~+l and the following procedure m a y be used. We accept H * if li/li +1 < Csa for i = 1,2 ..... c and reject it otherwise where P ~ <c5~ ; i = 1 , 2 ..... c l H * =(l-c 0. (3.21) (3.20)

W h e n H * is rejected, Iti, i+ 1 (i = l, 2 . . . . . c) is accepted or rejected according

984

Inference on the structure of interaction in two-way classification model

as (l i / l i + l) ~ c5,.. As before we can replace lc + l with l~+ i (i -- 2 ..... r - c -- 1) or (l+ 1+ " " + It- l ) / ( r - c - 1) in the above procedure. Next, consider the problem of testing H * against U ~=~[Xi>X]. In this case, we accept or reject H* according as
l|
ll - ~ ' ' " "q- lr

1 ~ C6a

(3.22)

where
[1

1 1 + " " +lr--~ <

c6a[H* ] = ( 1 - c 0.

(3.23)

W h e n H* is rejected, we accept or reject the hypothesis Xi = X (i = 1. . . . . c) against X; > X according as

//
ll q- " "" "~- l r - 1 ~ C6a"

(3.24)

Here we note that the hypothesis h i = X is equivalent to the hypothesis that 0i2=(0~2+ . . - + 0 2 ) ~ ( r - l ) . W e m a y decompose H * as H * = (-'1c i=l()t/= )~*} where ( c + I ) X * = X l + . . +Xc+ 1. In view of this decomposition, we propose the following procedure for testing H* against U ~= l[Xi > X*]. We accept or reject H * according as
l1

1 1 + " " +/~+1 Xcv" where

(3.25)

11+ . .-

l, =l=lc+l < CT.IH*]=(I-,x).

(3.26)

W h e n H * is rejected, the hypothesis 2~ =f,* is accepted or rejected according as - U>c7~. 1 1 + . . - +lc+ 1 (3.27)

In the above procedure, we m a y replace lc+ 1 with (l~+ 1 + " " " + l r - 1 ) / ( r - c 1) and apply the test. Next, consider the problem of testing the hypothesis ( r - c - 1 ) ( X l + " " +Xc)=C(Xc+l + " " + X r - l ) against the alternative that ( r - c - 1 ) ( X 1
-

P. R. K r i s h n a i a h a n d M . G. Y o c h m o w i t z

985

"~" " " " -[- ~kc) > C(~kc + I -[- ' " " "+" ~kr--

1)" In this case, the hypothesis is accepted if

l~+--.

+l~ (3.28)

1+1 -al- 4- lr_ 1 < CSt


and rejected otherwise where

/c+l +

l,+-..+t, ,<~oln,]=(l_~). +lr-I

(3.29)

Here, we note that the hypothesis ( r - - c - - 1 ) ( X l + . - - + X ~ ) = c ( X ~ + l + " ' " +Xr-1) is equivalent to the hypothesis that 0~ . . . . . 0~=0. Next, consider the p r o b l e m of testing the hypothesis H(,) where H ~ ) : X~ =X~+ l . . . . . X~-X~+ I. We c a n express H(~) as [-)i=lH(a)i where H(a)i. ( r - a ) A i = ( X a + " " + 2 ~ - 0 - Also, let A~a)i:(r-a)Xi>(Xa+'"-]-~kr_l). Then, the hypothesis H ~ ) is a c c e p t e d if

lo
[a -~ " " " dr- lr - I < c9a

(3.30)

and rejected otherwise where

e lo+- l. - +lr-i ~ c9~ln*(a)l =(1-a).

(3.31)

But the distribution of l a / ( l a + . . . + lr_ 0 involves 01 ..... 0, 1 as nuisance p a r a m e t e r s even w h e n H~,~ is true. So, the a b o v e test c a n n o t be applied unless b o u n d s (free f r o m nuisance parameters) are o b t a i n e d on the distribution of the a b o v e test statistics. Here, we note that the hypothesis H~)is equivalent to the hypothesis that 0a . . . . . 0c = 0, a n d A(*)i is equivalent to the hypothesis that 0,2 > (0if+ - - - + 02)/(r - a). Procedures similar to the above can be p r o p o s e d for testing H ~ ) against alternatives L,J ~=~[Xi >Xc+ l] a n d U ~=a[~ki>~i+l]. Next, consider the p r o b l e m of testing the hypothesis Hoab(r--a--b) (h a + . . . + Xc) = (c - a + l)(Xa+ b + ' ' " + Xr- 1) against the alternative that (r-a-b)(X~+... + X c ) > ( c - a + l ) ( X ~ + b + . . . +Xr_t). In this case we accept or reject the null hypothesis according as

(lo+.-. + 0
([a+b dr-''"

+/r-- l)

clo ~

(3.32)

986

Inference on the structure o f interaction in two-way classification model

where

P la+b +

t.+... +Ic

+ l r - , <C10~1Hab = ( 1 - c O .

(3.33)

The distribution of the test statistics in (3.32) involves nuisance parameters even when H0a b is true and so b o u n d s free f r o m nuisance parameters should be obtained to apply this procedure. Here we note that H0a o is equivalent to the hypothesis that

(r-a-b)~ Oi2=(c-a+l) ~ Oi 2,
i=a i=a+b

that is
a+b-I

Oi2(r-b-c--l)+(r-a--b)
i=a+b

~
i=a

0i2=0

and so 01... = 0 C=0. We n o w will discuss the likelihood ratio test statistics for testing the hypotheses 0i = 0 a n d observe the relationship of these procedures with the procedures discussed above. Corsten and van Eijnsbergen (1972) derived the likelihood ratio test statistics for testing the hypothesis that H : O 1 ..... 0c = 0 . The test procedure in this case is to accept or reject H according as L I ~ Cll . where ell a is chosen such that (3.34)

e [ Ll <<.c'll~lH] = ( 1 - ve)

(3.35)

where L 1=(1 l + . . . + lc)/(1 l + l2 + . . . + lr_O. W h e n c = 1, J o h n s o n a n d Graybill (1972) derived the likelihood ratio test independently. The distribution of L~ for c > 1 is not k n o w n but a p r o g r a m is available to c o m p u t e the percentage points of L l, by using M o n t e Carlo methods. Here we note that the likelihood ratio test statistic described above is equivalent to the test statistic for testing the hypothesis that -r--1 c ( r - c - 1 ) ~ ci = l h i c(Y.i=+l)~i) against the alternative ( r - c - 1)Y.i=l~ i> c]~-)+ ~)ti. W h e n c = l, the likelihood ratio statistic L 1, is equivalent to the

P. R. Krishnaiah and M. G. Yochmowitz

987

test statistic given in (3.22) for testing H* against )kl>X. Yochmowitz (1974a, b) and Yochmowitz and Cornell (1978) discussed the likelihood ratio statistic for testing the hypothesis 0a.= 0 against the alternative 0j=/:0 and 0j+l= 0. The test procedure in this case is to accept or reject the null hypothesis according as Tj~ cl2~ where p[ Tj ~<c,2=10j= 0 ] = ( 1 - ~) and (3.37) (3.36)

+ tr_,).

(3.38)

But the distribution of Tj even in the null case involves 01..... 0a._I as nuisance parameters. When c = 2 , Hegemann and Johnson (1976) have independently discussed the likelihood ratio test for 02=0. Krishnaiah (1978) discussed the likelihood ratio test for 0 j = 0 against the alternative that 0j.v~0,0j+, v~0..... 0j+,~0,0i+~+ , =0. Yochmowitz and Cornell (1978) discussed a step-wise procedure to test 0fs by making use of the distribution of l l / ( l I + . . . +lr_ 0 considered by Schuurmann, Krishnaiah and Chattopadhyay (1973). At the first stage, the hypothesis 01 = 0 is accepted or rejected according as T 1X c13,~ where (3.39)

r [ T 1 < Cl3~10, =01 = (1 - c0.

(3.40)

If the hypothesis of 01 = 0 is accepted and Tj was defined by (3.38), we do not proceed further. If 0~ = 0 is rejected, we proceed further and accept or reject 02-- 0 according as
T 2 ~ Cl4,a

(3.41)

where

PI Tz < c,4,~102= 0] = (1 - o~).

(3.42)

If the hypothesis of 02 = 0 is accepted, we do not proceed further. Otherwise, we proceed and test the hypothesis of 03=0 by using T3 as test

988

Inference on the structure of interaction in two-way classification model

statistic. This procedure is continued until 0j = 0 is accepted for any j or Oc=0 is rejected. At the first stage, the test can be implemented since the null distribution of T~ is free from nuisance parameters. But the distribution of Tj(j = 2 ..... c) involves 01..... 0s-1 as nuisance parameters. As an ad hoc procedure, Yochmowitz and Cornell assumed that the joint distribution of/j > / . . - /> ln_ 1 is approximately equivalent to the joint density of the roots of the central Wishart matrix Wj of order ( r - j ) x ( r - j ) with ( s - 1) degrees of freedom and E(VVJ(s-1))=o2Ir_j. Johnson and Graybill (1972) and Yochmowitz and Cornell (1978) suggested approximations of T~ with central F distribution. Gollob (1968) and Mandel (1969) considered the problem of testing the hypotheses on 0j's. The tests of Gollob were motivated by the assumption that the eigenvalues /j are distributed independently as chi-square variables. But these eigenvalues are neither distributed independently nor as chi-square variables. Mandel (1969) computed 1,/= E(/i) by using Monte Carlo methods. Using these values of 5., he suggested heuristically to examine the magnitude of lj/uj62 to determine as to which of the 0j's are significant; here 62=(lc+1+ ".- + l r 1)/(Pc+l't-''' "~-Pr--l)" But Mandel did not consider the evaluation of the distribution o f / j / ~ 2 . For discussions on tests for the structure of interaction term in two-way classification with replications, the reader is referred to Gollob (1968) and Krishnaiah (1979).

4.

Tests for the main effects

In this section, we discuss the problem of testing the main effects in presence of interaction. Let Hol denote the hypothesis of no block effect and let //o2 denote the hypothesis of no treatment effect. The sum of squares associated with variation between blocks is given by s 2 where $2 = s~r= I(Yi-__~..)2. Similarly, the sum of squares associated with variation ~ - x2 between treatments is denoted by s 2 where s j = r ~ - ~ sj= 1tY.j-Y..) We know that

E(s~/(r-

1))=o 2+( ~

c~Z/r -

1),

(4.1)
(4.2)

E ( s ~ / ( s - 1 ) ) = a z + ( E flj?ls- 1).
Now let,
s2

Fm

6 2 ( r - 1)'
s4

(4.3) (4.4)

F02 - - -

(s- 1)~2

P. R. Krishnaiahand M. G. Yoehmowitz

989

where 62 is an estimate of o z. W e m a y divide the d a t a into two sets and use one set to estimate o 2 and the other set to test H0~ and H02. A n o t h e r possibility is to use s o m e previous set of data to estimate 02 . Of course, we can use the m a x i m u m likelihood estimate of 02 . Also the m a x i m u m likelihood estimate of o 2 is k n o w n (e.g., see J o h n s o n and Graybill (1972)) to be (lc+l+ . . . + l~_ 1)/rs. If we are testing Hoi individually, we accept or reject Hoi according as F0i N where

F~

(4.5)

P[ Fo, < F,o I Ho,] =


and

(4.6)

F~- ( r - 1)62,
r2 = ( s - 1)62.

(4.7)

(4.8)

W h e n the interaction is present, the distribution of (/c+l + + I,_ 1) is not only complicated but also involves nuisance parameters. If we are testing Ho~ and Ho2 simultaneously, we accept or reject Hoi according as

l~oiN F~
where

(4.9)

P[Foi <.F~,; i= 1,21Ho, Cl Ho2 ] = ( 1 - o 0 -

(4.10)

The critical values F~ can be o b t a i n e d by using M o n t e Carlo methods. T h e statistics F01 and Fo2 are the likelihood ratio statistics (see Y o c h m o w i t z (1974b)) for testing Hol a n d H02 respectively, if 62 is the m a x i m u m likelihood estimate of o 2. W h e n c = 1, this was p o i n t e d out in J o h n s o n and Graybill (1972). Next, let

Fl=(S__l)s2/S2e,

F 2 = ( r _ 1)s41s e2 2

where Se 2 was defined b y (2.5). T h e statistics F I a n d F 2 have been used extensively to test the hypotheses of no block effect and no treatment effect respectively, u n d e r two-way classification additive m o d e l with one observation per cell. But if the true m o d e l is (2.1), then the statistics F l a n d

990

Inference on the structure of interaction in two-way classification model

k2 are distributed as doubly noncentral F distribution with nuisance parameters even in the null cases. So, the usual F tests are no longer valid. Approximations to doubly noncentral F distribution were discussed in Mudholkar, Chaubey and Lin (1976).

50 Illustrative examples
In this section, we illustrate the methods described before with real data sets. Table 4 gives data from an experiment I involving the effects of doses A, B, C, D of benactyzine upon the performance of trained rhesus monkeys where A =0.54 m g / k g , B = 0 . 1 7 m g / k g , C = 0 . 0 5 4 m g / k g and D = 1.7 m g / k g . The subjects were trained to control the position of a primate equilibrium platform (see Yochmowitz, Patrick, Jaeger and Barnes (1977a)) and to press fire and alert buttons on an instrument panel upon their illumination. The platform was perturbed by a random signal and the alert light was triggered at random. The alert light caused one of four fire buttons to light at random. Data were collected at three minute intervals and included the adjusted RMS (i.e., the root mean square position of the platform adjusted about its mean position (see Yochmowitz, Patrick, Jaeger and Barnes (1977b) and the reaction times necessary to extinguish the alert and fire lights. Animal training costs prevented extensive testing and the experiment was limited to 4 subjects. The treatments were administered in the counter-balanced design shown in Table 5. Trials were preceded by a diluent run which served as a standard against which succeeding treatments were compared. For a detailed description of the experiment, the reader is referred to Farrer et al (1979). Z-scores were computed for each variable as follows:

X is the mean 3 minute score over a 30 minute test period. Yp is the corresponding predicted level of performance from a linear least squares fit to the preceding diluent run and s is the root mean square error from the linear fit. Z-scores less than - 3 represent unusally good performance relative to the preceding diluent run. Conversely, z-scores in excess of 3 represent unusually poor performance relative to the preceding diluent run. IThe animals involved in this study were procured, maintained, and used in accordance with the Animal Welfare Act of 1970 and the "Guide for the Care and Use of Laboratory Animals" prepared by the Institute of Laboratory Animal Resources--National Research Council.

P. R. Krishnaiah and M. G. Yochmowitz

991

Table 4 Mean adjusted RMS Z-scores Subject 1 2 3 4 1 A 7.26 B -0.61 C 0.65 D 1.99 Trial 2 B 0.27 C -0.24 D 3.83 A 0.02 C D A B 3 --0.80 -0.55 -0.75 -0.63 4 D 1.91 A -1.29 B 1.3 C-0.8

Table 5 Subject
1

1
A

Trial 2
B

3
C

4
D

2 3
4

B C
D

C D
A

D A
B

A B
C

Table 6 Source Subjects Trials Doses Residual s.s. 18.539 19.156 11.769 24.002 d.f. 3 3 3 6 m.s 6.180 6.385 3.923 4.000

T h e A N O V A table for the d a t a in T a b l e 4 is given in T a b l e 6. T h e su m of squares due to n o n - a d d i t i v i t y is 19.76. T h e test statistic associated with T u k e y ' s test for n o n - a d d i t i v i t y is 23.35. T h e critical value f r o m F tables with (1,5) degrees of f r e e d o m at 5% level is 6.61o So, we reject the hypothesis of additivity. In o t h er studies (see Boster (1978)), b i o c h e m i c a l m e a s u r e m e n t s 2 are taken on male a n d f e m a l e rhesus m o n k e y s in a long term ch r o n i c study. C h o l e s t e r o l m e a s u r e m e n t s in m i l l i g r a m s per deciliter ( m g / d l ) on 19 males serving as controls are p r o v i d e d in T a b l e 7. 771, 772 a n d 773 respectively represent the first, s e c o n d a n d third test periods in 1977. Similarly, 7 8 1 , 7 8 2 a n d 783 are the first, s e c o n d a n d third test p er i o d s in 1978. W e assume the m o d e l (2.1) with in te r a ct i o n t e r m given by (3.1). W e assume that y~j represents the o b s e r v a t i o n m a d e on j - t h subject (male 2The animals involved in this study were procured, maintained, and used in accordance with the Animal Welfare Act of 1970 and the "Guide for the Care and Use of Laboratory Animals" prepared by the Institute of Laboratory Animal Resources--National Research Council.

992

Inference on the structure of interaction in two-way classification model

Table 7 Cholesterol (mg/dl) of male monkeys Subjects 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 771 125 122 116 111 120 127 135 130 170 132 121 108 134 105 143 110 119 124 107 772 Test periods 773 106 93 89 73 104 139 142 127 125 132 104 108 112 141 114 99 105 118 98 781 107 97 118 101 116 109 98 124 173 117 107 112 107 108 118 102 123 103 77 782 130 126 129 130 124 138 119 132 160 136 94 116 113 135 153 100 121 102 llO 783 158 126 130 148 173 164 148 149 196 158 120 132 148 143 145 117 149 127 125

105 106 84 149 88 231 94 103 120 105 149 76 75 128 119 86 91 98 99

m o n k e y ) at i-th t i m e p e r i o d . I n the n o t a t i o n of t h e m o d e l (2.1), w e h a v e r = 6 a n d s = 19. W e also a s s u m e t h a t c = 1. T h e n o n - z e r o e i g e n v a l u e s of D D ' in this case a r e 11= 19519.2, / 2 = 5 2 6 3 . 3 8 , 13=2184.8 , 1 4 = 1 6 6 7 . 7 a n d 15 = 1255.7. I n this case, w e h a v e l l / t r D D ' = 0.653. W e a p p l y t h e p r o c e d u r e g i v e n b y (3.10)-(3.12) to test 01 = 0 . U p p e r 5% p o i n t of the d i s t r i b u t i o n of l l / t r D D ' is g i v e n b y t h e e n t r y c o r r e s p o n d i n g to a = 0 . 0 5 , j = 1, p = 5 , r = 6 in T a b l e 19 of C h a p t e r 24 in this v o l u m e ; this p e r c e n t a g e p o i n t is 0.4531. B u t l l / t r D D ' c a l c u l a t e d f r o m t h e d a t a is g r e a t e r t h a n 0.4531 a n d so t h e h y p o t h e s i s 01 = 0 is r e j e c t e d . H e r e 01 = 0 is t h e h y p o t h e s i s of n o i n t e r a c t i o n b e t w e e n subjects a n d t i m e p e r i o d s .

References Anscombe, F. and Tukey, J. (1963). The examination and analysis of residuals. Technometrics 5, 141-160. Boster, R. (1978). Attachments to 1978 semi-annual progress report to NASA. Radiation Sciences Division, Brooks AFB, Texas. Corsten, L. C. A. and van Eijnsbergen, A. C. (1972). Multiplicative effects in two-way and analysis of variance. Statistica Neerlandica 26, 61-68. Davis, A. W. (1972). On the ratios of the individual latent roots to the trace of a Wishart matrix. J. Multivariate Anal., 2, 440-443.

P. R. Krishnaiah and 34. G. Yochmowitz

993

Eckart, C. and Young, G. (1936). The approximation of one matrix by another of lower rank. Psychometrika 1, 211-218. Elston, R. C. (1961). On additivity in the analysis of variances. Biometrics 17, 209-219. Farrer, D., Yochmowitz, M., Mattsson, J., Barnes, D., Lof, N., Bachman, J. and Bennett, C. (1979). Behavioral effects of benactyzine on equilibrium maintenance and a multiple response task. USAF School of Aerospace Medicine, SAM-TR-79-19, Brooks AFB, Texas. Fisher, R. A. and Mackenzie, W. A. (1923). Studies in crop variation. II. The manurial response of different potato varieties. J. Agric. Sci., 13, 311. Ghosh, M. N. and Sharma, D. (1963). Power of Tukey's test for non-additivityo J. Roy. Statist. Soc. Ser. B., 25, 213-219. Gollob, H. F. (1968). A statistical model which combines features of factor analytic and analysis of variance techniques. Psychometrika 33, 73-116. Hegemarm, V. and Johnson, D. E. (1976a). On analyzing two-way AoV data with interaction. Technometrics 18, 273-281. Hegemann, V. and Johnson, D. E. (1976b). The power of two tests for non-additivity. J. Amer. Statist. Assoc., 71, 945-948. Johnson, D. E. and Graybill, F. A. (1972a). An analysis of a two-way model with interaction and no replication. J. Amer. Statist. Assoc., 67, 862-868. Johnson, D. E. and Graybill, F. A. (1972b). Estimation of o 2 in a two-way classification model with interaction. J. Amer. Statist. Assoc., 67, 388-394. Krishnaiah, P. R. and Waikar, V. B. (1971). Simultaneous tests for equality of latent roots against certain alternatives--I. Ann. Inst. Statist. Math., 23, 451-468. Krishnaiah, P. R. and Waikar, V. B. (1972). Simultaneous tests for equality of latent roots against certain alternatives--lI. Ann. Inst. Statist. Math., 24, 81-85. Krishnaiah, P. R. and Schuurmann, F. J. (1974). On the evaluation of some distributions that arise in simultaneous tests for the equality of the latent roots of the covariance matrix. J. Multivariate Anal., 4, 265-282. Krishnaiah, P. R. (1978). Some recent developments on real multivariate distributions. In: P. R. Krishnaiah, ed., Developments in Statistics, Vol. 1. Academic Press, Inc. Krishnaiah, P. R. (1979). Some developments on simultaneous test procedures, in: P. R. Krishnaiah, ed., Developments. in Statistics, Vol. 2. Academic Press, Inc. Krishnaiah, P. R. (1980). Computations of some multivariate distributions. In: P. R. Krishnaiah, ed., Handbook of Statistics, Vol. 1 North-Holland Publishing Company. Mandel, J. (1961). Non-additivity in two-way analysis of variance. J. Amer. Statist. Assoc., 56, 878-888. Mandel, J. (1969). Partitioning the interaction in analysis of variance. J. Res. Nat. Bur. Standards Sect. B., 73B, 309-328. Mandel, J. (1971). A new analysis of variance for non-additive data. Technometrics 13, 1-18. Milliken, G. A. and Graybill, F. A. (1970). Extensions of the general linear hypothesis model. J. Amer. Statist. Assoc., 65, 797-807. Moore, P. and Tukey, J. (1954). Answer to query 112. Biometrics 10, 562-568. Mudholkar, G. S., Chanbey, Y. P. and Lin, C. C. (1976). Approximations for the doubly noncentral F distribution. Comm. Statist. Theory Methods A 5, 49-63. Osborne, N. S., McKelvy, E. C. and Bearce, H. W. (1913). Density and thermal expansion of ethyl alcohol and its mixtures with water. Bulletin of the Bureau of Standards 9, 327-474. Scheff~, H. The Analysis of Variance. (Sixth Printing) Wiley; New York. Schuurmann, F. J., Krishnaiah, P. R. and Chattopadhyay, A. K. (1973a). 
Tables for the distributions of the ratios of the extreme roots of the trace of Wishart matrix. Aerospace Research Laboratories, TR No. 73-0010, Wright-Patterson AFB, Ohio. Schuurmann, F. J., Krishnaiah, P. R. and Chattopadhyay, A. K. (1973b). On the distributions of the ratios of the extreme roots to the trace of the Wishart matrix. J. Multivariate Anal., 3, 445-453.

994

Inference on the structure of interaction in two-way classification model

Slater, P. B. (1973). Spatial and temporal effects in residential sales prices. J. Amer. Statist. Assoc., 68, 554-561. Snee, R. D. (1972). On the analysis of response curve data. Technometrics 14, 47-62. Snee, R. D., Acuff, S. F. and Gibson, J. R. (t979). A useful method for the analysis of growth studies. Biometrics 35, 835-848. Tukey, J. W. (1949). One degree of freedom for non-additivity. Biometrics 5, 232-242. Tukey, J. W. (1955). Answer to Query 113. Biometrics 11, 111-113. Tukey, J. W. (1962). The future of data analysis. Ann. Math. Statist., 33, 1-67. Ward, G. C. and Dick, I. D. (1952). Non-additivity in randomized block designs and balanced incomplete block designs. New Zealand Journal of Science and Technology 33, 430-436. Williams, E. J. (1952). The interpretation of interactions in factorial experiments. Biometrika 39, 65-81. Yochmowitz, M. G. (1974a). A note on partitioning the interaction in a two-way model with no replication. (Abstract). I M S Bulletin 80. Yochmowitz, M. G. (1974b). Testing for multiplicative components of interaction in some fixed models. Ph.D. dissertation. University of Michigan. Yochmowitz, M. G., Patrick, R., Jaeger, R. and Barnes, D. (1977a). Protracted radiationstressed primate performance. Aviat. Space Environ. Med. 48(7): 598-606. Yochmowitz, M. G., Patrick, R., Jaeger, R. and Barnes, D. (1977b) New metrics for the primate equilibrium platform. Perceptual and Motor Skills 45, 227-234. Yochmowitz, M. G. and Cornell, R. G. (1978). Stepwise tests for multiplicative components of interaction. Technometrics 20, 79-84.

Subject Index

Actual level of significance, 203 of the /'-*-test, 206,207,212,213,214,215 of the T2*-test, 225 of the T~*-test, 222, 223 of the V*-test, 232 Actual power of the F*-test, 209,211,212,213,214 the T2*-test, 225 the T02*-test, 222, 223 Adaptive inference, 412 Admissibility, 427 Allowances for various types of error rates, 619 Always-pool test, 417 Anderson-Darling statistic, 295,296 s e e also Power studies ANOPOW, 249, 251,256, 259 Approximate degrees of freedom solution, 232 Arbitrary covariance model, 91,93, 96 ARMAX, 263 Assumptions of normality and homoscedasticity, 199 Asymptotic optimality, 607 Asymptotic series solution, 234 Autocorrelation, 269 Autocovariance, 242

(minimum norm least squares solution), 471,481 Best linear m i n i m u m bias estimator (BLIMBE), 505 Best linear unbiased estimator (BLUE), 490 Beta distribution, 218, 219, 230, 231 Bias, 423,433 Bivariate gamma distribution, 227 Bivariate uniform distribution, 227 Block, randomized, 270 BLUE, 75 Bonferroni inequality, 618,619 Bonferroni procedure, 685, 689 Box and Cox transformations s e e Shifted power transformation Box's approximation, 521,527 Brunn-Minkowski inequality, 183

Bayes

homogeneous

linear

estimator

(BHLE), 501
Bayesian estimation, 91,92, t00-102, 106109 Bayesian prediction, 100-106, 109 Bayes linear estimator (BLE), 501 Bayes--MANOVA, 117 Regression, 118 Best approximate solution

Canonical correlations, 576, 582, 584 Canonical variates, 571-573, 575-578, 581 Changeover design, 63 Characterization of normal distribution, 463,464 Poisson, gamma distributions, 463 statistics under Gauss-Markoff model, 465 Charts for largest root of multivariate beta matrix, 915 Chi-squared goodness of fit, 293,294 s e e also Power studies Chi-square distribution, 108, 109 Chords on a circle, 627 Classical es(imation, 92-100 Coherence, 247, 251,260 Combination of test statistics, 300,301 Combining predictors, 110 Comparative power studies, 204

996

Subject l n d e x

Comparisons internal, 133, 134, 140-142 Component, principal, 240 Components of statistic, 295 Compound symmetry, 514, 534, 564 Computation of g-inverse, numerical exampies, 475 Computer programs, 647-650, 661-670, 703 - 741 Conditionally specified inference, 412-416 Conditional tests, 383 Confidence intervals, 45 multivariate regression, 404 regression model, 396 Confidence versus significance, 620 Conservative versus non-conservative tests, 621 Consistency, 602 Constrained inverse, 484 Contrast single-degree-of-freedom, 133,139, 153 Contrast vectors measures of size of, 153,154,158 single-degree-of-freedom, 153-159, 161 Cook-Douglas model, 595 Correlated quadratic forms, 758,760, 762 Covariance components, 1 models, 4 Covariate, 239 Cox and Small's tests, 315,316 Cramer-von Mises statistic, 295,296 multivariate generalization, 312 see also Power studies Critical values for Duncan's new multiple range test, tables, 621 Crossover design, 63 Crossover study, 624 Cross-spectrum, 243 Cumulant, 201 Cumulants, test of normality, 280,281 Curvature, maximum, 315 Curve, growth, 237,239,242 Curve, response, 237 Curve, survival, 237 CV rank trace, 582-585 D'Agostino's D, 287,288, 300, 301,305 see also Power studies Data analysis Fisher's Iris data, 585-587

head measurement data, 578 580,582, 583 Jarvick smoking questionnaire data, 587 Los Angeles heart study, 585 temperature records, 584 Data, longitudinal, 237 Data, serial, 237 Degrees of freedom modifications, 218 Design, change-over, 242 Design, cyclic rotation, 242 Design, experimental, 266 Design, systematic, 242, 270 Determinantal equations, 44 Diagnostic groups, 650-652 Dimensionality of regression, 573, 581-582 Directional tests for normality, 306, 307, 313 Dispersion matrices, 134, 163 measures of size of, 163, 164, 167 Distribution, randomization, 261,270 Domain, frequency, 242 Doubly noncentral F, 990 Duality theorem, 478 Duncan's multiple F test, 620 Duncan's ranked difference test, 619 Edgeworth series, 206,207,212,213 Effect, fixed, 238,242, 244, 248 Effect of heteroscedasticity on MANOVA tests, 220 the F-test, 211,218 the R-test, 227 the T-test, 227 the T0Z-test, 223 the W-test, 227 Effect of nonnormality on MANOVA tests, 220 the F-test,. 205,209,215 Effect, random, 238, 242, 250, 257 Eigenvalues, 163-167 Empirical cumulative distribution function, 135,136 Q-Q plot, 135,136 Entropy maximization, 298 Equality of covariance matrices, 513,526, 528,531,553,555 Equality of means, variances and covariances, 533,560, 562 Equality of mean vectors, 513, 514, 518, 531 Error rate, Type I, experimentwise, 619, 621

Subject Index

997

Error rate, Type I, per comparison, 618,619, 621 Error rate, Type I, per experiment, 619,621 Error rate, Type II, 619 Estimability, 489,497 Estimable linear functional, 489 Estimation of location parameters under mixed effect model, 465,466 Experiment, field, 237, 240, 242, 270 Exponential distribution, 215, 218 bivariate, 227 bivariate double, 227 Factor analytic model, 96 Factorial design structures, 57 FANOVA model, 975,977 F-distribution, 90-91,94,96-98,101-102, 105 Finite intersection tests, 45 Fisher-Cochran Theorem, 244,247,249,272 Fixed model, 426-433 preliminary test estimator of error variance, 432,433 F-ratio, 112, 113 F-test, 201 F*-test, 206, 232 actual distribution of, 207 density function of, 206 distribution of, 211 Full rank regression, 572, 578, 581 Gamma distribution, 588 Gamma function--q-dimensional, 122 Gamma probability plot, 587-590 Gap test, 302,303 Gauss-Hermite quadrature formula, 746, 748 Gaussian formula, 601 Gaussian matrix, 765, 766 G a u s s - L a g u e r r e q u a d r a t u r e formula, 751,756, 758 Gauss-Markoff, 42 Gauss-Markoff model addition and removal of observations, 485,486 Bayes homogeneous linear estimator (BHLE), 501 Bayes linear estimator (BLE), 501 best linear minimum bias estimator (BLIMBE), 505 best linear unbiased estimator (BLUE), 490

  estimability, 489, 497
  Hoerl-Kennard estimator, 506
  identifiability, 497
  inverse partitioned matrix method, 495
  James-Stein estimator, 507
  minimax homogeneous linear estimator (MIHLE), 502
  minimax linear estimator (MILE), 503
  minimum bias estimator, 505
  ordered alternatives, tests for, 499
  shrunken least squares estimator, 504, 506
  singular covariance matrix, 491, 495
  tests of linear hypothesis, 498
  validity of SLE for deviations in dispersion matrix, 508
Gauss quadrature formula, 748
General linear model, 41
General MANOVA, 193
g-inverse (generalized inverse), 576
  computation of, 475
  constrained inverse, 484
  definition, 472
  duality theorem, 478
  least squares, 477
  minimum norm, 476
  minimum norm least squares, 481
  minimum seminorm, 476
  minimum seminorm semileast squares inverse, 481
  Moore-Penrose, 471, 481
  optimal inverse, 482
  partitioned matrix, 485
  reflexive, 476
  semileast squares inverse, 477
Goodness of fit tests for normality, 293-298
  chi-squared, 293, 294
  empirical distribution function, 294-297
  see also Power studies
Graphical techniques, 133, 303, 304, 313, 314
Grenander's condition, 267
Growth curve analysis, 74
Growth curves, 594
Hadamard product, 59
Hartley's sequential F test, 620-621
Helmert's transformation, 282
Heteroscedasticity, 133, 143, 205
Hierarchical classification, 408-410
Hierarchical model, 107
Homoscedasticity, 143, 199
Homoscedasticity, multivariate, 118
Hotelling's T², 514, 517, 745
Hyperparameters, 124
Incompletely specified models, 412
Inference based on conditional specification, 407-441
  bibliography, 416
Infinitely divisible, 753
Interaction, 59
Intersection of vector spaces, 487
Intraclass correlation, 7
Invariance, 1, 10, 15
Inverse partitioned matrix method, 495
Inverted Wishart distribution, 123
I-sample repeated measurements, 50
Johnson family of curves, 285, 291
Kolmogorov-Smirnov statistic, 294-298
  multivariate generalization, 312
  see also Power studies
Kuiper's test, 295, 296
  see also Power studies
Kurtosis, see Skewness and Kurtosis
Laguerre polynomials, 456
Laplace-type family, 292
Latent roots, 574-575, 578, 581, 587, 588
Latent vectors, 574-575
Latin square, 974
Lawley-Hotelling criterion, 44, 203
Least squares
  addition/removal of observation, 485, 486
  estimator and BLUE, Euclidean norm of difference between, 509
  g-inverse, 477
Leptokurtic, 210, 225
Likelihood ratio tests, 513
Lindeberg condition, 604
Linearization, 598
Log-linear representation, 346
Log-normal distribution, 215, 218
  bivariate, 227
MANOCOVA models
  one way layouts, 692
  two way layouts, 694
MANOPOW, 254
Marginal likelihood
  multivariate regression, 403
  regression model, 395
Marginal maximum likelihood, 2, 34, 35
Matrix, 444
  eigenvalues, 445
  g-inverse, 444
  (Hermitian) symmetric, 444
  Moore-Penrose inverse, 444
  spectral decomposition, 444, 447
  transpose, conjugate transpose, 444
Matrix, design, 238
Matrix T-distribution, 122
Maximum likelihood, 2, 32, 33, 588
Maximum likelihood estimates, 157, 165, 173, 608
Mean deviation, 282, 292, 293
  moments under normality, 283
Mean square error (MSE), 423, 433
Measurements, repeated, 237
Methods of estimation, 349
Minimax homogeneous linear estimator (MIHLE), 502
Minimax linear estimator (MILE), 503
Minimum bias estimator, 505
Minimum norm g-inverse, 476
Minimum norm least-squares g-inverse, 481
Minimum seminorm g-inverse, 476
Minimum seminorm semileast squares inverse, 481
MINQE, 3, 16, 24
  infinity MINQE, 23
  non-negative definite, 28
  partially unbiased, 25
  unbiased, 21
  unbiased invariant, 19
MINQUE, 443
Mixed model, 422
Model assumptions, 414
Model discrimination, 293
Model, state space, 262
Modified ANOVA F test, 219
Modified test procedures, 218, 229
Monotonicity, 179, 181, 188
Monotonicity of power functions, 182, 189, 190, 191, 192
Monte Carlo studies of the robustness of
  the F-test, 215
  the R-test, 227, 228
  the S-test, 228
  the T-test, 227, 228
  the T²-test, 225
  the T̃²-test, 234
  the U-test, 228
  the V-test, 228
  the W-test, 227, 228
Moore-Penrose inverse, 471, 481
Multiple comparisons, 133, 431, 617, 618
Multiple comparisons of covariance matrices, 632, 655-660
Multiple comparisons of means
  Duncan's new multiple range test, 620, 621
  Dunnett's test, 620, 631, 638
  Fisher's test, 618, 619, 621
  Krishnaiah's finite intersection tests, 631, 636, 637
  Scheffé's test, 620, 621, 631, 635, 637, 638
  Tukey's test, 619, 621, 631, 638
Multiple comparisons of mean vectors
  Krishnaiah's finite intersection tests, 45, 204
  Roy's largest root test, 631, 638, 639, 647, 648
  J. Roy's step-down procedure, 631, 641, 642, 644, 649
  Tₘₐₓ² test, 631, 638, 647, 650-652
Multiple comparisons of variances, 632, 653-655
Multiple independence, 513, 519, 548
Multi-response repeated measurements, 69
Multivariate Behrens-Fisher problem, 232
Multivariate beta matrix, 514, 540, 764, 765
  charts for the largest root, 915
  individual roots, 772, 777, 785, 787, 903, 927
  joint distribution of extreme roots, 786, 896
  ratios of the roots, 779, 786, 787, 940
  trace, 204, 229, 779, 788, 945, 953
Multivariate chi-distribution, 744, 753, 760
Multivariate chi-square distributions, 632, 745, 753, 783, 827, 832
Multivariate F distribution, 632, 745, 746, 753, 755, 784, 840, 850
Multivariate F matrix, 764, 765
  trace, 779, 788
Multivariate linear regression model, 571, 572
Multivariate mixed model, 69
Multivariate repeated measurements, 68
Multivariate residuals, 567, 576, 587, 590
Multivariate t distribution, 632, 638, 745, 746, 747, 749, 782, 789
Multivariate Weibull distribution, 753
Nearest distance test, 314
Near-normal distribution, 211
Necessary reduction
  multivariate regression, 400
  regression model, 392
Never-pool test, 417, 420-421, 427, 429, 430
Newman-Keuls test, 618, 619, 621
Newton-Raphson method, 332, 333, 334
Nominal significance level, 206
Nonadditivity, 973
Nonlinear regression, 593
Nonnormal distribution, 215, 219
  population, 206
Nonparametric S-method, 677, 680, 687, 690
Nonparametric T-method, 675, 679
Normal distribution
  bivariate, 227
  multivariate, 201, 225
  standardized, 206
Normality, test of, 279-320
  multivariate, 310-320
  univariate, 279-309
Observations, sequential, 237
Observed power, 215
Observed significance level, 215, 228
Omnibus tests of normality, 284-286, 303
  see also Power studies
One-sample repeated measurements, 46
Optimal inverse, 482
Order statistics, 136, 156, 165, 167, 173, 588, 589
Outliers, 133, 144
Pair-wise comparisons, 623
Parallel profiles, 89, 91
Parallel sum of matrices, 488
Patnaik's approximation, 422, 428, 431, 432
PC rank trace, 586, 587
Pearson type curves, 285, 291, 321-325, 327-330, 514, 516
Permutation
  approach, 218, 219, 229
  distribution, 219, 230, 231
  moment, 219, 230
  test, 111-113
Plan, student sandwich, 220
Platykurtic, 210
Polynomial approximation, 599
Pooling error, 405
Power gain, 421, 430
Power of test, 619, 621
  fixed model, 429-430
  mixed model, 422
  random model, 418-421
Power studies
  tests of multivariate normality, 316-317
  tests of univariate normality, 304-309
Preliminary test, 407-439
  two or more, 422, 425-426
Pretesting, 412
Prewhiten, 243
Principal components, 571-573, 575, 577, 582
Prior distribution, 120
  non-informative, 120
  informative, 124
  natural conjugate, 124
Probability plots, 303, 304
Probability plots/plotting, 134, 167
  Q-Q, chi-squared, 138, 142, 149, 151, 152
  definition of, 135
  gamma, 138, 142, 149, 155-162, 165, 166, 171
  half-normal, 138, 141-149
  normal, 136, 141
  replotting, 146
Process, point, 271
Profile analysis, 70
Product, Kronecker, 238
Projection operators under seminorms, 479
Protected LSD test, 618
Protection levels based on degrees of freedom, 620
Quadratic form, 443, 587-589
  moment generating function, 447, 454
  exact distribution, 454-459
  asymptotic distribution, 459-462
  Cochran's theorem and extension, 452, 453
Quantiles, 135-139, 141, 142, 145-152, 155, 157, 160, 162, 164-166
  computation of
    chi-squared, 169
    gamma, 170-172
    half-normal, 168, 169
    normal, 168
Random model, 416-426
Random regression coefficients, 6
Range, 284
  see also Power studies
Range distribution, 752, 782, 821
Rank trace, 580-587
Rao's approximation, 513, 515
Reduced-rank regression, 572, 573, 577, 578, 580, 581, 585, 587, 588
Reflexive g-inverse, 476
Regression, 238, 266
Regression coefficient matrix, 572, 573, 575, 577, 581, 587, 588
Regression model, 391, 433-439
  multivariate, 398
  forward selection procedure, 437, 438
  predictant, 434, 435
  preliminary test estimator of regression coefficients, 439
  sequential detection procedure, 436, 437
Relative efficiency, 424
Repeated measurements, 41, 89
Residual covariance matrix, 577, 581, 582, 587
Residual latent roots, 575, 578, 582
Residual matrix, 119
Residuals, 134, 138, 139, 161
Response, average evoked, 259
Response, impulse, 259
Rhesus monkeys, 990, 991
RML (ratio of maximum likelihoods), 291, 292
Robust nonlinear regression, 611
Robustness properties, 199
Roy's criterion, 44, 203
R-test, 204, 228, 286
  see also Power studies
Runge-Kutta method, 322, 335, 337, 339
Sample reuse, 110, 111
Second degree polynomial
  exact distribution, 449
  in normal variables, 446, 448
  moment generating function, 446, 447, 449
  NS conditions for independence, 451, 452
  NS conditions for Wishartness, 450
Sen-Krishnaiah procedures, 693, 695
Sequence effects, 63
Serial correlation model, 96
Series, response, 258
Series, time, 242, 271
Shapiro-Francia's W′, 288, 289
  see also Power studies
Shapiro-Wilk's W, 286-290
  multivariate generalization, 311, 312
  Puri and Rao generalization, 289, 290
  see also Power studies
Shifted power transformation, 290, 291
  multivariate generalization, 312, 313
Signal, 244, 248
Signal, deterministic, 244, 252
Signal, stochastic, 246
Signal, transient, 252
Significance level, recommendations
  fixed model, 429, 430
  mixed model, 422
  random model, 417-421
  regression model, 438
  variance components, 426, 433
Simple structure model, 95, 98
Simpson's rule, 326
Simultaneous confidence regions, 676, 683, 684, 686, 688, 697, 698
Simultaneous tests, 631-671, 675, 676, 677, 678, 679, 680, 681, 683, 684, 685, 686, 687, 689, 690, 691, 694, 696, 980, 982
Size curve, 420, 421
Skewed distribution, 227
Skewness, 201, 208, 210, 218
Skewness and Kurtosis, 201, 210, 218, 225, 280, 281, 293, 310, 311
  basis for omnibus tests, 284, 285
  contours in (√b₁, b₂) plane, 285, 286
  distribution under normality, 281, 282
  multivariate generalization, 310, 311
  power, 284
  see also Power studies
Sometimes-pool test, 417, 420, 421, 427, 429, 430
Specification
  conditional, 407-412
  unconditional, 407
Spectrum, power, 243, 244, 248, 250, 251
Sphericity, 513, 523, 551
Square, Latin, 270
Standard cumulant, 207
Statistical model, 389
Statistical rank, 573, 581, 588
S-test, 228
S*-test, 229
Step-down procedure, 45
Structure of covariance matrices, 535
Structure of interaction, 973, 979
Studentized largest chi-square distribution, 749, 751, 782, 801
Studentized range distribution, 618, 619, 621, 752, 782, 822
Studentized smallest chi-square distribution, 749, 752, 782, 811
Student-t, in testing normality, 315
Symmetrical distribution, 225
Testimating, 412, 413
Testipredicting, 412, 413
Testitesting, 412, 413
Tests for additivity, 973-994
Tests for goodness of fit of models, 355
Tests for nested models, 359
Test of parallelism, 50, 51
Test specifying covariance matrix, 525, 529, 552, 556
Test specifying mean vector, 517, 529, 556
Ties and grouping, effect on normality tests, 307
Tolerance regions, 97, 100
Transformations to normality, 321, 336, 339
Transform, Fourier, 239, 243, 248, 252, 263, 271
Transistor data, 337
Translation inequality, 187
Treatment, 248, 250, 251, 259, 260, 262
Treatment vs. control procedure, 678, 681, 688, 690
Trimmed statistics, 301, 302
Tschebyscheff polynomial, 601
Tukey's gap-straggler test, 619
Two-level factorial experiments, 139, 152
Two-way classification, 427, 973
t-test, 202
t*-test, 211
T-test, 204, 228
T² statistic, 42, 202, 220
T²-test, 202, 220
T²*, 220, 225, 227
T²*-test, 225
T₀², 203, 220
T₀²*-test, 222, 225
T̃²*-test, 224, 225, 234
T̃²-test, 232, 234, 255
Unbiasedness, 179
U statistics, 299, 300
U*-test, 229
Unconditional
  distribution, 232
  moment, 219, 231
Uncorrelated of order (r,s), 445, 451
Uniform covariance structure, 49
Uniformly most powerful, invariant, 202
Uniform model, 89
Union intersection procedure, 646, 647
Unit, experimental, 242, 248
Univariate mixed model, 49, 54
Unprotected LSD (multiple t) test, 618, 621
Unweighted tests, 54
Variable, concomitant, 241
Variance components, 1, 422
  estimability, 8
  minimum variance estimation, 12
  models, 4
  preliminary test estimator, 423-426
Variate, complex normal, 243, 253, 255, 272
Variation, 390
V-test, 204, 228
V*-test, 229, 232
Watson's statistic, 295, 296
Weighted tests, 54
Wishart matrix, 639, 764, 765
  complex, 254, 272
  individual roots, 772, 777, 785, 875, 882, 889
  joint distribution of extreme roots, 785, 868
  moment generating function, 445
  noncentral, 445
  ratios of the individual roots to the trace, 936
  ratios of the roots, 777, 779, 787, 938
W-test, 204, 228
W*-test, 227, 229
O-unimodal inequality, 185