JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
Volume 47, Number 260, December 1952

USE OF RANKS IN ONE-CRITERION VARIANCE ANALYSIS
1. INTRODUCTION

1.1. Problem

A common problem in practical statistics is to decide whether several samples should be regarded as coming from the same population. Almost invariably the samples differ, and the question is whether the differences signify differences among the populations, or are merely the chance variations to be expected among random samples from the same population. When this problem arises one may often assume that the populations are of approximately the same form, in the sense that if they differ it is by a shift or translation.
1.2. Usual Solution

The usual technique for attacking such problems is the analysis of variance with a single criterion of classification [46, Chap. 10]. The variation among the sample means, x̄_i, is used to estimate the variation among individuals, on the basis of (i) the assumption that the variation among the means reflects only random sampling from a population in which individuals vary, and (ii) the fact that the variance of the means of random samples of size n_i is σ²/n_i, where σ² is the population variance. This estimate of σ² based on the variation among sample means is then compared with another estimate based only on the varia-
* Based in part on research supported by the Office of Naval Research at the Statistical Research Center, University of Chicago.
For criticisms of a preliminary draft which have led to a number of improvements we are indebted to Maurice H. Belz (University of Melbourne), William G. Cochran (Johns Hopkins University), J. Durbin (London School of Economics), Churchill Eisenhart (Bureau of Standards), Wassily Hoeffding (University of North Carolina), Harold Hotelling (University of North Carolina), Howard L. Jones (Illinois Bell Telephone Company), Erich L. Lehmann (University of California), William G. Madow (University of Illinois), Henry B. Mann (Ohio State University), Alexander M. Mood (The Rand Corporation), Lincoln E. Moses (Stanford University), Frederick Mosteller (Harvard University), David L. Russell (Bowdoin College), I. Richard Savage (Bureau of Standards), Frederick F. Stephan (Princeton University), Alan Stuart (London School of Economics), T. J. Terpstra (Mathematical Center, Amsterdam), John W. Tukey (Princeton University), Frank Wilcoxon (American Cyanamid Company), and C. Ashley Wright (Standard Oil Company of New Jersey), and to our colleagues K. A. Brownlee, Herbert T. David, Milton Friedman, Leo A. Goodman, Ulf Grenander, Joseph L. Hodges, Harry V. Roberts, Murray Rosenblatt, Leonard J. Savage, and Charles M. Stein.
tion within samples. The agreement between these two estimates is tested by the variance ratio distribution with C − 1 and N − C degrees of freedom (where N is the number of observations in all C samples combined), using the test statistic F(C − 1, N − C). A value of F larger than would ordinarily result from two independent sample estimates of a single population variance is regarded as contradicting the hypothesis that the variation among the sample means is due solely to random sampling from a population whose individuals vary.
When σ² is known, it is used in place of the estimate based on the variation within samples, and the test is based on the χ²(C − 1) distribution (that is, χ² with C − 1 degrees of freedom) using the test statistic

(1.3)    1 − ΣT/(N³ − N)

where the summation is over all groups of ties and T = (t − 1)t(t + 1) = t³ − t for each group of ties, t being the number of tied observations in the group. Values of T for t up to 10 are shown in Table 1.1.²
TABLE 1.1
(See Section 3.1.2)

t    1    2    3    4    5     6     7     8     9     10
T    0    6    24   60   120   210   336   504   720   990
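The entries of Table 1.1 follow directly from T = t³ − t; a minimal sketch in modern Python (not part of the original paper):

```python
# Tie-correction term T = (t - 1) t (t + 1) = t^3 - t for a group of
# t tied observations; reproduces Table 1.1 for t = 1 through 10.
def tie_term(t):
    return t**3 - t

table_1_1 = {t: tie_term(t) for t in range(1, 11)}
```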
Since (1.3) must lie between zero and one, it increases (1.2). If all N observations are equal, (1.3) reduces (1.2) to the indeterminate form 0/0. If there are no ties, each value of t is 1, so ΣT = 0 and (1.2) is

² DuBois [4, Table II] gives values of T/12 (his C₁) and T/6 (his c₂) for t (his N) from 5 to 50.
unaltered by (1.3). Thus, (1.2) divided by (1.3) gives a general expression which holds whether or not there are ties, assuming that such ties as occur are given mean ranks:

(1.4)    H = [ (12/(N(N+1))) Σ_{i=1}^{C} R_i²/n_i − 3(N+1) ] / [ 1 − ΣT/(N³ − N) ]
In many situations the difference between (1.4) and (1.2) is negligible. A working guide is that with ten or fewer samples a χ² probability of 0.01 or more obtained from (1.2) will not be changed by more than ten per cent by using (1.4), provided that not more than one-fourth of the observations are involved in ties.³ H for large samples is still distributed as χ²(C − 1) when ties are handled by mean ranks; but the tables for small samples, while still useful, are no longer exact.
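The whole computation of (1.4), with mean ranks for ties and the correction (1.3) in the denominator, can be sketched as follows (an illustrative modern implementation, not the authors' code):

```python
from collections import Counter

def kruskal_wallis_h(samples):
    """H of (1.4): mean (mid-)ranks assigned to ties, and the result of
    (1.2) divided by the tie correction (1.3)."""
    pooled = sorted(x for s in samples for x in s)
    N = len(pooled)
    # Mean rank for each distinct value: tied observations share the
    # average of the ranks they would otherwise occupy.
    mean_rank, i = {}, 0
    while i < N:
        j = i
        while j < N and pooled[j] == pooled[i]:
            j += 1
        mean_rank[pooled[i]] = (i + 1 + j) / 2  # average of ranks i+1 .. j
        i = j
    # (1.2): 12/(N(N+1)) * sum(R_i^2 / n_i) - 3(N+1)
    h = 12 / (N * (N + 1)) * sum(
        sum(mean_rank[x] for x in s) ** 2 / len(s) for s in samples
    ) - 3 * (N + 1)
    # (1.3): 1 - sum(T)/(N^3 - N), with T = t^3 - t for each tie group
    correction = 1 - sum(t**3 - t for t in Counter(pooled).values()) / (N**3 - N)
    return h / correction
```

With no ties the correction is 1 and the function reduces to (1.2).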
For understanding the nature of H, a better formulation of (1.2) is

(1.5)    H = ((N−1)/N) Σ_{i=1}^{C} n_i[R̄_i − ½(N+1)]² / [(N²−1)/12]        (no ties)

where R̄_i is the mean of the n_i ranks in the ith sample. If we ignore the factor (N−1)/N, and note that ½(N+1) is the mean and (N²−1)/12 the variance of the uniform distribution over the first N integers, we see that (1.5), like (1.1), is essentially a sum of squared standardized deviations of random variables from their population mean. In this respect, H is similar to χ², which is defined as a sum of squares of standardized normal deviates, subject to certain conditions on the relations among the terms of the sum. If the n_i are not too small, the R̄_i jointly will be approximately normally distributed and the relations among them will meet the χ² conditions.
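The algebraic identity between (1.2) and (1.5) is easy to confirm numerically; a quick sketch using a hypothetical partition of the ranks 1..12 (any partition works):

```python
# Check that the two no-tie forms of H agree:
#   (1.2)  H = 12/(N(N+1)) * sum(R_i^2 / n_i) - 3(N+1)
#   (1.5)  H = (N-1)/N * sum(n_i * (Rbar_i - (N+1)/2)^2 / ((N^2-1)/12))
ranks = [[1, 2, 4, 8, 9], [3, 5, 6], [7, 10, 11, 12]]  # partition of 1..12
N = sum(len(s) for s in ranks)
h12 = 12 / (N * (N + 1)) * sum(sum(s) ** 2 / len(s) for s in ranks) - 3 * (N + 1)
h15 = (N - 1) / N * sum(
    len(s) * (sum(s) / len(s) - (N + 1) / 2) ** 2 / ((N * N - 1) / 12)
    for s in ranks
)
assert abs(h12 - h15) < 1e-9
```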
2. EXAMPLES

2.1. Without Ties

In a factory, three machines turn out large numbers of bottle caps. One machine is standard and two have been modified in different ways, but otherwise the machines and their operating conditions are identical. On any one day, only one machine is operated. Table 2.1
³ Actually, for the case described it is possible for the discrepancy slightly to exceed ten per cent. For a given total number of ties, S, the second term of (1.3) is a maximum if all S ties are in one group, and this maximum, (S³ − S)/(N³ − N), is slightly less than (S/N)³. Thus, for S/N = ¼, (1.3) > 63/64. The 0.01 level of χ²(9) is 21.666. This divided by 63/64 is 22.010, for which the probability is 0.00885, a change of 11½ per cent. For higher probability levels, fewer samples, or more than one group of ties, the percentage change in probability would be less. With the S ties divided into G groups, the second term of (1.3) is always less than [(S − h)³ + 4h]/N³, where h = 2(G − 1).
shows the production of the machines on various days and the calculation of H as 5.656. The true probability, if the machines really are the same with respect to output, that H should be as large as 5.656 is shown in Figure 6.1 and Table 6.1 as 0.049. The approximation to this probability given by the χ²(2) distribution is 0.059. Two more complicated approximations described in Section 6.2 give 0.044 and 0.045.
TABLE 2.1
DAILY BOTTLE-CAP PRODUCTION OF THREE MACHINES
(Artificial data.)

[The daily production figures and their ranks are not recoverable in this copy; the summary rows are:]

          Machine 1   Machine 2   Machine 3      Sum
n             5           3           4           12
R            24          14          40           78
R²/n      115.2      65.333     400.000      580.533
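The arithmetic of Table 2.1 can be retraced from its summary rows alone (a quick check, not from the original):

```python
# Recompute H for Table 2.1 from the rank sums, using (1.2):
# n = (5, 3, 4), R = (24, 14, 40), N = 12.
n = [5, 3, 4]
R = [24, 14, 40]
N = sum(n)
H = 12 / (N * (N + 1)) * sum(r * r / k for r, k in zip(R, n)) - 3 * (N + 1)
# H = 5.656, matching the value quoted in the text
```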
3.1. Two Samples

The rationale of the H test can be seen most easily by considering the case of only two samples, of sizes n and N − n. As is explained in Section 5.3, the H test for two samples is essentially the same as a test published by Wilcoxon [61] in 1945 and later by others.
In this case, we consider either one of the two samples, presumably the smaller for simplicity, and denote its size by n and its sum of ranks by R. We ask whether the mean rank of this sample is larger (or smaller) than would be expected if n of the integers 1 through N were selected at random without replacement.
The sum of the first N integers is ½N(N+1) and the sum of their squares is (1/6)N(N+1)(2N+1). It follows that the mean and variance of the first N integers are ½(N+1) and (N²−1)/12.
The means of samples of n drawn at random without replacement from the N integers will be normally distributed to an approximation close enough for practical purposes, provided that n and N − n are not too small. The mean of a distribution of sample means is, of course, the mean of the original distribution; and the variance of a distribution of sample means is (σ²/n)[(N − n)/(N − 1)], where σ² is the population variance, N is the population size, and n is the sample size. In this case, σ² = (N² − 1)/12, so

(3.1)    σ²_R̄ = (N²−1)(N−n) / (12n(N−1)) = (N+1)(N−n) / (12n)

where σ²_R̄ represents the variance of the mean of n numbers drawn at random without replacement from N consecutive integers. Letting R̄ denote the mean rank for a sample of n,
(3.5)    α = (1/√(2π)) ∫_{K_α}^{∞} e^{−x²/2} dx

Values of (3.4) as large as K_α or larger result in rejection of the null hypothesis. If the alternative is one-sided but for a downward shift, the null hypothesis is rejected when (3.4) is as small as −K_α or smaller. If the alternative is two-sided and symmetrical, the null hypothesis is rejected if (3.4) falls outside the range −K_{½α} to +K_{½α}.
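The variance formula (3.1) can be checked against a brute-force enumeration of all samples of n ranks (an illustration only; the values of N and n below are arbitrary):

```python
from itertools import combinations

def mean_rank_variance(N, n):
    """Exact variance of the mean of n ranks drawn at random without
    replacement from 1..N, by enumerating all equally likely samples."""
    means = [sum(c) / n for c in combinations(range(1, N + 1), n)]
    mu = sum(means) / len(means)
    return sum((m - mu) ** 2 for m in means) / len(means)

# Agreement with (3.1): (N + 1)(N - n) / (12 n)
N, n = 8, 3
assert abs(mean_rank_variance(N, n) - (N + 1) * (N - n) / (12 * n)) < 1e-9
```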
3.1.1. Continuity adjustment. It seems reasonable to expect that a continuity adjustment may be desirable, to allow for the fact that R, the sum of the ranks in one sample, can take only integral values, whereas the normal distribution is continuous.⁵ In testing against a two-sided alternative to the null hypothesis, the adjustment is made

⁵ An extensive comparison of exact probabilities for the two-sample test [28] with those based on the normal approximation indicates that the normal approximation is usually better with the continuity adjustment when the probability is above 0.02, and better without it when the probability is 0.02 or below. This comparison was made for us by Jack Karush, who has also rendered invaluable assistance with numerous other matters in the preparation of this paper.
TABLE 3.1
PITMAN EXAMPLE [41, p. 122]

Sample A            Sample B
Value   Rank        Value   Rank
  0      1            16     4
 11      2            19     5
 12      3            22     7
 20      6            24     8
 29      9
TABLE 3.2
BROWNLEE EXAMPLE [2, p. 36]

Method A            Method B
Value   Rank        Value   Rank
95.6    9½          93.3    4
94.9    7           92.1    3
96.2    12          94.7    5½
95.1    8           90.1    2
95.8    11          95.6    9½
96.3    13          90.0    1
                    94.7    5½

R = 60½, n = 6, N = 13
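Applying the normal approximation of Section 3.1 to the Brownlee data, and ignoring the small tie adjustment of (3.6), gives the following (a sketch, not the paper's exact computation):

```python
import math

# Table 3.2, Method A: R = 60.5, n = 6, N = 13.
R, n, N = 60.5, 6, 13
expected = (N + 1) / 2                    # expected mean rank under the null
var = (N + 1) * (N - n) / (12 * n)        # variance of the mean rank, (3.1)
z = (R / n - expected) / math.sqrt(var)   # standardized deviate, about 2.64
```

A deviate of this size corresponds to a two-sided normal probability below 0.01, consistent with the discussion of this example in Section 4.2.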
The exponent of the approximating bivariate normal distribution is

(3.10)    −(1/(2(1−ρ²))) [x_i² − 2ρ x_i x_j + x_j²],    where x_i = (R̄_i − ½(N+1))/σ_R̄ᵢ

and the correlation between the mean ranks of two different samples is

(3.11)    ρ = −[ n_i n_j / ((N−n_i)(N−n_j)) ]^{1/2}

It is well known that −2 times the exponent of a bivariate normal dis-
¹¹ Although (3.11) is easily derived and is undoubtedly familiar to experts on sampling from finite populations, we have not found it in any of the standard treatises. It is a special case of a formula used by Neyman [47, p. 39] in 1923, and a more general case of one used by K. Pearson [38] in 1924. For assistance in trying to locate previous publications of (3.11) we are indebted to Churchill Eisenhart, Tore Dalenius (Stockholm), W. Edwards Deming (Bureau of the Budget), P. M. Grundy (Rothamsted Experimental Station) who told us of [38], Morris H. Hansen (Bureau of the Census), Maurice G. Kendall (London School of Economics), Jerzy Neyman (University of California) who told us of [47], June H. Roberts (Chicago), Frederick F. Stephan who provided a compact derivation of his own, John W. Tukey, and Frank Yates (Rothamsted Experimental Station).
tribution has the χ²(2) distribution [32, Sec. 10.10]. Hence (3.12) could be taken as our test statistic for the three-sample problem, and approximate probabilities found from the χ² tables.
From the relations

(3.13)    n_iR̄_i + n_jR̄_j + n_kR̄_k = ½N(N+1)

and

(3.14)    n_i + n_j + n_k = N

it can be shown that the value of (3.12) will be the same whichever pair of samples is used in it, and that this value will be H as given by (1.2) with C = 3. For computing, (1.2) has the advantages of being simpler than (3.12) and of treating all (R̄, n) pairs alike.
With three or more samples, adjustments for continuity are unimportant except when the n_i are so small that special tables of the true distribution should be used anyway.
Since the adjustment for the mean-rank method of handling ties is a correction to the sum of squares of the N ranks, it is the same for three or more groups as for two. The variances given by (3.1) for the case without ties are replaced by (3.6) when there are ties; hence (1.2) with mean ranks should be divided by (1.3) to give H as shown by (1.4).
3.3. More than Three Samples

Nothing essentially new is involved when there are more than three samples. If there are C samples, the mean ranks for any C − 1 of them are jointly distributed approximately according to a multivariate normal distribution, provided that the sample sizes are not too small. The exponent of this (C − 1)-variate normal distribution will have the same value whichever set of C − 1 samples is used. This value, when multiplied by −2, will be H as given by (1.2), and it will be distributed approximately as χ²(C − 1), provided the n_i are not too small. The exponent of the approximating multivariate normal distribution is more complicated than for three samples, but it involves only the variances of the R̄_i as given by (3.6) and the correlations among pairs (R̄_i, R̄_j) as given by (3.11).
By using matrix algebra, the general formula for H is obtained quite as readily as the formulas for two and three samples by the methods used in this paper. A mathematically rigorous discussion of H for the general case of C samples is presented elsewhere by Kruskal [25], together with a formal proof that its distribution under the null hypothesis is asymptotically χ².
4.1. General Considerations

H tests the null hypothesis that the samples all come from identical populations. In practice, it will frequently be interpreted, as is F in the analysis of variance, as a test that the population means are equal against the alternative that at least one differs. So to interpret it, however, is to imply something about the kinds of differences among the populations which, if present, will probably lead to a significant value of H, and the kinds which, even if present, will probably not lead to a significant value of H. To justify this or any similar interpretation, we need to know something about the power of the test: For what alternatives to identity of the populations will the test probably lead to rejection, and for what alternatives will it probably lead to acceptance of the null hypothesis that the populations are identical? Unfortunately, for the H test as for many nonparametric tests the power is difficult to investigate and little is yet known about it.
It must be recognized that relations among ranks need not conform to the corresponding relations among the data before ranking. It is possible, for example, that if an observation is drawn at random from each of two populations, the one from the first population is larger in most pairs, but the average of those from the second population is larger. In such a case the first population may be said to have the higher average rank but the lower average value.
It has been shown by Kruskal [25] that a necessary and sufficient condition for the H test to be consistent¹² is that there be at least one of the populations for which the limiting probability is not one-half that a random observation from this population is greater than an independent random member of the N sample observations. Thus, what H really tests is a tendency for observations in at least one of the populations to be larger (or smaller) than all the observations together, when paired randomly. In many cases, this is practically equivalent to the mean of at least one population differing from the others.
4.2. Comparison of Means when Variability Differs

Rigorously interpreted, all we can conclude from a significant value of H is that the populations differ, not necessarily that the means differ. In particular, if the populations differ in variability we cannot,
¹² A test is consistent against an alternative if, when applied at the same level of significance for increasing sample size, the probability of rejecting the null hypothesis when the alternative is true approaches unity. Actually, the necessary and sufficient condition stated here must be qualified in a way that is not likely to affect the interpretation of the H test suggested in this paragraph. An exact statement is given in [25].
strictly speaking, infer from a significant value of H that the means differ. In the data of Table 3.2, for example, the variances of the two chemical methods differ significantly (normal theory probability less than 0.01) and substantially (by a factor of 16), as Brownlee shows [2]. A strict interpretation of H and its probability of less than 0.01 does not, therefore, justify the conclusion that the means of the two chemical methods differ.
There is some reason to conjecture, however, that in practice the H test may be fairly insensitive to differences in variability, and so may be useful in the important "Behrens-Fisher problem" of comparing means without assuming equality of variances. Perhaps, for example, we could conclude that the means of the two chemical methods of Table 3.2 differ. The following considerations lend plausibility to this conjecture (and perhaps suggest extending it to other differences in form):
(i) The analysis of consistency referred to in Section 4.1 shows that if two symmetrical populations differ only by a scale factor about their common mean the H test is not consistent for small significance levels; in other words, below a certain level of significance there is no assurance that the null hypothesis of identical populations will be rejected, no matter how large the samples.
(ii) Consider the following extreme case: Samples of eight are drawn from two populations having the same mean but differing so much in variability that there is virtually no chance that any of the sample from the more variable population will lie within the range of the other sample. Furthermore, the median of the more variable population is at the common mean, so that its observations are as likely to lie above as to lie below the range of the sample from the less variable population. The actual distribution of H under these assumptions is easily computed from the binomial distribution with parameters 8 and ½. Figure 4.1 shows the exact distribution of H under the null hypothesis that the two populations are completely identical, under the symmetrical alternative just described, and under a similar but skew alternative in which the probability is 0.65 that an observation from the more variable population will lie below the range of the other sample and 0.35 that it will lie above. Possible values of H under each hypothesis are those at which occur the risers in the corresponding step function of Figure 4.1, and the probabilities at these possible values of H are given by the tops of the risers. Figure 4.1 shows, for example, that samples in which seven observations from the more variable population lie above and one lies below the eight observations from the less variable population (so that the two values of R are 44 and 92, leading to an H of
[Figure 4.1. Pr{H ≥ H₀} plotted against H₀ (logarithmic probability scale, .001 to 1.000; H₀ from 2 to 12): step functions for the null hypothesis, the symmetrical alternative, and the skew alternative.]
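The extreme case of (ii) is simple enough to compute directly; a sketch (the function and variable names here are ours, not the paper's):

```python
from math import comb

def h_two_samples(R1, R2, n1, n2):
    """Two-sample H by (1.2)."""
    N = n1 + n2
    return 12 / (N * (N + 1)) * (R1**2 / n1 + R2**2 / n2) - 3 * (N + 1)

def h_distribution(p, n=8):
    """Distribution of H when each of the n observations from the more
    variable sample falls above (probability p) or below the range of
    the other sample of n: binomial with parameters n and p."""
    N = 2 * n
    dist = {}
    for k in range(n + 1):                   # k observations lie above
        below = n - k
        # the more variable sample occupies ranks 1..below and N-k+1..N
        R_var = below * (below + 1) // 2 + k * (2 * N - k + 1) // 2
        R_other = N * (N + 1) // 2 - R_var
        h = round(h_two_samples(R_var, R_other, n, n), 4)
        dist[h] = dist.get(h, 0) + comb(n, k) * p**k * (1 - p)**(n - k)
    return dist
```

For k = 7 the rank sums are 44 and 92, as in the text's example, and the corresponding H is about 6.35.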
4 and 5] in 1889. Churchill Eisenhart and I. Richard Savage have referred us to the extensive analyses of ranks by eighteenth century French mathematicians in connection with preference-ordering problems, specifically elections. The earliest work they mention is by Borda [1] in 1770, and they mention also Laplace [26] in 1778, Condorcet [3] in 1786, and Todhunter's summary of these and related writings [51, Secs. 690, 806, 989, 990]. Systematic treatment of ranks as a nonparametric statistical device, however, seems to commence with the work of Hotelling and Pabst [19] in 1936.
tensive table [7] shows 104 disagreements among 392 comparable entries (78 disagreements among 196 comparisons at the 5 per cent level, and 26 among 196 at 1 per cent). In each disagreement, Festinger gives a lower critical value of the statistic, although both writers state that they have tabulated the smallest value of the statistic whose probability does not exceed the specified significance level. Three of the disagreements can be checked with the Mann-Whitney table [28]; in all three, White's entry agrees with Mann-Whitney's. In one additional case (sample sizes 4 and 11 at the 1 per cent level) we have made our own calculation and found Festinger's entry to have a true probability (0.0103) exceeding the stated significance level. The disagreements undoubtedly result from the fact that the distributions are discontinuous, so that exact 5 and 1 per cent levels cannot ordinarily be attained.
for the special case of Wilcoxon's two-sample test certain details have been discovered. Some that are interesting from a practical viewpoint are indicated below, but without the technical qualifications to which they are subject:
(i) Lehmann [27] has shown that the one-tail test is unbiased, that is, less likely to reject when the null hypothesis is true than when any alternative is true; but van der Vaart [52] has shown that the corresponding two-tail test may be biased.
(ii) Lehmann [27] has shown, on the basis of a theorem of Hoeffding's [17], that under reasonable alternative hypotheses, as under the null hypothesis, the distribution of the statistic is asymptotically normal.
(iii) Mood [33] has shown that the asymptotic efficiency of Wilcoxon's test compared with Student's test, when both populations are normal with equal variance, is 3/π, i.e., 0.955. Roughly, this means that 3/π is the limiting ratio of sample sizes necessary for the two tests to attain a fixed power. This result was given in lecture notes by E. J. G. Pitman at Columbia University in 1948; it was also given by van der Vaart [52]. To the best of our knowledge, Mood's proof is the first complete one.
(iv) Lehmann [27] and van Dantzig [15, 51a], generalizing the findings of Mann and Whitney [28], have shown that the test is consistent¹² if the probability differs from one-half that an observation from the first population will exceed one drawn independently from the second population (for one-tail tests the condition is that the probability differ from one-half in a stated direction). In addition van Dantzig [51a] gives inequalities for the power. The C-sample condition for consistency given by Kruskal (see Section 4.1) is a direct extension of the two-sample condition given by Lehmann and van Dantzig.
5.4. Whitney's Three-Sample Test

Whitney [60] has proposed two extensions of the Wilcoxon test to the three-sample case. Neither of his extensions, which are expressed in terms of inversions of order rather than in terms of ranks, is equivalent to our H test for C = 3, since Whitney seeks tests with power against more specific alternatives than those appropriate to the H test.
Whitney arrays all three samples in a single ranking and then defines U as the number of times in which an observation from the second sample precedes an observation from the first, and V as the number of times in which an observation from the third sample precedes one from the first.²²

²² U and V are not determined by R₁, R₂, and R₃, nor vice versa, though

U + V = R₁ − ½n₁(n₁ + 1)
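The identity of footnote 22 is easy to verify on hypothetical data (the sample values below are invented for illustration and must be distinct):

```python
def inversions(first, other):
    """Number of pairs in which an observation from `other` precedes
    (is smaller than) an observation from `first`."""
    return sum(1 for x in first for y in other if y < x)

# Three hypothetical samples with distinct values
s1, s2, s3 = [3.1, 7.4, 9.0], [1.2, 8.5], [2.0, 6.6, 9.9]
pooled = sorted(s1 + s2 + s3)
R1 = sum(pooled.index(x) + 1 for x in s1)   # rank sum of the first sample
U = inversions(s1, s2)
V = inversions(s1, s3)
n1 = len(s1)
assert U + V == R1 - n1 * (n1 + 1) // 2     # footnote 22
```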
5.5. Terpstra's C-Sample Test

Terpstra [50a] has proposed and investigated a test appropriate for alternatives similar to those of Whitney's second test, but extending to any number of populations.

5.6. Mosteller's C-Sample Test

Mosteller [34] has proposed a multi-decision procedure for accepting either the null hypothesis to which the H test is appropriate or one of the C alternatives that the ith population is translated to the right (or left) of the others. His criterion is the number of observations in the sample containing the largest observation that exceed all observations in other samples. This procedure has been discussed further by Mosteller and Tukey [35].
6.2. Approximate Significance Levels

6.2.1. χ² approximation. This is the approximation discussed in Sections 1, 2, and 3. The most extensive single table is that of Hald and Sinkbaek [13], though the table in almost any modern statistics text will ordinarily suffice.
6.2.2. Γ approximation. This utilizes the incomplete-Γ distribution by matching the variance as well as the true mean of H. The mean, or expected value, of H under the null hypothesis is [25]

(6.1)    E(H) = C − 1

and the variance is

(6.2)    var(H) = 2(C−1) − 2[3C² − 6C + N(2C² − 6C + 1)] / (5N(N+1)) − (6/5) Σ_{i=1}^{C} 1/n_i
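Matching a Γ distribution to these two moments can be sketched as follows (the notation is ours; a Γ distribution with shape k and scale θ has mean kθ and variance kθ²):

```python
def h_moments(n):
    """Null mean (6.1) and variance (6.2) of H for sample sizes n."""
    C, N = len(n), sum(n)
    mean = C - 1
    var = (2 * (C - 1)
           - 2 * (3 * C**2 - 6 * C + N * (2 * C**2 - 6 * C + 1)) / (5 * N * (N + 1))
           - 6 / 5 * sum(1 / ni for ni in n))
    return mean, var

# Gamma parameters matched to the moments, e.g. for the samples of Table 2.1:
mean, var = h_moments([5, 3, 4])
shape, scale = mean**2 / var, var / mean
```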
[Figures 6.1 and 6.2. For three samples each of size five or less: the true probability that H equals or exceeds H₀ in the neighborhood of the 10, 5, and 1 per cent points, and the corresponding χ², Γ, and B approximations compared in Table 6.1. The graphs are not recoverable from this copy.]
TABLE 6.1
TRUE DISTRIBUTION OF H FOR THREE SAMPLES, EACH OF SIZE FIVE OR LESS, IN THE NEIGHBORHOOD OF THE 10, 5, AND 1 PER CENT POINTS; AND COMPARISON WITH THREE APPROXIMATIONS

The probabilities shown are the probabilities under the null hypothesis that H will equal or exceed the values in the column headed "H"

                          |        Approximate minus true
Sample sizes    H    True |
n₁  n₂  n₃          proba-|   χ²     Γ (Linear    B (Normal
                    bility|           Interp.)     Interp.)
2 1 1 2.7000 .500 -.241 -.309 -.500
2 2 1 3.6000 .267 - .101 - .167 - .267
2 2 2 4.5714 .067 +.035 -.007 -.067
3.7143 .200 - .044 - .083 +.010
3 1 1 3.2000 .300 - .098 - .180 - .300
3 2 1 4.2857 .100 +.017 - .040 - .100
3.8571 .133 +.012 - .045 - .042
3 2 2 5.3572 .029 +.040 +.083 -.029
4.7143 .048 +.047 +.012 +.014
4.5000 .067 +.039 +.003 +.020
4.4643 .105 +.002 - .033 - .014
3 3 1 5.1429 .043 +.034 - .010 - .043
4.5714 .100 +.002 - .046 - .062
4.0000 .129 +.007 - .041 - .024
3 3 2 6.2500 .011 +.033 +.012 -.011
5.3611 .032 +.036 +.010 +.001
5.1389 .061 +.016 - .012 - .019
4.5556 .100 +.002 - .027 - .020
4.2500 .121 - .002 - .031 - .014
3 3 3 7.2000 .004 +.024 +.010 -.004
6.4889 .011 +.028 +.011 - .001
5.6889 .029 +.030 +.009 +.003
5.6000 .050 +.011 - .010 - .015
5.0667 .086 - .006 - .029 - .026
4.6222 .100 -.001 -.025 -.010
4 1 1 3.5714 .200 - .032 - .114 - .200
4 2 1 4.8214 .057 +.033 -.017 -.057
4.5000 .076 +.029 - .022 - .047
4.0179 .114 +.020 - .032 - .056
4 2 2 6.0000 .014 +.036 +.010 -.014
5.3333 .033 +.036 +.007 - .017
5.1250 .052 +.025 - .006 - .021
4.3750 .100 +.012 - .020 - .002
4.1667 .105 +.020 - .012 +.014
TABLE 6.1 (Continued)

                          |        Approximate minus true
Sample sizes    H    True |
n₁  n₂  n₃          proba-|   χ²     Γ (Linear    B (Normal
                    bility|           Interp.)     Interp.)
4 3 1 5.8333 .021 +.033 -.001 -.021
5.2083 .050 +.024 - .016 - .037
5.0000 .057 +.025 - .016 - .034
4.0556 .093 +.039 - .005 +.014
3.8889 .129 +.014 - .028 - .003
4 3 2 6.4444 .009 +.031 +.012 -.002
6.4222 .010 +.030 +.011 - .004
5.4444 .047 +.019 - .005 - .010
5.4000 .052 +.016 - .008 - .013
4.5111 .098 +.006 - .020 - .004
4.4667 .101 +.006 - .020 - .003
4 3 3 6.7455 .010 +.024 +.010 -.001
6.7091 .013 +.022 +.007 - .003
5.7909 .046 +.010 - .009 - .013
5.7273 .050 +.007 - .012 - .015
4.7091 .094 +.001 - .021 - .006
4.7000 .101 - .006 - .027 - .012
4 4 1 6.6667 .010 +.026 +.002 -.010
6.1667 .022 +.024 - .005 - .020
4.9667 .048 +.036 - .003 - .009
4.8667 .054 +.034 - .005 - .009
4.1667 .082 +.042 +.002 +.016
4.0667 .102 +.029 - .011 +.007
4 4 2 7.0364 .006 +.024 +.010 -.002
6.8727 .011 +.021 +.006 - .005
5.4545 .046 +.020 - .002 - .003
5.2364 .052 +.021 - .002 +.001
4.5545 .098 +.005 - .019 - .003
4.4455 .103 +.006 - .018 +.000
4 4 3 7.1439 .010 +.018 +.007 -.002
7.1364 .011 +.018 +.006 - .003
5.5985 .049 +.012 - .005 - .004
5.5758 .051 +.011 - .006 - .005
4.5455 .099 +.004 - .015 +.003
4.4773 .102 +.004 -.014 +.004
4 4 4 7.6538 .008 +.014 +.005 .000
7.5385 .011 +.012 +.003 - .002
5.6923 .049 +.009 - .006 - .002
5.6538 .054 +.005 - .010 - .007
4.6539 .097 +.001 - .015 +.004
4.5001 .104 +.001 - .015 +.007
5 1 1 3.8571 .143 +.003 - .109 -.143
5 2 1 5.2500 .036 +.037 -.006 -.036
5.0000 .048 +.034 +.011 - .037
4.4500 .071 +.037 - .012 - .020
4.2000 .095 +.027 - .022 - .018
4.0500 .119 +.013 - .036 - .024
5 2 2 6.5333 .008 +.030 +.010 -.008
6.1333 .013 +.033 +.010 - .010
5.1600 .034 +.041 +.013 +.008
5.0400 .056 +.025 - .004 - .006
4.3733 .090 +.022 - .007 +.010
4.2933 .122 - .005 - .034 - .014
5 4 3 7.4449 .010 +.014 +.004 -.004
7.3949 .011 +.014 +.004 - .004
5.6564 .049 +.010 - .005 - .004
5.6308 .050 +.010 - .006 - .004
4.5487 .099 +.004 - .013 +.003
4.5231 .103 +.001 - .016 - .000