Wiley and Royal Statistical Society are collaborating with JSTOR to digitize, preserve and extend access to
Journal of the Royal Statistical Society. Series C (Applied Statistics).
Beta-binomial Anova for Proportions
By MARTIN J. CROWDER
Surrey University, Guildford, Britain
[Received January 1977. Revised May 1977]
SUMMARY
A method is proposed for the regression analysis of proportions based on the Beta-binomial distribution.
Keywords: ANOVA FOR PROPORTIONS; BETA-BINOMIAL DISTRIBUTION
1. INTRODUCTION
The problem considered here arose from research conducted by microbiologist Dr P. Whitney of Surrey University. A batch of tiny seeds is brushed onto a plate covered with a certain extract at a given dilution. The numbers of germinated and ungerminated seeds are subsequently counted. A considerable amount of data has thus been generated, with 4 types of seed, 3 different extracts, several serial dilutions and upwards of 5 replicates for many combinations. It is clear, however, from inspection of the data that there is heterogeneity of proportions between replicates, this observation being supported (in fact insisted upon) by the experimenter; χ² tests on the appropriate 2 × m tables give some confirmation, though the frequencies are often small. Such a situation, with variation of proportions between replicates, cannot be uncommon in other contexts, but the problem of analysing the variation within and between data sets does not seem to have been tackled before. The analogous situation for continuous data is the standard nested mixed model where the different treatment groups represent the fixed effects and replications within treatments represent the random effects.
Various approximate methods might be used. Suppose that there are m_i observed proportions in the ith data set (i = 1, ..., k), and let m_+ denote Σ_i m_i. A 2 × m_+ table yields a χ² with (m_+ − 1) d.f. of which a component with k − 1 d.f. can be assigned to the contrast between sets, and such partitioning of χ² can be extended to the case of cross-classified proportions. But the method does not cater for variation of expected proportions within cells. This is also true of Logit Analysis. Another method would be analysis of variance applied to the angular transforms of the observed proportions. This allows for within-cell variation, but the variance stabilizing property of the transform depends on each n being large.
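The dependence of the angular transform on large n can be checked numerically. The sketch below is an illustration only, not part of the paper: it assumes a pure binomial count R with no within-cell heterogeneity, and compares the exact variance of arcsin(sqrt(R/n)) with the usual large-sample value 1/(4n). The function names are hypothetical.

```python
from math import asin, comb, sqrt


def angular_variance(n, p):
    """Exact variance of arcsin(sqrt(R/n)) when R ~ Binomial(n, p)."""
    mean = 0.0
    mean_sq = 0.0
    for r in range(n + 1):
        prob = comb(n, r) * p**r * (1.0 - p) ** (n - r)  # binomial pmf
        y = asin(sqrt(r / n))                            # angular transform, radians
        mean += prob * y
        mean_sq += prob * y * y
    return mean_sq - mean * mean


def variance_ratio(n, p):
    """Exact variance divided by the large-sample approximation 1/(4n)."""
    return angular_variance(n, p) / (1.0 / (4.0 * n))


# A ratio near 1 means the transform has stabilized the variance.
for n in (5, 20, 200):
    print(n, round(variance_ratio(n, 0.1), 3), round(variance_ratio(n, 0.3), 3))
```

For small n the ratio departs noticeably from 1 and varies with p, which is the limitation noted above.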
A standard model is the beta-binomial distribution (BBD) in which the expected proportions are beta-distributed. Chatfield and Goodhart (1970) discuss the use of BBD in connection with consumer purchasing; they fit data by matching the mean and zero count. Another application is given by Griffiths (1973) whose title is self-explanatory. However, in these papers, and others cited in them, the purpose is to fit a single set of data. Here suggestions are made for using BBD as an error distribution for regression. The approach has several advantages:
(i) it is based on a model which is exactly realizable and contains parameters capable of meaningful interpretation;
(ii) it is flexible in that the particular assumptions made about the parameters, representing (a) the type of within-cell heterogeneity, and (b) the form of the regression relationship, can be varied to suit the data;
(iii) for one-way Anova a single subroutine (Fortran IV, about 60 instructions, supplied on request) suffices to compute the various likelihoods, in conjunction with any standard routine for function minimization; more general linear models are likewise simply applied.
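As a rough modern counterpart to the subroutine mentioned in (iii), a one-way beta-binomial log-likelihood might be sketched as follows. This is not the author's Fortran routine. The parameterization is an assumption made for illustration: each group has a mean proportion `mean` and a dispersion `theta` = 1/(a + b), where a and b are the beta parameters. The paper's own parameterization may differ, and the counts in `groups` are invented.

```python
from math import exp, lgamma


def log_beta(a, b):
    """log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_choose(n, r):
    """log of the binomial coefficient C(n, r)."""
    return lgamma(n + 1) - lgamma(r + 1) - lgamma(n - r + 1)


def bbd_loglik(data, mean, theta):
    """Beta-binomial log-likelihood for replicates [(r, n), ...].

    Assumed parameterization: a = mean / theta, b = (1 - mean) / theta.
    """
    a = mean / theta
    b = (1.0 - mean) / theta
    return sum(
        log_choose(n, r) + log_beta(a + r, b + n - r) - log_beta(a, b)
        for r, n in data
    )


def one_way_loglik(groups, means, theta):
    """One mean per data set, common dispersion theta across data sets."""
    return sum(bbd_loglik(g, m, theta) for g, m in zip(groups, means))


# Hypothetical (germinated, total) counts for two data sets.
groups = [[(10, 39), (23, 62), (17, 51)], [(8, 16), (10, 30), (22, 41)]]
print(one_way_loglik(groups, means=[0.33, 0.45], theta=0.05))

# Sanity check: the beta-binomial probabilities for n = 10 sum to 1.
print(sum(exp(bbd_loglik([(r, 10)], 0.3, 0.5)) for r in range(11)))
```

The negative of `one_way_loglik` could then be handed to a general-purpose minimizer, with the means and dispersion transformed to an unconstrained scale if the minimizer requires it.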
TABLE 1
Data for O. cernua seed in bean root extract
TABLE 2
Log-likelihoods for one-way data
3. DEVELOPMENTS
In order to generalize the analysis it is only necessary to regard the subscript i in (r_ij, n_ij) as defining the set of conditions under which those data were generated, keeping j for replicates where there are more than one for each i. Thus i may represent a combination of factor levels, or a set of covariates, or both. There now arises the possibility of relating the π's to the factors and covariates using a regression equation.
In Table 3 some more seed data are given as a 2 × 2 factorial layout. There are two types of seed, O. aegyptiaca 75 and O. aegyptiaca 73, and two root extracts, bean and cucumber, the dilution being 1/125 throughout. Applying first the one-way Anova described above to these four data sets we find log-likelihoods −53.667 for the BBD model with 8 parameters (i.e. a (π, σ) pair for each data set), −53.767 with 5 parameters (π_1, π_2, π_3, π_4 and a common σ), and −64.516 with 2 parameters (common π and σ). Thus homogeneity of σ² is supported (G² = 0.2, 3 d.f.) and differences between π values are highly significant (G² = 21.498, 3 d.f.).
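The two G² values can be reproduced from the quoted log-likelihoods as twice the difference between nested models, with degrees of freedom equal to the difference in parameter counts. The short check below uses only the figures given in the text; the function and variable names are hypothetical.

```python
def g2(loglik_larger, loglik_smaller):
    """Likelihood-ratio statistic for two nested models."""
    return 2.0 * (loglik_larger - loglik_smaller)


# Log-likelihoods and parameter counts quoted in the text.
ll_8, ll_5, ll_2 = -53.667, -53.767, -64.516

g2_dispersion = g2(ll_8, ll_5)  # separate vs common dispersion, 8 - 5 = 3 d.f.
g2_means = g2(ll_5, ll_2)       # separate vs common mean, 5 - 2 = 3 d.f.

print(f"{g2_dispersion:.3f}")  # 0.200
print(f"{g2_means:.3f}")       # 21.498
```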
TABLE 3
Data for seeds O. aegyptiaca 75 and 73, bean and cucumber root extracts
O. aegyptiaca 75    O. aegyptiaca 73
TABLE 4a
Logit log-likelihoods

Factorial effects             Number of              M.l.e.'s               Log-
included                      parameters    π11     π12     π21     π22     likelihood
None                              2         0.494   0.494   0.494   0.494   −64.516
Seeds only                        3         0.538   0.538   0.435   0.435   −63.553
Extracts only                     3         0.376   0.620   0.376   0.620   −57.196
Seeds + extracts                  4         0.405   0.651   0.326   0.570   −55.832
  (main effects)
Full model with interaction       5         0.368   0.685   0.391   0.519   −53.767
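One way to read Table 4a is that the fitted π's of the main-effects model should be additive on the logit scale, so that the extract contrast is the same for both seed types, whereas the full model need not satisfy this. The sketch below checks that reading using the tabulated m.l.e.'s only. It assumes the first subscript indexes seed type and the second indexes extract, as the "Seeds only" and "Extracts only" rows suggest; the function names are hypothetical.

```python
from math import log


def logit(p):
    """Log-odds of a proportion."""
    return log(p / (1.0 - p))


def interaction_contrast(pi):
    """Difference between the two extract contrasts on the logit scale.

    Zero (up to rounding of the tabulated values) means no interaction.
    """
    return (logit(pi[1, 1]) - logit(pi[1, 2])) - (logit(pi[2, 1]) - logit(pi[2, 2]))


# M.l.e.'s copied from Table 4a, keyed by (seed, extract).
main_effects = {(1, 1): 0.405, (1, 2): 0.651, (2, 1): 0.326, (2, 2): 0.570}
full_model = {(1, 1): 0.368, (1, 2): 0.685, (2, 1): 0.391, (2, 2): 0.519}

print(round(interaction_contrast(main_effects), 3))
print(round(interaction_contrast(full_model), 3))
```

The first contrast is essentially zero, consistent with a logit-linear main-effects fit, while the second is clearly non-zero, reflecting the extra interaction parameter.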
TABLE 4b
χ² values for effects
Source    d.f.    G²