
Generalized Linear Models Specified in Terms of Constraints

Author(s): R. W. M. Wedderburn
Source: Journal of the Royal Statistical Society. Series B (Methodological), Vol. 36, No. 3
(1974), pp. 449-454
Published by: Wiley-Blackwell for the Royal Statistical Society
Stable URL: http://www.jstor.org/stable/2984931 .
Accessed: 21/08/2012 09:30


Generalized Linear Models Specified in Terms of Constraints

By R. W. M. WEDDERBURN
Rothamsted Experimental Station
[Received December 1972. Final revision December 1973]

SUMMARY
A modification of the method of Nelder and Wedderburn (1972) is given for fitting
models with the same error distributions as discussed there but with the systematic
part of the models specified in terms of constraints. It is possible to fit these by
the method described by Nelder and Wedderburn using iterative weighted regression,
but it turns out to be simpler to replace each regression calculation of that method
by another one with the property that the fitted values of one regression calculation
are the residuals of the other and vice versa. The method is applied to testing for
marginal homogeneity in contingency tables.
Keywords: CONSTRAINED ESTIMATION; CONTINGENCY TABLES; GENERALIZED LINEAR
MODELS; LINEAR MODELS; MARGINAL HOMOGENEITY; MAXIMUM LIKELIHOOD;
REGRESSION; WEIGHTED LEAST SQUARES

INTRODUCTION
In 1972 Nelder and Wedderburn defined a class of models for which maximum likelihood
estimates can be obtained by an iterative procedure in which each iteration involved
calculating a weighted linear regression. These models had a random component
specifying the distribution of the observations. Several distributions were possible
including the normal, Poisson, binomial and gamma distributions. The systematic part
of the models specified that some function of the means was linear in a set of
parameters. This result generalized the results of Nelder (1968) and the well-known
method for obtaining maximum likelihood estimates in probit analysis. Other examples
were various models involving contingency tables where the effects were additive on a
log scale, the inverse polynomial models of Nelder (1966) and models involving sums of
squares which had a $\chi^2$ or gamma distribution.
This paper considers models in which the systematic component of the model is defined
by a set of linear constraints. After considering the case of normal linear models
specified in this way, the extension to more general models comes naturally. As an
example the method is applied to testing for marginal homogeneity in contingency
tables.

1. LEAST-SQUARES FITTING OF MODELS SPECIFIED IN TERMS OF CONSTRAINTS


Suppose we have an $n$-dimensional vector of observations $y$ from the model
$$E(y) = Y \tag{1}$$
where $LY = 0$, $L$ being a $p \times n$ matrix. ($E(y)$ denotes the expectation of $y$.)
Then suppose that we want to fit this model using weighted least squares, choosing
$\hat{Y}$ to minimize $(y - \hat{Y})'W(y - \hat{Y})$ where $W$ is a given symmetric and
positive definite matrix. In the applications described below, $W$ will be diagonal.

Let $M(A)$ denote the column space of a matrix $A$. Suppose that $X$ is a matrix with
dimensions $n \times q$ such that $M(X)$ is the orthogonal complement of $M(L')$ in
$n$-dimensional Euclidean space. Then $LY = 0$ implies that $Y = X\beta$ for some
$q$-dimensional vector $\beta$. (Here we have $p + q \geq n$ with equality if both the
rows of $L$ and the columns of $X$ are linearly independent.) So (1) may be written
$$E(y) = X\beta. \tag{2}$$
We know how to fit (2) by weighted least squares obtaining
$\hat{Y} = X(X'WX)^{-}X'Wy$. (Here $A^{-}$ denotes a generalized inverse of $A$, i.e.
any matrix satisfying $AA^{-}A = A$.) $\hat{Y}$ is, of course, unique even though
$(X'WX)^{-}$ may not be.
However, if $p$ is small and $n$ is large, then $q$ is also large, and we have a large
matrix to invert; also determining $X$ explicitly may be quite difficult. Fortunately,
we can obtain $\hat{Y}$ much more easily.
First, note that $W$ defines an inner product on $R^n$. For the rest of this section the
term "W-orthogonal" will mean orthogonal with respect to this inner product (i.e.
$\langle x, y \rangle = x'Wy$). Now, since $W$ is non-singular, it is easily seen that
$M(W^{-1}L')$ and $M(X)$ are W-orthogonal complements: for any $\beta$ and $\gamma$ we have
$(X\beta)'W(W^{-1}L'\gamma) = \beta'(LX)'\gamma = 0$, and the dimensions of the two spaces
add up to $n$.
Let $r$ be the vector of residuals obtained by fitting (2) using $W$ as weight matrix.
Then
$$y = \hat{Y} + r.$$
Now $\hat{Y}$ is the W-orthogonal projection of $y$ into $M(X)$. Thus $r$ is the
projection of $y$ into the W-orthogonal complement of $M(X)$ which is $M(W^{-1}L')$. It
follows that $r$ is the vector of fitted values that would be obtained from a
least-squares fit of
$$E(y) = W^{-1}L'\gamma \tag{3}$$
using $W$ as the weight matrix. In other words, the fitted values for (2) will be the
residuals for (3) and vice versa. Fitting the regression (3) is likely to be quite easy
if $p$ is small and $W$ is diagonal.
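The relation between the two regressions can be checked numerically. The following is a minimal sketch, assuming NumPy and randomly generated $L$, $W$ and $y$ (none of which come from the paper); the helper name `wls_fitted` is hypothetical. It builds $X$ from the null space of $L$ only for the purpose of comparison, which is exactly the step that fitting (3) is meant to avoid.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 8, 2
L = rng.normal(size=(p, n))          # constraint matrix (rank p almost surely)
w = rng.uniform(0.5, 2.0, size=n)    # diagonal of the weight matrix W
W = np.diag(w)
y = rng.normal(size=n)

# Columns of X span the orthogonal complement of M(L'), so L @ X = 0.
X = np.linalg.svd(L)[2][p:].T        # n x (n - p), taken from the full V' of the SVD


def wls_fitted(A, W, y):
    """Fitted values from a weighted least-squares regression of y on the columns of A."""
    coef = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
    return A @ coef


fit_2 = wls_fitted(X, W, y)                        # model (2): E(y) = X beta
fit_3 = wls_fitted(np.linalg.solve(W, L.T), W, y)  # model (3): regressors W^{-1} L'

# Fitted values of (2) are the residuals of (3), and they satisfy the constraints.
print(np.allclose(fit_2, y - fit_3))
print(np.allclose(L @ fit_2, 0.0))
```

With $p = 2$ the regression (3) only requires solving a $2 \times 2$ system, whereas (2) involves a $6 \times 6$ one here; the gap widens as $n$ grows with $p$ fixed.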
In fact, the following result has now been proved.
Theorem. If
(i) $X$ is a matrix of maximal rank such that $LX = 0$,
(ii) $W$ is a symmetric positive definite matrix and
(iii) $\hat{\beta}$ and $\hat{\gamma}$ are least-squares estimates for $\beta$ and $\gamma$
in the equations $E(y) = X\beta$ and $E(y) = W^{-1}L'\gamma$ with $W$ used as the weight
matrix, then $y = X\hat{\beta} + W^{-1}L'\hat{\gamma}$.
It cannot be claimed that this result is entirely new although it is difficult to find
a satisfactory statement. Essentially the same result, though expressed and derived
in a very different way, can be found in Chapter VII of Brunt (1917).

2. EXTENSION TO GENERALIZED LINEAR MODELS


The class of models discussed by Nelder and Wedderburn (1972) took the form
$$E(z) = \mu, \tag{4}$$
where $\mu_i = f(Y_i)$ and $Y = X\beta$.
The distribution of $z$ may come from a one-parameter exponential family of
distributions or a family with densities of the form
$$\pi(z; \theta, \phi) = \exp[\alpha(\phi)\{z\theta - g(\theta) + h(z)\} + \beta(\phi, z)], \tag{5}$$

where $\phi$ is a fixed nuisance parameter, and $\theta$ varies from one observation to
another according to the model (4), the relation between $\theta$ and $\mu$ being
$\mu = g'(\theta)$ (Section (1.1) of Nelder and Wedderburn, 1972). The possible
distributions for $z_i$ include the normal distribution, possibly with unknown variance,
binomial distribution, the Poisson distribution and the gamma distribution, possibly
with unknown coefficient of variation.
It was shown that we can write var($z$) in the form $V(\mu)/\alpha(\phi)$ or simply
$V(\mu)$ if there is no nuisance parameter, where $V(\mu) = g''(\theta)$. Then each
iteration in the fitting of the model by maximum likelihood consists of regressing the
vector whose components are
$$\hat{Y} + (z - \hat{\mu})\frac{dY}{d\mu} \tag{6}$$
on $X$, and using weights
$$\left(\frac{d\mu}{dY}\right)^2 \Big/ V(\hat{\mu}). \tag{7}$$
Here $\hat{\mu}$ and $\hat{Y}$ stand for the current approximation to $\mu$ and $Y$ and the
fitted values of the regression provide a new value for $\hat{Y}$ from which a new value
for $\hat{\mu}$ can be calculated. We generally start by taking $\hat{\mu} = z$.
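As a worked illustration (not taken from the paper), the working variate and weights of Nelder and Wedderburn (1972), $y = \hat{Y} + (z - \hat{\mu})\,dY/d\mu$ and $w = (d\mu/dY)^2 / V(\hat{\mu})$, can be written out for Poisson errors, where $V(\mu) = \mu$. The two link functions below are chosen only as examples.

```latex
% Poisson errors, V(\mu) = \mu.
% Log link: Y = \ln\mu, so dY/d\mu = 1/\mu and d\mu/dY = \mu.
y = \hat{Y} + \frac{z - \hat{\mu}}{\hat{\mu}}, \qquad
w = \frac{\hat{\mu}^{2}}{\hat{\mu}} = \hat{\mu}.

% Identity link: Y = \mu, so dY/d\mu = d\mu/dY = 1.
y = \hat{\mu} + (z - \hat{\mu}) = z, \qquad
w = \frac{1}{\hat{\mu}}.
```

The identity-link case is the one that arises for the contingency-table constraints of Section 3, where the working variate never changes and only the weights are updated.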
Suppose now that we have the same distributional assumptions for $z$, but the model is
expressed in the form
$$E(z) = \mu, \quad \text{where } \mu_i = f(Y_i) \text{ and } LY = 0. \tag{8}$$
Following the argument of Section 1 we can find $X$ so that the model takes the form
(4). Then the iterative method already described can be used, and each step of the
iteration is equivalent to fitting the regression model (2). Clearly we may fit (3)
instead, and then use the residuals as the new value for $\hat{Y}$. The algorithm then
takes the following form:
(a) Set $\hat{\mu} = z$ and calculate $\hat{Y}$ from $\hat{\mu}$. Set $y = \hat{Y}$.
(b) Calculate the diagonal matrix $W$ using (7).
(c) Regress $y$ on $W^{-1}L'$ using $W$ as weight matrix. Set $\hat{Y}$ = residuals and
calculate $\hat{\mu}$ from $\hat{Y}$. If the process has gone far enough stop.
(d) Calculate $y$ from (6) and go to step (b).
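Steps (a) to (d) can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the GENSTAT procedure used in the paper: the function name `fit_constrained_glm`, its arguments, the convergence rule and the small 2 x 2 example are all hypothetical. The example uses Poisson errors with a log link and the single constraint $Y_{11} - Y_{12} - Y_{21} + Y_{22} = 0$ (no interaction on the log scale), for which the maximum likelihood fit is known to be (row total)(column total)/(grand total).

```python
import numpy as np


def fit_constrained_glm(z, L, link, inv_link, dmu_dY, variance,
                        n_iter=100, tol=1e-10):
    """Fit E(z) = mu, mu = f(Y), subject to L @ Y = 0, following steps (a)-(d)."""
    z = np.asarray(z, dtype=float)
    mu = z.copy()                     # (a) start from mu = z
    Y = link(mu)
    y = Y.copy()                      # (a) the first working variate is Y itself
    for _ in range(n_iter):
        w = dmu_dY(Y) ** 2 / variance(mu)   # (b) diagonal of W, weights (7)
        A = L.T / w[:, None]                # regressors W^{-1} L'
        # (c) weighted regression of y on A; here A'WA = L A and A'Wy = L y
        gamma = np.linalg.solve(L @ A, L @ y)
        Y_new = y - A @ gamma               # the residuals become the new Y
        mu = inv_link(Y_new)
        converged = np.max(np.abs(Y_new - Y)) < tol
        Y = Y_new
        if converged:
            break
        y = Y + (z - mu) / dmu_dY(Y)        # (d) working variate (6)
    return Y, mu


# Hypothetical 2 x 2 table of counts, cells ordered (1,1), (1,2), (2,1), (2,2).
z = np.array([10.0, 20.0, 30.0, 40.0])
L = np.array([[1.0, -1.0, -1.0, 1.0]])      # no-interaction constraint on log mu

Y_hat, mu_hat = fit_constrained_glm(
    z, L,
    link=np.log, inv_link=np.exp,
    dmu_dY=np.exp,                          # log link: d mu / d Y = exp(Y) = mu
    variance=lambda mu: mu,                 # Poisson: V(mu) = mu
)
print(np.round(mu_hat, 4))
```

Only a $p \times p$ system is solved in each iteration, so the cost per step is governed by the number of constraints rather than by the number of free parameters.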
If we have inhomogeneous constraints of the form
$$LY = c$$
then we simply have to choose any vector $a$ such that
$$La = c$$
and redefine $f$ in (8) so that $Y$ is replaced by $Y - a$. The constraints then become
homogeneous. This means that the function $f$ will be different for each observation,
but it should be noted that although the algorithm has been described as if $f$ of (8)
and $g$ and $h$ of (5) were the same for each observation, there is no need for them
to be so.
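The shift to homogeneous constraints is easiest to see in the normal least-squares case of Section 1, where $f$ is the identity. The sketch below assumes NumPy and randomly generated $L$, $c$, weights and data (all hypothetical). It picks one particular $a$ with $La = c$, fits $y - a$ under $LY = 0$ by regressing on $W^{-1}L'$, and adds $a$ back.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 6, 2
L = rng.normal(size=(p, n))           # constraint matrix
c = np.array([1.0, -2.0])             # inhomogeneous right-hand side
w = rng.uniform(0.5, 2.0, size=n)     # diagonal of W
y = rng.normal(size=n)

# Any a with L a = c will do; lstsq returns the minimum-norm solution.
a = np.linalg.lstsq(L, c, rcond=None)[0]

A = L.T / w[:, None]                  # regressors W^{-1} L'
y0 = y - a                            # shifted data, now subject to L Y = 0
gamma = np.linalg.solve(L @ A, L @ y0)
Y_hat = a + (y0 - A @ gamma)          # residuals of the regression, shifted back

# Closed form for the same constrained weighted least-squares problem.
Y_direct = y - A @ np.linalg.solve(L @ A, L @ y - c)

print(np.allclose(L @ Y_hat, c))
print(np.allclose(Y_hat, Y_direct))
```

The closed form does not involve $a$, which shows that the fitted values do not depend on which particular solution of $La = c$ is chosen.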

3. APPLICATION-MARGINAL HOMOGENEITY IN CONTINGENCY TABLES


Several authors have considered testing whether the two margins of a square
contingency table may be considered to have equal expectation, e.g. Stuart (1955)
and Ireland, Ku and Kullback (1969). Stuart (1955) says that the likelihood-ratio
principle yields an intractable result. Kullback (1971a) considered the corresponding
problem for multidimensional tables.
It turns out that the method described above can be used to provide maximum
likelihood estimates of the cell frequencies $p_{ij}$ subject to the constraints
$$\sum_i p_{ij} = \sum_i p_{ji} \quad \text{for } j = 1, \ldots, n.$$

Let the observations be $n_{ij}$. As pointed out in Nelder and Wedderburn (1972), for
the purposes of maximum likelihood estimation we can treat a sample from a multinomial
distribution as if it consisted of independent Poisson observations. This is most
easily seen as follows: suppose we have a sample $n_1, \ldots, n_k$ from a multinomial
distribution with parameters $p_1, \ldots, p_k$; the log-likelihood is
$$L(p_1, \ldots, p_k) = \sum_i n_i \ln p_i.$$
If we regard the observations as coming from Poisson distributions with mean $mp_i$,
we obtain a log-likelihood
$$L^*(p_1, \ldots, p_k, m) = \sum_i n_i \ln m - m + L(p_1, \ldots, p_k).$$
The term $\sum_i n_i \ln m - m$ does not involve $p_1, \ldots, p_k$ and so likelihood
ratio tests of hypotheses concerning the $p$'s using $L^*$ will give the same results
as tests using $L$.
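The intermediate step behind the expression for $L^*$ can be written out as follows. This is a short derivation under the usual convention that additive terms not involving the parameters (here $-\sum_i \ln n_i!$) are dropped from the log-likelihood.

```latex
\begin{aligned}
L^*(p_1, \ldots, p_k, m)
  &= \sum_i \bigl\{ n_i \ln(m p_i) - m p_i \bigr\} \\
  &= \sum_i n_i \ln m + \sum_i n_i \ln p_i - m \sum_i p_i \\
  &= \sum_i n_i \ln m - m + L(p_1, \ldots, p_k),
  \qquad \text{since } \sum_i p_i = 1 .
\end{aligned}
```

Maximizing over $m$ gives $\hat{m} = \sum_i n_i$ whatever the $p$'s, so the extra term takes the same value under any two hypotheses about the $p$'s and cancels from the likelihood ratio.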
If the expected cell frequencies are $\mu_{ij}$, the hypothesis of marginal homogeneity
may be expressed as
$$\sum_i \mu_{ij} = \sum_i \mu_{ji} \quad \text{for } j = 1, \ldots, n.$$
Only $n - 1$ of these constraints are linearly independent. Using the notation already
developed for generalized linear models, we call the observed frequencies $z_{ij}$. We
have $Y_{ij} = \mu_{ij}$, so that the y-variate for each iteration is just the set of
$z_{ij}$'s, and $W_{ij} = 1/\mu_{ij}$.
The $k$th constraint may be written $\sum_{ij} l_{ij}^{(k)} \mu_{ij} = 0$ where
$l_{ij}^{(k)} = \delta_{ik} - \delta_{jk}$, where $\delta_{ik}$ etc. are Kronecker
$\delta$ symbols. Then we use as independent variates
$$x_{ij}^{(k)} = \mu_{ij}(\delta_{ik} - \delta_{jk}) \quad \text{for } k = 1, 2, \ldots, n - 1$$
and the residuals provide the next approximation to $\mu_{ij}$.
Example 1
Stuart (1955) applied a test of marginal homogeneity to the data of Table 1.
Stuart obtained a value for $\chi^2$ of 11.96 with 3 degrees of freedom; Ireland, Ku
and Kullback (1969) using several methods obtained 11.998, 12.010 and 11.978.
The $\chi^2$ calculated from the likelihood ratio was called the "deviance" by Nelder and
Wedderburn; the value obtained using the method of this paper was 11.986. Of
course, with quite large numbers in the table we would expect all the methods, which
are asymptotically equivalent, to give similar answers. The values of $\hat{\mu}_{ij}$ are
given along with the data in Table 1. We notice that the observed values are
consistently larger than the fitted values above the diagonal of the table and
consistently less below the diagonal. It seems that left eye vision tended to be weaker
than right eye vision.
We inevitably find that $\hat{\mu}_{ii} = z_{ii}$. This is not the case in Table 6.3 of
Ireland, Ku and Kullback, and Stuart's method does not provide expected frequencies
at all.

TABLE 1
Unaided distance vision of 7,477 women aged 30-39 with fitted values ($\hat{\mu}_{ij}$) in brackets

                                     Left eye
                  Highest     Second      Third       Lowest
Right eye         grade       grade       grade       grade       Total

Highest grade     1,520       266         124         66          1,976
                  (1,520.0)   (252.5)     (111.8)     (57.0)      (1,941.3)
Second grade      234         1,512       432         78          2,256
                  (247.2)     (1,512.0)   (409.4)     (70.6)      (2,239.2)
Third grade       117         362         1,772       205         2,456
                  (131.3)     (383.1)     (1,772.0)   (195.3)     (2,481.7)
Fourth grade      36          82          179         492         789
                  (42.8)      (91.6)      (188.4)     (492.0)     (814.8)
Total             1,907       2,222       2,507       841         7,477
                  (1,941.3)   (2,239.2)   (2,481.7)   (814.8)     (7,477.0)
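The fit in Table 1 can be retraced with a short NumPy sketch. It is an illustration only, assuming the identity-link Poisson form of the algorithm described in this section (y-variate fixed at $z_{ij}$, weights $1/\mu_{ij}$, regressors $\mu_{ij}(\delta_{ik} - \delta_{jk})$); the variable names, iteration limit and stopping rule are hypothetical, and the original calculations were done in GENSTAT.

```python
import numpy as np

# Observed counts z_ij from Table 1 (rows: right eye, columns: left eye).
z = np.array([[1520,  266,  124,  66],
              [ 234, 1512,  432,  78],
              [ 117,  362, 1772, 205],
              [  36,   82,  179, 492]], dtype=float)
n = z.shape[0]

# Row k of L holds l_ij^(k) = delta_ik - delta_jk for k = 1, ..., n-1,
# i.e. (row k total) - (column k total) = 0.
L = np.zeros((n - 1, n * n))
for k in range(n - 1):
    l_k = np.zeros((n, n))
    l_k[k, :] += 1.0
    l_k[:, k] -= 1.0
    L[k] = l_k.ravel()

zv = z.ravel()
mu = zv.copy()                                 # start from mu = z
for _ in range(100):
    A = mu[:, None] * L.T                      # W^{-1} L' with W = diag(1/mu)
    gamma = np.linalg.solve(L @ A, L @ zv)     # normal equations: A'WA = L A, A'Wz = L z
    mu_new = zv - A @ gamma                    # residuals give the next mu
    done = np.max(np.abs(mu_new - mu)) < 1e-8
    mu = mu_new
    if done:
        break

mu_hat = mu.reshape(n, n)
# Poisson deviance (likelihood-ratio chi-squared) against the saturated model.
deviance = 2.0 * np.sum(zv * np.log(zv / mu) - (zv - mu))
print(np.round(mu_hat, 1))
print(round(deviance, 3))
```

The printed fitted values and deviance can be compared with the bracketed entries of Table 1 and with the value 11.986 reported in Example 1. The first pass of the loop uses weights based on the observed counts, so the matrix `L @ A` at that stage is built from the observed table alone.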

Example 2. A $2^4$ contingency table
Kullback (1971b) considered the data in Table 2 which shows the distribution of
the sexes in the first four births in 36,536 families. He gave a test to determine
whether the sex ratios in the first four births are equal. The constraints here are
$$\sum_{jkl} \mu_{ijkl} = \sum_{jkl} \mu_{jikl} = \sum_{jkl} \mu_{jkil} = \sum_{jkl} \mu_{jkli}.$$
As in the previous example we have $\mu = Y$.

TABLE 2
Distribution of sexes among first four birth orders

              k:       M                 F
              l:    M       F        M       F
  i    j
  M    M          2,574   2,469    2,401   2,313     9,757
       F          2,478   2,289    2,329   2,121     9,217
  F    M          2,340   2,258    2,276   2,209     9,083
       F          2,253   2,084    2,107   2,035     8,479

                  9,645   9,100    9,113   8,678    36,536

A likelihood ratio test gave a deviance of 3.656 with 3 d.f. This compares with
3.672 produced by Kullback. It seems that the data show little evidence of inequality
of the sex ratios in different birth orders.

These calculations were done using the GENSTAT system developed at Rothamsted,
which includes all the necessary facilities for iterative regression by allowing the
storing of residuals or fitted values, general arithmetic vector operations and looping.
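For readers without access to GENSTAT, the constraint matrix for Example 2 can be built from indicator vectors of the 16 cells. The sketch below is a hypothetical NumPy reconstruction, not the original program. It assumes the layout of Table 2 read as rows $(i, j)$ and columns $(k, l)$, codes M as 0 and F as 1, and expresses the hypothesis as three contrasts equating the expected number of boys at birth 1 with that at births 2, 3 and 4.

```python
import itertools
import numpy as np

# Table 2 counts indexed [i, j, k, l] with 0 = M and 1 = F (births 1 to 4).
z = np.array([
    [[[2574, 2469], [2401, 2313]],    # i = M, j = M
     [[2478, 2289], [2329, 2121]]],   # i = M, j = F
    [[[2340, 2258], [2276, 2209]],    # i = F, j = M
     [[2253, 2084], [2107, 2035]]],   # i = F, j = F
], dtype=float).ravel()

# One row per cell, in the same (C) order as z.ravel(); column b flags a boy at birth b+1.
cells = np.array(list(itertools.product([0, 1], repeat=4)))
male = (cells == 0).astype(float)

# Three independent constraints: boys at birth 1 minus boys at birth b, b = 2, 3, 4.
L = np.stack([male[:, 0] - male[:, b] for b in (1, 2, 3)])

mu = z.copy()                                  # start from mu = z
for _ in range(100):
    A = mu[:, None] * L.T                      # W^{-1} L' with W = diag(1/mu)
    gamma = np.linalg.solve(L @ A, L @ z)
    mu_new = z - A @ gamma                     # residuals give the next mu
    done = np.max(np.abs(mu_new - mu)) < 1e-8
    mu = mu_new
    if done:
        break

deviance = 2.0 * np.sum(z * np.log(z / mu) - (z - mu))
print(np.round(male.T @ mu, 1))                # fitted numbers of boys at each birth order
print(round(deviance, 3))
```

The deviance printed here can be compared with the 3.656 on 3 d.f. reported in Example 2. The code differs from the square-table case only in how `L` is constructed.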
4. CONCLUSION
The method of this paper is an extension of that originally described for generalized
linear models. An application has been described, namely that of testing the
hypothesis of marginal homogeneity of contingency tables. In the absence of a
convenient method for applying maximum likelihood to this problem, other methods
have proliferated. No doubt other applications of the method (such as one concerning
variance component estimation by G. N. Wilkinson to be described elsewhere) will be
found.

REFERENCES
BRUNT, D. (1917). The Combination of Observations. Cambridge: University Press.
IRELAND, C. T., KU, H. H. and KULLBACK, S. (1969). Symmetry and marginal homogeneity of an
r x r contingency table. J. Amer. Statist. Ass., 64, 1323-1341.
KULLBACK, S. (1971a). Marginal homogeneity of multidimensional contingency tables. Ann.
Math. Statist., 42, 594-606.
-- (1971b). The homogeneity of the sex ratio of adjacent sibs in human families. Biometrics,
27, 452-457.
NELDER, J. A. (1966). Inverse polynomials, a useful group of multifactor response functions.
Biometrics, 22, 128-141.
-- (1968). Weighted regression, quantal response data, and inverse polynomials. Biometrics,
24, 979-985.
NELDER, J. A. and WEDDERBURN, R. W. M. (1972). Generalized linear models. J. R. Statist. Soc.
A, 135, 370-384.
STUART, A. (1955). A test for homogeneity of the marginal distributions in a two-way
classification. Biometrika, 42, 412-416.
