
Two Group Discriminant Function Analysis

In DFA one wishes to predict group membership from a set of (usually continuous) predictor variables. In the simplest case one has two groups and p predictor variables. A linear discriminant equation, Di = a + b1 X1 + b2 X2 + ... + bp Xp, is constructed such that the two groups differ as much as possible on D. That is, the weights are chosen so that were you to compute a discriminant score (Di) for each subject and then do an ANOVA on D, the ratio of the between-groups sum of squares to the within-groups sum of squares would be as large as possible. The value of this ratio is the eigenvalue. "Eigen" can be translated from the German as "own," "peculiar," "original," "singular," etc. Check out the page at http://core.ecu.edu/psyc/wuenschk/StatHelp/eigenvalue.txt for a discussion of the origins of the term "eigenvalue."

Read the following article, which has been placed on reserve in Joyner: Castellow, W. A., Wuensch, K. L., & Moore, C. H. (1990). Effects of physical attractiveness of the plaintiff and defendant in sexual harassment judgments. Journal of Social Behavior and Personality, 5, 547-562. The data for this analysis are those used for the research presented in that article. They are in the SPSS data file "Harass90.sav." Download it from my SPSS-Data page and bring it into SPSS.

To do the discriminant analysis, click Analyze, Classify, Discriminant. Place the Verdict variable into the Grouping Variable box and define the range from 1 to 2. Place the 22 rating scale variables (D_excit through P_happy) in the "Independents" box. We are using the ratings the jurors gave the defendant and the plaintiff to predict the verdict. Under Statistics, ask for Means, Univariate ANOVAs, Box's M, Fisher's Coefficients, and Unstandardized Coefficients. Under Classify, ask for Priors Computed From Group Sizes and for a Summary Table. Under Save, ask that the discriminant scores be saved. Now look at the output.
The means show that when the defendant was judged not guilty he was rated more favorably on all 11 scales than when he was judged guilty. When the defendant was judged not guilty, the plaintiff was rated less favorably on all 11 scales than when a guilty verdict was returned. The Tests of Equality of Group Means show that the groups differ significantly on every variable except plaintiff excitingness, calmness, independence, and happiness. The discriminant function in unstandardized units (Canonical Discriminant Function Coefficients) is D = -0.064 + .083 D_excit + ... + .029 P_happy. The group centroids (mean discriminant scores) are -0.785 for the Guilty group and 1.491 for those jurors who decided the defendant was not guilty. High scores on the discriminant function are associated with the juror deciding to vote not guilty.
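To see how a discriminant score is computed from the unstandardized function, here is a minimal Python sketch. Only the constant (-0.064) and the first weight (.083) come from the output described above; the remaining weights and the subject's ratings are invented, and only three of the 22 rating scales are used:

```python
# Computing one subject's discriminant score, D = a + b1*X1 + ... + bp*Xp.
# The constant and first weight are from the handout; the other weights and
# the ratings are hypothetical, and the real function uses all 22 scales.
constant = -0.064
weights = [0.083, -0.05, 0.04]   # entries after the first are invented
ratings = [7, 3, 8]              # one subject's hypothetical ratings

d_score = constant + sum(w * x for w, x in zip(weights, ratings))
print(round(d_score, 3))  # → 0.687
```

The resulting score would then be compared with the group centroids (-0.785 for guilty, 1.491 for not guilty); this one lies closer to the not-guilty centroid.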

Copyright 2008, Karl L. Wuensch - All rights reserved.


The eigenvalue on D (the quantity maximized by the obtained discriminant function coefficients) is the ratio SS_between_groups / SS_within_groups; for our data it is 1.187. The canonical correlation on D (equivalent to eta in an ANOVA, and equal to the point biserial r between Group and D) is sqrt(SS_between_groups / SS_total); for our data it is .737. Wilks' lambda is used to test the null hypothesis that the populations have identical means on D. Wilks' lambda = SS_within_groups / SS_total, so the smaller the lambda, the more doubt is cast upon that null hypothesis. SPSS uses a chi-square approximation to obtain a significance level; for our data, p < .0001. We can determine how much of the variance in the grouping variable is explained by our predictor variables by subtracting lambda from one. For our data that is 54% (also the value of the squared canonical correlation).

DFA is mathematically equivalent to a MANOVA. Looking at our analysis from the perspective of a MANOVA: when we combine the rating scales with weights that maximize group differences on the resulting linear combination, the groups do differ significantly from one another. Such a MANOVA is sometimes done prior to doing univariate analyses to provide a bit of protection against inflation of alpha. Recall that the grouping variable is the predictor variable in MANOVA (it is what is being predicted in DFA) and the rating scales are the MANOVA outcome variables (and our DFA predictor variables). If the MANOVA is not significant, we stop. If it is significant, we may go on to do an ANOVA on each dependent variable. SPSS gave us those ANOVAs.

We have created (or discovered) a dimension (like a component in PCA) on which the two groups differ. The univariate ANOVAs may help us explain the nature of the relationship between this discriminant dimension and the grouping variable. For example, some of the variates may have a significant relationship with the grouping variable and others might not, but the univariate ANOVAs totally ignore the correlations among the variates. It is possible for the groups to differ significantly on D but not on any one predictor by itself. The standardized discriminant function coefficients may help.
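The relationships among the eigenvalue, Wilks' lambda, and the canonical correlation can be sketched in a few lines of Python. The discriminant scores below are made up for illustration; they are not the Harass90 data:

```python
# Hypothetical discriminant scores for two verdict groups, used only to
# illustrate the eigenvalue / lambda / canonical-r identities from the text.
guilty = [-1.2, -0.9, -0.4, -1.0, -0.5]
not_guilty = [1.1, 0.7, 1.4, 0.8]

all_d = guilty + not_guilty
grand_mean = sum(all_d) / len(all_d)

def ss(scores, mean):
    """Sum of squared deviations of scores about mean."""
    return sum((x - mean) ** 2 for x in scores)

ss_total = ss(all_d, grand_mean)
ss_within = sum(ss(g, sum(g) / len(g)) for g in (guilty, not_guilty))
ss_between = ss_total - ss_within

eigenvalue = ss_between / ss_within
wilks_lambda = ss_within / ss_total
canonical_r = (ss_between / ss_total) ** 0.5

# The three quantities are linked: lambda = 1 / (1 + eigenvalue),
# and the squared canonical correlation = 1 - lambda.
assert abs(wilks_lambda - 1 / (1 + eigenvalue)) < 1e-12
assert abs(canonical_r ** 2 - (1 - wilks_lambda)) < 1e-12
```

Run against the actual saved discriminant scores from SPSS, the same code would reproduce the 1.187, .737, and 54% figures quoted above.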
These may be treated as beta weights in a multiple regression predicting D from z scores on the X's: Di = β1 Z1 + β2 Z2 + ... + βp Zp. Of course, one must realize that these coefficients reflect the contribution of one variate in the context of the other variates in the model. A low standardized coefficient might mean that the groups do not differ much on that variate, or it might just mean that that variate's correlation with the grouping variable is redundant with that of another variate in the model. Suppressor effects can also occur. Correlations between variates and D may also be helpful. These are available in the loading or structure matrix. Generally, any variate with a loading of .30 or more is considered to be important in defining the discriminant dimension. These correlations may help us understand the discriminant function we have created. Note that high scores on our D are associated with the defendant being rated as sincere, kind, happy, warm, and calm, and

with the plaintiff being rated as cold, insincere, and cruel. D scores were higher (mean = 1.49) for jurors who voted not guilty than for those who voted guilty (mean = -0.78).

If your primary purpose is to predict group membership from the variates (rather than to examine group differences on the variates), you need to do classification. SPSS classifies subjects into predicted groups using Bayes' rule:

    p(Gi | D) = p(Gi) p(D | Gi) / Σ[i = 1 to g] p(Gi) p(D | Gi)
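A minimal Python sketch of this classification rule, assuming (for illustration) that D is normally distributed within each group with unit within-group standard deviation; the priors (.655 and .345) and centroids (-0.785 and 1.491) are the values reported in this handout:

```python
import math

# Bayes'-rule classification from a discriminant score.
# Priors and centroids from the text; unit within-group SD and
# within-group normality of D are assumptions of this sketch.
priors = {"guilty": 0.655, "not guilty": 0.345}
centroids = {"guilty": -0.785, "not guilty": 1.491}

def normal_density(x, mean, sd=1.0):
    """Normal density of x for the given mean and standard deviation."""
    return math.exp(-((x - mean) ** 2) / (2 * sd * sd)) / (sd * math.sqrt(2 * math.pi))

def posteriors(d_score):
    """p(Gi | D) = p(Gi) p(D | Gi) / sum over groups of p(Gi) p(D | Gi)."""
    numerators = {g: priors[g] * normal_density(d_score, centroids[g]) for g in priors}
    total = sum(numerators.values())
    return {g: num / total for g, num in numerators.items()}

def classify(d_score):
    """Assign the subject to the group with the higher posterior probability."""
    post = posteriors(d_score)
    return max(post, key=post.get)

print(classify(1.2))   # a score near the not-guilty centroid → "not guilty"
```

Note that because the guilty group has the larger prior, a subject with a score midway between the centroids is pulled toward a "guilty" classification.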

Each subject's discriminant score is used to determine the posterior probabilities of being in each of the two groups. The subject is then classified into (predicted to be in) the group with the higher posterior probability. By default, SPSS assumes that all groups have equal prior probabilities: for two groups, each prior = 1/2; for three, 1/3; etc. I asked SPSS to use the group relative frequencies as priors, which should result in better classification.

Another way to classify subjects is to use Fisher's classification function coefficients. For each subject a D is computed for each group, and the subject is classified into the group for which that D is highest. To compute a subject's D1, you would multiply that subject's scores on the 22 rating scales by the indicated coefficients and sum them and the constant. For D2 you would do the same with the coefficients for Group 2. If D1 > D2, you classify the subject into Group 1; if D2 > D1, into Group 2.

The classification results table shows that we correctly classified 88% of the subjects. To evaluate how good this is, we should compare 88% with what would be expected by chance. By just randomly classifying half into group 1 and half into group 2, you would expect to get .5(.655) + .5(.345) = 50% correct. Given that the marginal distribution of Verdict is not uniform, you would do better by randomly putting 65.5% into group 1 and 34.5% into group 2 ("probability matching"), in which case you would expect to be correct .655(.655) + .345(.345) = 54.8% of the time. Even better would be to "probability maximize" by just placing every subject into the most likely group, in which case you would be correct 65.5% of the time. We can do significantly better than any of these by using our discriminant function.

Assumptions: Multivariate normality of the predictors is assumed. One may hope that large sample sizes make the DFA sufficiently robust that one need not worry about moderate departures from normality.
One also assumes that the variance-covariance matrix of the predictor variables is the same in all groups (so we can obtain a pooled matrix to estimate error variance). Box's M tests this assumption, and it indicates a problem with our example data. For the validity of the significance tests, one generally does not worry about this if sample sizes are equal, and with unequal sample sizes one need not worry unless p < .001. The DFA is thought to be very robust, and Box's M is very sensitive. Non-normality also tends to lower the p for Box's M. The classification procedures, however, are not as robust as the significance tests. One may need to transform variables, do a quadratic DFA (SPSS won't do this), or ask that separate rather than pooled variance-covariance matrices be used. Pillai's criterion (rather than Wilks' lambda) may provide additional robustness for

significance testing; although this criterion is not available with SPSS Discriminant, it is available with SPSS MANOVA.

ANOVA on D: Conduct an ANOVA comparing the verdict groups on the discriminant function. You can then demonstrate that the DFA eigenvalue is equal to the ratio of the SS_between to the SS_within from that ANOVA, and that the ratio of the SS_between to the SS_total is the squared canonical correlation coefficient from the DFA.

Correlation Between Groups and D: Correlate the discriminant scores with the verdict variable. You will discover that the resulting point biserial correlation coefficient is the canonical correlation from the DFA.

SAS: Obtain the data file Harass90.dat from my StatData page and the program DFA2.sas from my SAS Programs Page. Run the program. It uses SAS to do essentially the same analysis we just did with SPSS. Look at the output from PROC REG, which did a multiple regression to predict group membership (1, 2) from the rating scales. Notice that SS_model / SS_error equals the eigenvalue from the DFA and that SS_error / SS_total equals the Wilks lambda from the DFA. The square root of the R² equals the canonical correlation from the DFA. The unstandardized discriminant function coefficients (raw canonical coefficients) are equal to the standardized discriminant function coefficients (pooled within-class standardized canonical coefficients) divided by the pooled (within-group) standard deviations. Note also that the DFA's discriminant function coefficients are a linear transformation of the multiple regression b weights (multiply each by 4.19395 and you get the unstandardized discriminant function coefficients). I do not know what determines the value of this constant; I determined it empirically for this set of data.
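The regression equivalences described in the SAS section can be verified numerically. A minimal Python sketch with made-up data (not the Harass90 data), using ordinary least squares in place of PROC REG:

```python
import numpy as np

# Regress group membership (coded 1/2) on the predictors, then check:
#   SS_model / SS_error = the eigenvalue
#   SS_error / SS_total = Wilks' lambda
#   sqrt(R^2)           = the canonical correlation
# The two groups of predictor scores below are invented for illustration.
rng = np.random.default_rng(0)
n1, n2 = 12, 8
X = np.vstack([rng.normal(0, 1, (n1, 3)), rng.normal(1, 1, (n2, 3))])
y = np.concatenate([np.ones(n1), 2 * np.ones(n2)])  # group codes 1 and 2

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(len(y)), X])
b, *_ = np.linalg.lstsq(A, y, rcond=None)
fitted = A @ b

ss_total = np.sum((y - y.mean()) ** 2)
ss_error = np.sum((y - fitted) ** 2)
ss_model = ss_total - ss_error

eigenvalue = ss_model / ss_error
wilks_lambda = ss_error / ss_total
canonical_r = np.sqrt(ss_model / ss_total)  # = sqrt(R^2)

# The same identities hold as in the DFA output.
assert np.isclose(wilks_lambda, 1 / (1 + eigenvalue))
assert np.isclose(canonical_r ** 2, 1 - wilks_lambda)
```

With two groups, the multiple regression on a group code carries the same information as the single discriminant function, which is why these quantities line up.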

