
Applications of Discriminant Analysis
1. The major application area for this technique is where we want to be able to distinguish between two or three sets of objects or people, based on the knowledge of some of their characteristics.

2. Examples include the selection process for a job, the admission process of an educational programme in a college, or dividing a group of people into potential buyers and non-buyers.

3. Discriminant analysis can be, and in fact is, used by credit rating agencies to rate individuals and classify them into good lending risks or bad lending risks.

4. To summarise, we can use linear discriminant analysis when we have to classify objects into two or more groups based on the knowledge of some variables (characteristics) related to them. Typically, these groups would be users versus non-users, potentially successful versus potentially unsuccessful salesmen, high risk versus low risk consumers, or groups on similar lines.
The Discriminant Analysis Model
1. Discriminant analysis is very similar to the multiple regression technique. The form of the equation in a two-variable discriminant analysis is:

Y = a + k1x1 + k2x2

2. This is called the discriminant function. Also, as in a regression analysis, Y is the dependent variable and x1 and x2 are independent variables. k1 and k2 are the coefficients of the independent variables, and a is a constant. In practice, there may be any number of x variables.

3. Please note that Y in this case is a categorical variable (unlike in regression analysis, where it is continuous). x1 and x2 are, however, continuous (metric) variables. k1 and k2 are determined by appropriate algorithms in the computer package used, but the underlying objective is that these two coefficients should maximise the separation or difference between the two groups of the Y variable.

4. Y will have 2 possible values in a 2-group discriminant analysis, 3 values in a 3-group discriminant analysis, and so on.

5. k1 and k2 are also called the unstandardised discriminant function coefficients.

6. As mentioned above, Y is a classification into 2 or more groups and is therefore a 'grouping' variable, in the terminology of discriminant analysis. That is, groups are formed on the basis of existing data, and coded as 1 and 2, similar to dummy variable coding.

7. The independent (x) variables are continuous scale variables, and are used as predictors of the group to which the objects will belong. Therefore, to be able to use discriminant analysis, we need to have some data on y and the x variables from experience and/or past records. (A minimal code sketch of this setup follows this list.)
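As a concrete illustration of points 6 and 7, here is a minimal sketch of a two-group discriminant analysis in Python, using scikit-learn's LinearDiscriminantAnalysis. The data values and the variable names x1 and x2 are invented for illustration; they are not from the text.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Continuous (metric) predictors: one row per object, columns x1, x2.
X = np.array([[2.1, 7.0], [1.8, 6.5], [3.0, 8.2],   # group 1
              [5.5, 2.1], [6.0, 3.3], [5.2, 2.8]])  # group 2
# The grouping (categorical) variable, coded 1 and 2 as in dummy coding.
y = np.array([1, 1, 1, 2, 2, 2])

lda = LinearDiscriminantAnalysis()
lda.fit(X, y)

# k1, k2 (unstandardised coefficients) and the constant a:
print("coefficients k1, k2:", lda.coef_[0])
print("constant a:", lda.intercept_[0])
```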
Building the Discriminant Model
Assuming we have data on both the y and x variables of interest, we estimate the coefficients of the model, which is a linear equation of the form shown earlier, and use the coefficients to calculate the Y value (discriminant score) for any new data points that we want to classify into one of the groups. A decision rule is formulated for this process: to determine the cut-off score, which is usually the midpoint of the mean discriminant scores of the two groups. A small code sketch of this scoring rule follows.
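Continuing the hypothetical sketch above, the scoring and cut-off logic could look like this. The midpoint rule comes from the text; the helper function name is our own.

```python
# Score new data points with the fitted function, then classify them
# against the cut-off: the midpoint of the two group mean scores.
def discriminant_score(x, coef, const):
    # Y = a + k1*x1 + k2*x2 for a single observation x
    return const + coef @ x

coef, const = lda.coef_[0], lda.intercept_[0]
scores = X @ coef + const          # discriminant scores of the existing data
m1, m2 = scores[y == 1].mean(), scores[y == 2].mean()
cutoff = (m1 + m2) / 2             # midpoint of the mean discriminant scores

new_point = np.array([2.5, 7.5])
s = discriminant_score(new_point, coef, const)
# Assign the new point to whichever group's mean score is closer.
group = 1 if abs(s - m1) < abs(s - m2) else 2
```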

Accuracy of the Model
Then, the classification of the existing data points is done using the equation, and the accuracy of the model is determined. This output is given by the classification matrix (also called the confusion matrix), which tells us what percentage of the existing data points is correctly classified by this model. This percentage is somewhat analogous to R² in regression analysis (the percentage of variation in the dependent variable explained by the model). Of course, the actual predictive accuracy of the discriminant model on new cases may be lower than the figure obtained by applying it to the data points on which it was based.
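A quick way to produce the classification (confusion) matrix and the percentage correctly classified, again continuing the hypothetical sketch above:

```python
from sklearn.metrics import confusion_matrix, accuracy_score

# Classify the very data points the model was estimated on, as the
# text describes; rows = observed group, columns = predicted group.
predicted = lda.predict(X)
print(confusion_matrix(y, predicted))
print("percent correctly classified:", 100 * accuracy_score(y, predicted))
```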

Stepwise versus Simultaneous Entry of Variables
Just as in regression, we have the option of entering one variable at a time (stepwise) into the discriminant equation, or entering all the variables we plan to use at once. Depending on the correlations between the independent variables, and the objective of the study (exploratory or predictive/confirmatory), the choice is left to the student.
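For the stepwise option, a rough analogue in Python is scikit-learn's forward sequential feature selection wrapped around the discriminant model. This is only an approximation: classical stepwise discriminant analysis enters variables using F-to-enter tests, which scikit-learn does not implement.

```python
from sklearn.feature_selection import SequentialFeatureSelector

# Enter variables one at a time, keeping the subset that best
# cross-validates; an analogue of (not a substitute for) stepwise entry.
selector = SequentialFeatureSelector(
    LinearDiscriminantAnalysis(),
    n_features_to_select=1,
    direction="forward",
    cv=3,  # small folds because the toy data set above is tiny
)
selector.fit(X, y)
print("variables entered:", selector.get_support())
```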
Relative Importance of the Independent Variables
1. Suppose we have two independent variables, x1 and x2. How do we know which one is more important in discriminating between groups?

2. The coefficients of x1 and x2 are the ones which provide the answer, but not the raw (unstandardised) coefficients. To overcome the problem of different measurement units, we must obtain standardised discriminant coefficients. These are available from the computer output.

3. The higher the standardised discriminant coefficient of a variable, the higher its discriminating power.
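One simple way to obtain unit-free coefficients in code is to z-score the predictors before fitting, so the resulting coefficients can be compared directly. This is a sketch of the idea only; packages such as SPSS standardise using pooled within-group standard deviations, which differs slightly from the overall z-scoring used here.

```python
from sklearn.preprocessing import StandardScaler

# Standardise each predictor to mean 0, standard deviation 1, so the
# fitted coefficients no longer depend on the measurement units.
X_std = StandardScaler().fit_transform(X)
lda_std = LinearDiscriminantAnalysis().fit(X_std, y)
# The larger the absolute standardised coefficient, the greater
# that variable's discriminating power.
print("standardised coefficients:", lda_std.coef_[0])
```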
A Priori Probabilities of Classification
The discriminant analysis algorithm requires us to assign an a priori (before analysis) probability of a given case belonging to one of the groups. There are two ways of doing this:

- We can assign an equal probability of assignment to all groups. Thus, in a 2-group discriminant analysis, we can assign 0.5 as the probability of a case being assigned to either group.

- We can formulate any other rule for the assignment of probabilities. For example, the probabilities could be proportional to the group sizes in the sample data. If two thirds of the sample is in one group, the a priori probability of a case being in that group would be 0.67 (two thirds).
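Both assignment rules map directly onto the priors argument of scikit-learn's LinearDiscriminantAnalysis, as this short sketch assumes:

```python
# Rule 1: equal a priori probabilities (0.5 each in a 2-group analysis).
lda_equal = LinearDiscriminantAnalysis(priors=[0.5, 0.5]).fit(X, y)

# Rule 2: priors proportional to group sizes in the sample; this is
# also scikit-learn's default behaviour when `priors` is omitted.
lda_proportional = LinearDiscriminantAnalysis().fit(X, y)
```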
We will turn now to a complete worked example which will clarify
many of the concepts explained earlier. We will begin with the
problem statement and input data.

Problem Statement
Suppose State Bank of Bhubaneswar (SBB) wants to start a credit card division. It wants to use discriminant analysis to set up a system to screen applicants and classify them as either 'low risk' or 'high risk' (risk of default on credit card bill payments), based on information collected from their credit card applications.

Suppose SBB has managed to get from SBI, its sister bank, some data on SBI's credit card holders who turned out to be 'low risk' (no default) and 'high risk' (defaulting on payments) customers. These data on 18 customers are given in fig. 1.
Fig. 1: Data on 18 SBI credit card customers, showing each customer's age, monthly income, years of marriage, and observed risk group (low risk / high risk). The individual values in this table are not legible in the source.
We will perform a discriminant analysis and advise SBB on how to set up its system to screen potential good customers (low risk) from bad customers (high risk). In particular, we will build a discriminant function (model) and find out:

- The percentage of customers that it is able to classify correctly.

- The statistical significance of the discriminant function.

- Which variables (age, income, or years of marriage) are relatively better at discriminating between 'low' and 'high' risk applicants.

- How to classify a new credit card applicant into one of the two groups, 'low risk' or 'high risk', by building a decision rule and a cut-off score.

Input data are given in fig. 1. (A sketch of how this analysis could be set up in code follows.)
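Since the fig. 1 values are not reproduced here, the sketch below sets up the SBB analysis with placeholder data of the same shape: 18 customers, three continuous predictors (age, monthly income, years of marriage), and a risk group coded 1 (low risk) or 2 (high risk). Substituting the actual fig. 1 values would reproduce the outputs discussed next.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Placeholder values standing in for fig. 1 (NOT the real SBI data).
rng = np.random.default_rng(0)
age = rng.integers(21, 60, size=18).astype(float)
income = rng.integers(8000, 40000, size=18).astype(float)  # Rs. per month
years_married = rng.integers(0, 30, size=18).astype(float)
X_sbb = np.column_stack([age, income, years_married])
risk = np.array([1] * 9 + [2] * 9)   # 1 = low risk, 2 = high risk

# Equal a priori probabilities, as assumed in the text's output.
model = LinearDiscriminantAnalysis(priors=[0.5, 0.5]).fit(X_sbb, risk)
```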
Discussion of Outputs
We will now find answers to all four of the questions we raised earlier.
Q1. How good is the Model? How many of the 18
data points does it classify correctly?
To answer this question, we look at the computer output labelled fig. 3. This is part of the discriminant analysis output from any computer package such as SPSS, SYSTAT, STATISTICA, SAS, etc. (There could be minor variations in the exact numbers obtained, and major variations could occur if the options chosen by the student are different. For example, if the a priori probabilities chosen for the classification into the two groups are equal, as we have assumed while generating this output, then you will very likely see similar numbers in your output.)

Fig. 3: Classification Matrix

Group    Percent correct    G_1    G_2
G_1          100.0000         9      0
G_2           88.8889         1      8
Total         94.4444        10      8
This output (fig. 3) is called the classification matrix (also known as the confusion matrix), and it indicates that the discriminant function we have obtained is able to classify 94.44 percent of the 18 objects correctly. This figure is in the 'percent correct' column of the classification matrix. More specifically, it also says that out of 10 cases predicted to be in group 1, 9 were observed to be in group 1 and 1 in group 2 (from column G_1). Similarly, from column G_2, we understand that out of 8 cases predicted to be in group 2, all 8 were found to be in group 2. Thus, on the whole, only 1 case out of 18 was misclassified by the discriminant model, giving us a classification (or prediction) accuracy level of (18 − 1)/18, or 94.44 percent.

As mentioned earlier, this level of accuracy may not hold for all future classifications of new cases. But it is still a pointer towards the model being a good one, assuming the input data were relevant and scientifically collected. There are ways of checking the validity of the model, which will be discussed separately; a brief cross-validation sketch follows.
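One such validity check, sketched here under the same assumptions as the placeholder setup code above, is leave-one-out cross-validation, which estimates how well the 94.44 percent figure would hold up on cases the model has never seen:

```python
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Refit the model 18 times, each time holding out one customer and
# classifying the held-out case; the mean is an honest accuracy estimate.
cv_scores = cross_val_score(model, X_sbb, risk, cv=LeaveOneOut())
print("cross-validated percent correct:", 100 * cv_scores.mean())
```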
Q2. How significant, statistically speaking, is the discriminant function?

This question is answered by looking at the Wilks' Lambda and the probability value for the F test given in the computer output, as part of fig. 3 (shown below).

Fig. 3 (continued): Wilks' Lambda = 0.318; the associated F test is significant at p < .00089. (The remaining values in this block of output are not legible in the source.)

The value of Wilks' Lambda is 0.318. This value lies between 0 and 1, and a low value (closer to 0) indicates better discriminating power of the model. Thus, 0.318 is an indicator of the model being good. The probability value of the F test indicates that the discrimination between the two groups is highly significant, because p < .00089. This means the F test would be significant at a confidence level of up to (1 − .00089) × 100, or 99.91 percent. (A sketch of how Wilks' Lambda and this F test can be computed appears below.)
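For reference, Wilks' Lambda can be computed by hand as det(W)/det(T), the ratio of the pooled within-group scatter to the total scatter. For two groups there is an exact F test with (p, n − p − 1) degrees of freedom; with p = 3 predictors and n = 18 cases this gives the F(3, 14) test that packages report. A sketch, reusing the placeholder X_sbb and risk arrays from earlier:

```python
import numpy as np
from scipy import stats

def wilks_lambda_two_groups(X, y):
    # Total scatter matrix T (about the grand mean).
    centred = X - X.mean(axis=0)
    T = centred.T @ centred
    # Pooled within-group scatter matrix W (about each group's mean).
    W = sum((X[y == g] - X[y == g].mean(axis=0)).T
            @ (X[y == g] - X[y == g].mean(axis=0)) for g in np.unique(y))
    lam = np.linalg.det(W) / np.linalg.det(T)
    n, p = X.shape
    F = (1 - lam) / lam * (n - p - 1) / p   # exact for 2 groups
    p_value = stats.f.sf(F, p, n - p - 1)
    return lam, F, p_value

lam, F, p_value = wilks_lambda_two_groups(X_sbb, risk)
```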
Q3. Which variables (age, income, or years of marriage) are relatively better at discriminating between 'low risk' and 'high risk' applicants?

This question is answered by the standardised discriminant function coefficients in the computer output, shown as fig. 4.

Fig. 4: Standardised discriminant function coefficients for AGE, INCOME and YEARS. (The individual values are not legible in the source.)

As explained earlier, the variable with the largest standardised coefficient in absolute value has the highest discriminating power, and is therefore the best discriminator between the 'low risk' and 'high risk' groups; the variable with the next largest coefficient is the second best, and so on.
Q4. How do we classify a new credit card applicant into one of the two groups, 'low risk' or 'high risk'?

To do this, we need a decision rule and a cut-off score. The cut-off score is based on the mean discriminant score of each group (the means of the canonical variables), which are part of the computer output, shown below.
Means of Canonical Variables

Group              Mean
G_1 (low risk)     -1.37793
G_2 (high risk)    +1.37792
Thus, the new mean for group 1 (low risk) is -1.37793, and the new mean for group 2 (high risk) is +1.37792. This means that the midpoint of these two is 0. This is clear when we plot the two means on a straight line and locate their midpoint, as shown below:

    -1.37 ------------------- 0 ------------------- +1.37
    Mean of Group 1 (Low Risk)      Mean of Group 2 (High Risk)
This gives us the decision rule: a new applicant whose discriminant score falls to the left of the midpoint (below 0) is classified into group 1 (low risk), and one whose score falls to the right of the midpoint (above 0) is classified into group 2 (high risk).

To compute the discriminant score of a new applicant, we use the raw (unstandardised) discriminant function coefficients from the computer output, shown below.
Raw (Unstandardised) Discriminant Function Coefficients

Variable    Coefficient
AGE         -0.24560
INCOME      -0.00008
YEARS       -0.08465
Constant    10.0036

Thus, the model for computing the discriminant score of any applicant is:

Y = 10.0036 - AGE (0.24560) - INCOME (0.00008) - YEARS (0.08465)

We can now combine this model with the decision rule and the cut-off score to classify new applicants, as the following example shows.
Let us take the example of a credit card applicant to SBB who is aged 40, has an income of Rs. 25,000 per month and has been married for 15 years. Plugging these values into the discriminant function or model above, we find his discriminant score Y to be:

Y = 10.0036 - 40 (0.24560) - 25000 (0.00008) - 15 (0.08465)
  = 10.0036 - 9.824 - 2 - 1.26975
  = -3.09015

According to our decision rule, any discriminant score to the left of the midpoint of 0 leads to a classification in the low risk group. Therefore, we should give this person a credit card, as he is a low risk customer. The same process is to be followed for any new applicant: if his discriminant score is to the right of the midpoint of 0, he should be denied a credit card, as he is a 'high risk' customer. (The same rule is shown as a short code sketch below.)
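The whole decision rule fits in a few lines of code, using the coefficients recovered from the output above (the function name here is ours):

```python
# Unstandardised coefficients and constant from the fitted model above.
CONST, B_AGE, B_INCOME, B_YEARS = 10.0036, -0.24560, -0.00008, -0.08465

def classify_applicant(age, income, years_married, cutoff=0.0):
    score = CONST + B_AGE * age + B_INCOME * income + B_YEARS * years_married
    # Left of the midpoint (score below 0) -> group 1 (low risk).
    return score, "low risk" if score < cutoff else "high risk"

print(classify_applicant(40, 25000, 15))   # (-3.09015, 'low risk')
```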

We have completed answering the four questions raised by State Bank of Bhubaneswar.
