North-Holland
1. Introduction

In each bank or credit company the evaluation of new credit demands is one of the basic aspects of credit granting. Traditionally the decision whether to grant credit to an individual is taken by a specialized person, who handles each demand individually. As this is a time-consuming (and therefore expensive) process, financial institutions often make use of a credit scoring system.

Such a system is a computerized procedure which attributes to the client a number of points (a score) according to a number of relevant characteristics such as income, profession, age, etc. If the total score obtained by summation of the individual scores is high enough, i.e., if it is higher than a so-called cut-off level, the credit will be granted. If the total score does not reach the cut-off level, the credit will be refused.

This report will focus on the practical derivation of a credit scoring model for personal loans. Section 2 briefly reviews the statistical method. In Section 3 a description of the data is given. In the last section the resulting credit scoring model is presented.

2. The method

The two major questions to be answered in the derivation of a credit scoring model are:

[...]

$X_i$ = relevant characteristic,
$b_i$ = weight or score corresponding to characteristic $X_i$.

The method most frequently used for the determination of the coefficients $b_i$ is discriminant analysis [see for example Myers and Forgy (1963)]. However, one of the basic underlying assumptions in discriminant analysis is the assumption of normally distributed variables $X_i$, which is violated in this case as most of the variables used in a credit scoring system are categorical variables (e.g. profession, marital status).

For that reason a decision rule was determined using a logistic regression model, which does not assume that the variables $X_i$ are multivariate normally distributed. In the credit scoring area this model was first proposed by Chesser (1974) for forecasting commercial loan non-compliance.

In the logistic regression model the assumption is made that the probability that a loan is good depends on the level of the characteristics $X_i$. Specifically, one assumes that the posterior probability of a good loan is given by

$$p_x = \frac{e^{b_0 + b_1 X_1 + \dots + b_n X_n}}{1 + e^{b_0 + b_1 X_1 + \dots + b_n X_n}} \qquad (1)$$

where

$X_i$ = relevant characteristic,
$b_i$ = corresponding weight,
or equivalently,

$$\ln \frac{p_x}{1 - p_x} = b_0 + b_1 X_1 + \dots + b_n X_n. \qquad (2)$$

This means that the natural logarithm of the ratio of the posterior probability of a good loan and the posterior probability of a bad loan is equal to a linear function of the characteristics $X_i$.

The weights $b_i$ are to be estimated by use of the maximum likelihood method [see Altman et al. (1981)].

Finally, a new loan is allocated to the population of the good loans if its predicted probability $p_x$ is higher than a cut-off level $c$, which will be determined in the last section.

3. The data

For the present study, data on personal loans were collected in a Belgian credit company. The loans dated from November 1984 till December 1986. The total sample contains three kinds of loans: good loans (995), bad loans (1257) and refused loans (693). Refused loans are loans which were not accepted by the credit company. An accepted loan is classified as bad, by definition, after three reminders.

The total sample is randomly divided into an original sample (containing 800 good, 1000 bad and 500 refused loans), which is used for the derivation of the scoring model, and a so-called holdout sample (containing 195 good, 252 bad and 193 refused loans), used for an unbiased test of the model.

Information was available on personal characteristics, professional situation, financial situation and loan-related characteristics. See Table 1 for a complete list of available characteristics.

Table 1
Characteristics of applicant.

 1. Marital status
 2. Nationality
 3. Sex
 4. No. children
 5. Age (a)
 6. Having a telephone (a)
 7. Time at present address (a)
 8. Geographical region in Belgium (a)
 9. Profession (a)
10. Working at private/public sector (a)
11. Time at present job (a)
12. Total monthly revenue (a)
13. Total monthly expenses
14. Homeowner (a)
15. Previous credit experience
16. No. previous credits (a)
17. Duration of the loan (a)
18. Amount of the loan
19. Object of the loan

(a) Variables included in the scoring model.

All possible values for each item were grouped into different categories, and dummy variables were defined to describe each category. For example, professions were grouped into 11 categories according to the necessary educational background.

As frequency tables indicate that short-period loans and long-period loans are more likely to be good than intermediate-period loans, duration of the loan is divided into three groups: less than 21 months, between 22 and 47 months and more than 47 months.

For the selection of the characteristics in the scoring model, the occurrence of all possible values of each characteristic is examined separately in the sample of good loans and bad loans, in order to obtain a first indication of the discriminating power of each characteristic. For example, as 96% of the clients in the sample of good loans have no telephone, against 90% in the sample of bad loans, the characteristic having a telephone or not can be considered as a possible variable in the scoring model.

For the final selection of the characteristics, a stepwise logistic regression is carried out. This technique performs a number of logistic regressions, starting with the single variable with the most predictive power, and adding one by one the variables that give the best improvement in goodness of fit of the model, until no further single addition achieves a specified significance level. [See Bartolucci and Fraser (1977).]

As data on good and bad loans are collected from a portfolio which already passed a screening procedure in the credit company, a credit scoring model which is based only on these data gives biased results if it is used for the selection of new loans. Therefore data on refused loans are incorporated in the model in the following way. First a scoring model is built using only the data of good and bad loans. As we do not know whether a
A. Steenackers, M.J. Goovaerts / Credit scoring model
[Figure: tree diagram. 100 loans are split into 53 accepted (53%) and 47 refused; each branch is further split into loans accepted by the scoring model and loans refused by the scoring model. Other node values (36.9%, 26.5, 15.9) are only partly legible.]
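The logistic model of equations (1) and (2) and the cut-off rule of Section 2 can be illustrated with a minimal Python sketch. The intercept, the weights, the dummy-coded applicant and the cut-off value 0.5 below are hypothetical numbers chosen for illustration only; they are not the coefficients estimated in this study, and the function names are likewise assumptions.

```python
import math


def good_loan_probability(intercept, weights, characteristics):
    """Posterior probability p_x of a good loan, following equation (1)."""
    # Linear score b_0 + b_1*X_1 + ... + b_n*X_n
    score = intercept + sum(b * x for b, x in zip(weights, characteristics))
    # 1 / (1 + e^(-score)) is algebraically equal to e^score / (1 + e^score)
    return 1.0 / (1.0 + math.exp(-score))


def log_odds(p):
    """Natural log of p / (1 - p), the left-hand side of equation (2)."""
    return math.log(p / (1.0 - p))


def accept(p, cutoff):
    """Allocate the loan to the good population if p exceeds the cut-off c."""
    return p > cutoff


# Hypothetical values, for illustration only (not taken from the paper).
intercept = -0.5                # b_0
weights = [0.8, -0.3, 1.2]      # b_1, b_2, b_3
applicant = [1, 0, 1]           # dummy variables X_1, X_2, X_3

p = good_loan_probability(intercept, weights, applicant)
print(round(p, 4), round(log_odds(p), 4), accept(p, cutoff=0.5))
```

Taking the log-odds of the computed probability returns the linear score, which is the equivalence between (1) and (2). In practice the weights would come from a maximum likelihood fit rather than being set by hand.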