Academic Year 2007-2008

Modelling Week

Second Edition

June 16 - June 24, 2008

Retail Banking Sector

Coordinators:

Team members:

Matthew Cornford (University of Oxford)

Leticia García-Ergüín (UCM)

Cristina Pascual Deocón (UCM)

Oscar Iván Pascual (UCM)

Francisco Javier Plaza (UCM)

Index

Introduction

Methodology and Data

Univariate and Multivariate Analysis

Model Creation

Validation

Calibration

Introduction

its money to.

When a client applies for a loan, the bank would like to be sure that the client will pay back the full amount of the loan.

We need effective models that allow us to predict if a client will pay back the loan.

What we have is historical data for several variables.

We are trying to fit a model to this historical data so we can estimate a probability of default.

Methodology and Data

completed loan agreements

Age

Income

Wealth

Marital Status

Length as a Client

Amount of Loan

Maturity

Default

Sample Selection

Modelling Sample

original data

Validation Sample

which of them really did default.

Univariate and Multivariate Analysis

some independent variables (age, income, ...)

First of all, we carry out a univariate analysis.

For each variable, we calculate some statistics such as the mean, standard deviation and skewness.

We plot some histograms.

This information can be used as a first check before applying the model.

It would be better if the data were homogeneous.
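The univariate statistics mentioned above can be sketched in a few lines. This is only an illustrative Python example (the deck itself refers to MATLAB); the function name `describe` and the list of ages are hypothetical and are not taken from the bank's data.

```python
import statistics

def describe(values):
    """Return the mean, sample standard deviation and skewness of a list of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    # Moment-based skewness g1 = m3 / m2^(3/2), using population central moments.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    skewness = m3 / m2 ** 1.5
    return {"mean": mean, "std": std, "skewness": skewness}

# Hypothetical ages, invented for illustration only.
ages = [23, 35, 41, 29, 52, 38, 61, 27, 33, 45]
summary = describe(ages)
print(summary)
```

A skewness far from zero, or a standard deviation that is very large relative to the mean, is the kind of signal that would prompt a closer look at a variable before it enters the model.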

Univariate Analysis

transcription mistakes.

Multivariate Analysis

Correlations

Chi-squared test

variables, i.e. which variables default depends on.

We use the chi-squared test for that:

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$

percentiles.

After running the chi-squared test, we look at the p-value.
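As a rough sketch of this test, the Python snippet below computes the chi-squared statistic for a contingency table from observed counts and from the counts expected under independence. The 2x2 table is invented for illustration, and the helper `p_value_1dof` only covers the one-degree-of-freedom case (a 2x2 table); larger tables would need the general chi-squared distribution.

```python
import math

def chi_squared(observed):
    """Pearson chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / total  # expected count under independence
            stat += (o - e) ** 2 / e
    return stat

def p_value_1dof(stat):
    """Upper-tail p-value of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical counts: rows = marital status (two groups),
# columns = (no default, default).
table = [[400, 100],
         [450, 50]]
stat = chi_squared(table)
p = p_value_1dof(stat)
print(stat, p)
```

A small p-value suggests that default is not independent of the variable, which supports keeping that variable as a candidate for the model.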

Model Creation

Multivariate Analysis, the variables we include in our model are:

Age

Income

Wealth

Marital Status

Maturity

glmfit in MATLAB as well, obtaining the same results.

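$$\pi(x_0, \ldots, x_k) = \pi(x) = \frac{\exp\left(\sum_{j=0}^{k} \beta_j x_j\right)}{1 + \exp\left(\sum_{j=0}^{k} \beta_j x_j\right)}$$

$$\frac{\pi(x)}{1 - \pi(x)} = \exp\left(\sum_{j=0}^{k} \beta_j x_j\right)$$

$$\operatorname{Logit}(\pi(x)) = \log\frac{\pi(x)}{1 - \pi(x)} = \beta_0 + x_1 \beta_1 + \cdots + x_k \beta_k$$

where $x_0 = 1$.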

Variable          Coefficient
Intercept         -1.85136
Age               -0.02678
Income             0.10025
Wealth            -0.01761
Marital Status     0.79651
Maturity           0.00892

the sample.

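$$P(\text{Default} \mid x) = \frac{1}{1 + e^{-\eta}}$$

where $\eta = -1.85 - 0.026 \cdot \text{Age} + 0.1 \cdot \text{Inc} - 0.017 \cdot \text{Wlth} + 0.79 \cdot \text{Marit} + 0.0089 \cdot \text{Matur}$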
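To make the fitted model concrete, here is a minimal Python sketch that evaluates the probability of default with the coefficients reported in this section. The client record is hypothetical, and the units and coding of each variable (for example how income or marital status is encoded) are assumptions, since the slides do not state them.

```python
import math

# Coefficients as reported in this section (full precision).
COEFFS = {
    "intercept": -1.85136,
    "age": -0.02678,
    "income": 0.10025,
    "wealth": -0.01761,
    "marital_status": 0.79651,
    "maturity": 0.00892,
}

def probability_of_default(client):
    """Logistic model: P(default | x) = 1 / (1 + exp(-eta)), eta = intercept + sum(beta_j * x_j)."""
    eta = COEFFS["intercept"] + sum(COEFFS[name] * value for name, value in client.items())
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical client; the scales of the variables are assumed, not taken from the data.
client = {"age": 40, "income": 3.0, "wealth": 10.0, "marital_status": 1, "maturity": 24}
pd_hat = probability_of_default(client)
print(pd_hat)
```

With these signs, a higher age or wealth lowers the estimated probability of default, while a higher income value, marital status indicator or maturity raises it.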

Validation

Model statistics:

model

The data is sorted from worst to best according to the probability of default calculated with our model.

A perfect model would place all the defaults at the beginning.

We plot accumulated defaults against accumulated observations.

Powerstat compares the areas between the curves of the perfect model, our model and a random model.
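The construction above can be sketched as follows. This Python example assumes that Powerstat is the usual accuracy ratio, that is, the area between the model curve and the random curve divided by the area between the perfect curve and the random curve; the slides do not give the exact formula, and the scores and default flags are invented.

```python
def cap_area(scores, defaults):
    """Area under the curve of accumulated defaults vs accumulated observations.

    Observations are sorted from highest to lowest predicted probability of
    default and the area is computed with the trapezoidal rule.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n = len(scores)
    total_defaults = sum(defaults)
    area, captured = 0.0, 0
    for i in order:
        previous = captured / total_defaults
        captured += defaults[i]
        area += (previous + captured / total_defaults) / 2.0 / n
    return area

def powerstat(scores, defaults):
    """Accuracy ratio: (model - random) / (perfect - random), in terms of areas."""
    n, d = len(defaults), sum(defaults)
    random_area = 0.5                   # diagonal line
    perfect_area = 1.0 - d / (2.0 * n)  # all defaults captured first
    return (cap_area(scores, defaults) - random_area) / (perfect_area - random_area)

# Hypothetical model scores and observed defaults (1 = default).
scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]
defaults = [1, 0, 1, 0, 0, 0]
ar = powerstat(scores, defaults)
print(ar)
```

A value of 1 corresponds to the perfect ordering and a value near 0 to a random one. Tied scores are ordered arbitrarily in this sketch.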

Validation

question is how to choose the level that classifies whether a client will default or not.

many observations will default and compare with which of them really did default.

probability has very low deviation and is around 0.77.

Calibration

EL = PD * EAD * LGD

PD is defined as the default probability calibrated for one year.

EAD is the exposure at default.

LGD is the loss given default.
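A short worked example of the expected loss formula, written in Python. The three loans and their parameters are hypothetical, with LGD expressed as a fraction of the exposure.

```python
def expected_loss(pd, ead, lgd):
    """Expected loss of a single exposure: EL = PD * EAD * LGD."""
    return pd * ead * lgd

# Hypothetical loans: (one-year PD, exposure at default, loss given default).
loans = [
    (0.02, 10_000.0, 0.45),
    (0.05, 5_000.0, 0.60),
    (0.01, 20_000.0, 0.40),
]
el_per_loan = [expected_loss(pd, ead, lgd) for pd, ead, lgd in loans]
el_total = sum(el_per_loan)
print(el_per_loan, el_total)
```

Because the expected loss is additive, the portfolio figure is simply the sum over the loans, whatever the dependence between defaults.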

However, these probabilities do not take into account when the default happens.

This is the reason for calibration.

We want to obtain the yearly average probability of default.

We need a sample of people observed over yearly periods.

The model is applied and the sample is sorted by score.

We obtain an observed default rate:

Rate A (C score ) B

fminsearch, we obtain the values:

A=0.0004

B=3.7410

C=2.7870

and didn't cause too much difficulty.

problem.

Allocation.

Index

Implementation

Conclusions

EAD, between n blocks of similar customers

the profit?

probability of default PDi, the loss given default LGDi and the number of customers Ni.

we can easily compute the probability of k defaults.

But the customers are correlated via the economy. We can use a Gaussian copula to introduce a default random variable for each customer:

$$Z_i = \sum_{j=1}^{m} a_{ij} Y_j + r_i w_i$$

$$\Phi(Z_i) < PD_i \Rightarrow \text{Default}$$

independent probability of default for each customer is:

$$p_i = \Phi\left(\frac{\Phi^{-1}(PD_i) - \sum_{j=1}^{m} a_{ij} Y_j}{r_i}\right)$$

$$P(k \text{ defaults}) = \binom{N_i}{k} p_i^{k} (1 - p_i)^{N_i - k}$$

When N_i is large enough (of the order of 10^3) we can approximate this binomial with a normal random variable D_i:

$$D_i \sim N\left(N_i p_i,\; N_i p_i (1 - p_i)\right)$$
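The chain from the copula to the normal approximation can be illustrated with a small Python sketch. It assumes a single economic factor and that the loading and the idiosyncratic weight r_i give Z_i unit variance (a^2 + r^2 = 1); the numbers (PD of 5%, loading 0.3, factor value -1, block of 1000 customers) are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def conditional_pd(pd, loadings, factors, r):
    """p_i = Phi((Phi^{-1}(PD_i) - sum_j a_ij * Y_j) / r_i), given the factor values Y."""
    systematic = sum(a * y for a, y in zip(loadings, factors))
    return STD_NORMAL.cdf((STD_NORMAL.inv_cdf(pd) - systematic) / r)

def binomial_pmf(k, n, p):
    """Probability of exactly k defaults among n conditionally independent customers."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def normal_approximation(n, p):
    """Normal variable D_i with the binomial's mean and variance."""
    return NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1.0 - p)))

# Hypothetical block: unconditional PD of 5%, one factor with loading 0.3.
loading = 0.3
r = math.sqrt(1.0 - loading ** 2)                     # assumed so that Var(Z_i) = 1
p_bad = conditional_pd(0.05, [loading], [-1.0], r)    # poor state of the economy
p_neutral = conditional_pd(0.05, [loading], [0.0], r)

# Compare the exact binomial with its normal approximation for N = 1000.
n_customers = 1000
exact = binomial_pmf(50, n_customers, 0.05)
approx = normal_approximation(n_customers, 0.05).pdf(50)
print(p_bad, p_neutral, exact, approx)
```

A negative factor value lowers every Z_i at once, so the conditional probability of default rises for all customers in the block; this is how the common factor introduces correlation between defaults.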

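$$L = \sum_{i=1}^{n} \frac{\theta_i \, EAD}{N_i} \left( LGD_i D_i - (N_i - D_i) \rho_i \right)$$

taking $\theta_i$ as the share of EAD allocated to block i and $\rho_i$ as the interest rate charged to block i.

$$L \sim N(\mu_L, \sigma_L^2)$$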

confidence level. So the problem becomes:

Minimise

$$f(\theta) = \mu_L$$

Subject to:

$$\sum_{i=1}^{n} \theta_i = 1$$

$$-2.3262\,\sigma_L + \mu_L = \mathrm{VaR}_{99}$$

Where VaR99 is the fixed level of risk the lender is willing to take.
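The two quantities in this problem, the expected loss and the VaR99 level, can be evaluated for a given allocation as in the Python sketch below. It rests on several assumptions that the slides do not confirm: the loss of block i is taken to be (share_i * EAD / N_i) * (LGD_i * D_i - (N_i - D_i) * rate_i), the blocks are treated as independent given the state of the economy, and all inputs (shares, rates, block sizes) are hypothetical. The multiplier is the 1% quantile of the standard normal, which matches the -2.3262 in the constraint up to rounding.

```python
import math
from statistics import NormalDist

Z_01 = NormalDist().inv_cdf(0.01)  # about -2.326

def loss_moments(shares, ead, blocks):
    """Mean and standard deviation of the total loss L under the assumed loss form.

    Each block is a dict with keys p (conditional PD), lgd, rate and n.
    Uses D_i ~ N(n * p, n * p * (1 - p)) for the number of defaults in a block.
    """
    mu, var = 0.0, 0.0
    for share, b in zip(shares, blocks):
        per_customer = share * ead / b["n"]            # exposure per customer
        slope = per_customer * (b["lgd"] + b["rate"])  # change in L per extra default
        mu += slope * b["n"] * b["p"] - per_customer * b["n"] * b["rate"]
        var += slope ** 2 * b["n"] * b["p"] * (1.0 - b["p"])
    return mu, math.sqrt(var)

def var99(mu, sigma):
    """Risk level with the sign convention of the constraint: Z_01 * sigma + mu."""
    return Z_01 * sigma + mu

# Hypothetical portfolio: two blocks, half of the exposure in each.
blocks = [
    {"p": 0.02, "lgd": 0.45, "rate": 0.05, "n": 1000},
    {"p": 0.08, "lgd": 0.60, "rate": 0.12, "n": 1000},
]
mu_l, sigma_l = loss_moments([0.5, 0.5], 1_000_000.0, blocks)
print(mu_l, sigma_l, var99(mu_l, sigma_l))
```

Evaluating these two functions over many allocations gives the cloud of (VaR99, expected loss) points from which a constrained optimiser can pick the allocation with the lowest expected loss at each risk level.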

Implementation

First we fix the allocation weights and find the VaR99 and Expected Loss for each set of weights (black dots).

Then we find the weights that minimise the Expected Loss for any fixed VaR99 (red dots) using the MATLAB function fmincon.

approaches, on the order of 10^(4).

longer than with 3 blocks.

Conclusions

expected.

needs to be investigated further.

between the efficient frontier and the interest rates charged for each block.

Questions?

