Loans Case Study

Submission by –
Maxim Rohit,
Amiyanshu Pratihari
Abhishek Ranjan
Karthikeyan Seetharaman
Project Brief

01 Executive Summary

02 Approach

03 Analysis and Inferences

04 Summary

05 Appendix

Executive Summary

Key Objectives
▪ Understand the driving factors (driver variables) behind loan defaults.
▪ Use this knowledge in portfolio and risk assessment to minimize credit loss and business loss.

Approach
▪ Structured approach as prescribed by IIIT-B and S Anand
✓ Data cleaning
✓ Univariate analysis
✓ Bivariate analysis
✓ Segment analysis
✓ Derived metrics analysis
✓ Correlation analysis

Key Takeaways
✓ Understand which consumer attributes and loan attributes influence the tendency to default.
✓ Recommendations on 5 important driver variables
✓ Get the correlation matrix as soon as possible and avoid analysis paralysis from too many graphs.

Business Understanding
▪ The consumer finance company is the largest online loan marketplace, facilitating personal loans, business loans, and financing of medical procedures.
▪ Two risks are associated with the decision to approve a loan:
  1. If the applicant is likely to repay the loan, then not approving the loan results in a loss of business to the company.
  2. If the applicant is not likely to repay the loan, then approving the loan may lead to a financial loss for the company.

Analysis Brief

Univariate analysis
✓ Categorical variables are analyzed: grade, sub grade, loan status, state, verification status etc. Bar plots are used.
✓ Continuous variables are analyzed: amount, DTI, interest rate, revol percent etc. Box plots and histograms are used.

Bivariate and segment analysis
✓ Categorical variables are plotted against other categorical variables to gain more information on the composition of the categorical data. Ex: state vs loan status, purpose vs loan status etc.
✓ Continuous variables are plotted against categorical variables to gain more insights. Ex: total fund amount by grades.
✓ Continuous variables vs continuous variables. Ex: dti vs revol utilization, charge off vs revol util etc. Scatter plots are used.

Derived metrics analysis
✓ Charge off amount is derived by subtracting total principal received from funded amount.
✓ Charge off amount percentage is derived.

Deliverables
▪ One zip file containing
✓ R Code
✓ Presentation in PDF format

* Univariate analysis and bivariate analysis include the respective derived metrics analysis.

Approach

Data cleansing: Importing and Data cleansing
✓ NA analysis
✓ Duplicate check
✓ Changing the class of observations.
✓ Formatting and standardizing date time (issue_d), percentages etc.
✓ Creating derived metrics:
▪ Charge off amount
▪ Charge off as a percentage of funded amount

Analysis: Exploratory data analysis – Question the data to perform
✓ Univariate* analysis of both categorical and continuous variables.
✓ Bivariate* analysis:
• Categorical vs categorical
• Categorical vs continuous
• Continuous vs continuous
✓ Correlation matrix.

Plotting
✓ Using Tableau and R to create graphs that aid in
▪ Defining the issues
▪ Analysis (univariate, bivariate)
▪ Segmentation analysis
✓ Communicate inferences and understanding, with supporting analysis and graphs, to the decision-making audience and any larger audience.

* Univariate analysis and bivariate analysis include derived metrics as well.

Tools used
➢ RStudio for Import, Data cleansing, Analysis & Plotting
➢ Tableau for Analysis and Plotting
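
The NA analysis and duplicate check listed under data cleansing can be sketched as follows. This is a minimal illustration in Python (the deck's own code was written in R and is not reproduced here); the sample rows, the column names and the set of values treated as missing are all assumptions made for the example.

```python
# Hypothetical sample rows; column names and values are assumptions for illustration.
rows = [
    {"id": "1", "loan_amnt": "5000", "desc": "NA"},
    {"id": "2", "loan_amnt": "2500", "desc": "NA"},
    {"id": "2", "loan_amnt": "2500", "desc": "NA"},  # exact duplicate of the row above
    {"id": "3", "loan_amnt": "NA", "desc": "NA"},
]

# Values treated as missing (an assumption; adjust to the actual file).
MISSING = {"", "NA", None}


def na_share(rows):
    """Return the fraction of missing values per column."""
    columns = list(rows[0].keys())
    return {c: sum(r[c] in MISSING for r in rows) / len(rows) for c in columns}


def drop_empty_columns(rows):
    """Drop columns in which every value is missing."""
    share = na_share(rows)
    keep = [c for c in rows[0] if share[c] < 1.0]
    return [{c: r[c] for c in keep} for r in rows]


def drop_duplicates(rows):
    """Keep the first occurrence of each fully identical row."""
    seen, unique = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


cleaned = drop_duplicates(drop_empty_columns(rows))
```

Dropping only fully empty columns is one possible threshold; a stricter cut-off on the NA share may suit the real data better.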

Analysis – Summary of Continuous variables

Investment Type Analysis

After dropping some empty and categorical variables we are left with 41 continuous variables in total.

✓ Key variables of interest for analysis: funded amount, annual income, debt to income ratio, total received principal, interest rate, employment length.

✓ Derived fields: charged off amount, charged off percentage
• Charged off amount (for charged off accounts) = funded amount – total principal received
• Charged off percentage = charged off amount divided by funded amount * 100
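
The two derived fields above can be written out as a short sketch. The field names `funded_amnt` and `total_rec_prncp` and the sample loan are assumptions for illustration, not values from the data set.

```python
def charged_off_amount(funded_amnt, total_rec_prncp):
    """Charged off amount = funded amount - total principal received."""
    return funded_amnt - total_rec_prncp


def charged_off_percentage(funded_amnt, total_rec_prncp):
    """Charged off amount as a percentage of the funded amount."""
    if funded_amnt == 0:
        return 0.0  # avoid division by zero for a loan with nothing funded
    return 100 * charged_off_amount(funded_amnt, total_rec_prncp) / funded_amnt


# Hypothetical charged off loan (made-up numbers).
loan = {"funded_amnt": 10000.0, "total_rec_prncp": 4000.0}

amount = charged_off_amount(loan["funded_amnt"], loan["total_rec_prncp"])
percentage = charged_off_percentage(loan["funded_amnt"], loan["total_rec_prncp"])
```

Both metrics are meaningful only for accounts with charged off status, as noted in the definition.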

Analysis – Summary of Categorical variables

Investment Type Analysis

✓ After dropping some empty and continuous variables we are left with the key categorical variables below. These are just a few for reference.

✓ Term, grade, sub grade, employment length, home ownership, loan status, purpose of loan, state info.

✓ A few character fields such as date, interest rate and revol utilization rate were converted to appropriate computational (date and numeric) fields.
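
A minimal sketch of the character-to-computational conversion described above. The input formats ("10.65%" for rates, "Dec-11" for dates) are assumptions about how the raw fields look, and the two helper names are hypothetical.

```python
from datetime import date

# English month abbreviations, mapped by hand to avoid locale-dependent parsing.
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def parse_percent(text):
    """Convert a percentage string such as '10.65%' to a float."""
    return float(text.strip().rstrip("%"))


def parse_month_year(text):
    """Convert a 'Mon-YY' string to the first day of that month.

    Assumes every two-digit year belongs to the 2000s.
    """
    month, year = text.strip().split("-")
    return date(2000 + int(year), MONTHS[month], 1)


int_rate = parse_percent("10.65%")
issue_d = parse_month_year("Dec-11")
```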

Analysis – Key Graphs 1

The bulk of the loans are taken for debt consolidation, and that is true across the sub grades.

Top charge offs happen at the end of B grade and the beginning of C grade.

Analysis – Key Graphs 2

CA tops the loans disbursed, followed by NY (both in terms of fund amount and charged off amount).

The bulk of the loans are taken by people living on rent or currently servicing a mortgage.

Analysis – Key Graphs 3

The 10+ years employment bucket has the most loans disbursed, followed by less than one year and two years.

Within 1-3 years of employment the plausible risk of default is high.
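
One way to compare default risk across employment length buckets is the share of charged off loans per bucket. The sketch below assumes the fields `emp_length` and `loan_status` and uses made-up rows; it does not reproduce the deck's figures.

```python
from collections import Counter


def charge_off_rate_by(rows, key):
    """Return the share of 'Charged Off' loans within each value of `key`."""
    totals, charged = Counter(), Counter()
    for r in rows:
        totals[r[key]] += 1
        if r["loan_status"] == "Charged Off":
            charged[r[key]] += 1
    return {segment: charged[segment] / totals[segment] for segment in totals}


# Hypothetical loans (illustrative only, not taken from the data set).
loans = [
    {"emp_length": "10+ years", "loan_status": "Fully Paid"},
    {"emp_length": "10+ years", "loan_status": "Fully Paid"},
    {"emp_length": "10+ years", "loan_status": "Charged Off"},
    {"emp_length": "10+ years", "loan_status": "Fully Paid"},
    {"emp_length": "2 years", "loan_status": "Charged Off"},
    {"emp_length": "2 years", "loan_status": "Fully Paid"},
]

rates = charge_off_rate_by(loans, "emp_length")
```

The same helper applies to any categorical field, for example grade or verification status.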

Analysis – Key Graphs 4

A high count of loans with amounts less than 10000 are issued without verification.

Analysis – Key Graphs 5

The plot above shows verification status across grades; the plot at the top right shows loan status across the grades.

Is lack of verification resulting in charge offs?

Analysis – Key Graphs 6

Loan interest rate, revolving utilization and debt to income ratio seem to correlate with grades.
• 'A' being the best grade, its interest rate is low; these loans also seem to have the lowest dti and revolving utilization.
• 'G' being the lowest grade, its interest rate is high; revolving utilization and dti are also higher.

Analysis – Key Graphs 7

There is a banding pattern at multiples of 1000 in fund amount, more prominent at 5000 intervals.

The interest rate seems to show horizontal banding as well. It seems that many professionals of the same type are qualifying for similar interest rates.
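
The banding in fund amount can also be checked numerically, as the share of loans whose amount is an exact multiple of 1000 or of 5000. A small sketch with made-up amounts (the helper name is hypothetical):

```python
def share_at_multiples(amounts, step):
    """Return the fraction of amounts that are exact multiples of `step`."""
    return sum(a % step == 0 for a in amounts) / len(amounts)


# Hypothetical funded amounts (illustrative only).
funded = [5000, 10000, 12000, 7250, 15000, 3600, 20000, 9000]

share_1000 = share_at_multiples(funded, 1000)
share_5000 = share_at_multiples(funded, 5000)
```

High shares at these steps would be consistent with the vertical bands seen in the scatter plot.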

Analysis – Key Graphs 8

There is a banding pattern at multiples of 1000 in fund amount, more prominent at 5000 intervals.

The interest rate seems to show horizontal banding as well. Professionals from similar fields earning the same salary?

Plot excludes income greater than 100,000 and only includes charged off status.

Analysis – Key Graphs 9

Clearly there is clustering above the diagonal running from top left to bottom right.

The top right clustering seems logical for charged off accounts: as both revolving utilization % and debt to income ratio increase, the tendency to charge off increases.

Dti vs revolving utilization - Plot excludes income greater than 100,000 and only includes charged off status.

Analysis – Key Graphs 10

Same graph as the previous one, but the color represents verification status.

The reds look to be concentrating at the top right. Could 'not verified' be a potential reason for charge offs?

Plot excludes income greater than 100,000 and only includes charged off status.

Analysis – Key Graphs 11

Between 5% and 10% there seems to be a lower charge off percentage.

Above 10% there is a jump in charge off percentage, and also in the red colors (not verified)?

Dti vs interest rate - Plot excludes income greater than 100,000 and only includes charged off status.

Correlation

Checkpoint 6 – plot 2
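
For reference, a correlation matrix of the kind plotted here can be computed with the Pearson coefficient. The sketch below is an illustration only: the variable names and values are made up, and the deck does not state which correlation method was used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Return a nested dict holding the pairwise Pearson correlations."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical continuous variables (illustrative values only).
columns = {
    "dti": [5.0, 10.0, 15.0, 20.0],
    "int_rate": [7.0, 9.0, 13.0, 15.0],
    "revol_util": [80.0, 60.0, 40.0, 20.0],
}

matrix = correlation_matrix(columns)
```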

Summary

✓ Top charge offs happen at the end of B grade (B3) and the beginning of C grade (C2).
✓ The bulk of the loans are taken for debt consolidation, and that is true across the sub grades.
✓ The bulk of the loans are taken by people living on rent or currently servicing a mortgage.
✓ 10+ years of employment has the most loans disbursed, followed by less than one year and two years. Within 1-3 years of employment the plausible risk of default is high.
✓ A high count of loans with amounts less than 10000 are issued without verification, and these are charging off.
✓ A banding pattern is observed in funding amounts at 1000 intervals, more prominent at 5000 intervals.
✓ Dti vs revolving utilization: the top right clustering seems logical for charged off accounts. As both revolving utilization % and debt to income ratio increase, the tendency to charge off increases.
✓ Fund amount vs interest rate: the interest rate seems to show horizontal banding as well. It seems that many professionals of the same type are qualifying for similar interest rates. There seems to be a correlation with profession.

Summary

5 important driver variables
➢ DTI – debt to income ratio
➢ Interest rate
➢ Revolving utilization ratio
➢ Funding amount
➢ Sub grade
➢ Verification status

Note: there seems to be evidence suggesting that the profession of the candidate is also a strong driving variable.

Thank You.