Submission by –
Maxim Rohit,
Amiyanshu Pratihari
Abhishek Ranjan
Karthikeyan Seetharaman
Project Brief
01 Executive Summary
02 Approach
03 Analysis
04 Summary
05 Appendix
Executive Summary
Key Objectives
▪ To understand the driving factors (or driver variables) behind loan defaults.
▪ Utilize this knowledge for portfolio and risk assessment to minimize credit loss and business loss.
Business Understanding
▪ Consumer finance company: the largest online loan marketplace, facilitating personal loans, business loans, and financing of medical procedures.
▪ 2 risks are associated with the bank's decision to approve loans:
▪ 1. If the applicant is likely to repay the loan, then not approving the loan results in a loss of business to the company.
▪ 2. If the applicant is not likely to repay the loan, then approving the loan may lead to a financial loss for the company.
Approach
▪ Structured approach as prescribed by IIIT-B and S Anand
✓ Data cleaning
✓ Univariate analysis
✓ Bivariate analysis
✓ Segment analysis
✓ Derived metrics analysis
✓ Correlation analysis
✓ Recommendations on 5 important driver variables
Analysis Brief
Univariate analysis
✓ Categorical variables are analyzed – grade, sub grade, loan status, state, verification status etc. Bar plots are used.
✓ Continuous variables are analyzed – amount, DTI, interest rate, revolving utilization percent etc. Box plots and histograms are used.
Bivariate and segment analysis
✓ Categorical variables are plotted against other categorical variables to gain more information on the composition of the categorical data.
✓ Ex – state vs loan status, purpose vs loan status etc.
✓ Continuous variables are plotted against categorical variables to gain more insights. Ex – total funded amount by grade.
✓ Continuous variables vs continuous variables – ex DTI vs revolving utilization, charge-off vs revolving utilization etc. Scatter plots are used.
Derived metrics analysis
✓ Charge-off amount is derived by subtracting total principal received from funded amount.
✓ Charge-off amount percentage is derived.
* Univariate analysis, Bivariate analysis and respective Derived metrics analysis
Key Takeaways
✓ Understand how consumer attributes and loan attributes influence the tendency to default.
✓ Get the correlation matrix as soon as possible and avoid analysis paralysis from too many graphs.
Deliverables
▪ One zip file containing
✓ R Code
✓ Presentation in PDF format
Approach
Tools used
➢ RStudio for Import, Data cleansing, Analysis & Plotting
➢ Tableau for Analysis and Plotting
Analysis – Summary of Continuous variables
✓ Derived fields –
• Charged off percentage = charged off amount divided by funded amount * 100
[Plots: Charged off amount; Charged off percentage]
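The two derived fields can be sketched as below. This is a minimal Python illustration, not the R code from the submission; the function and parameter names are hypothetical, and the sample loan is made up. The charged off amount follows the definition given in the executive summary (funded amount minus total principal received).

```python
def charge_off_metrics(funded_amount, total_principal_received):
    """Return (charged off amount, charged off percentage) for one loan.

    Parameter names are illustrative assumptions, not dataset column names.
    """
    if funded_amount <= 0:
        raise ValueError("funded_amount must be positive")
    # Charged off amount = funded amount - total principal received
    charged_off_amount = funded_amount - total_principal_received
    # Charged off percentage = charged off amount / funded amount * 100
    charged_off_pct = charged_off_amount / funded_amount * 100
    return charged_off_amount, charged_off_pct


# Made-up loan: 10,000 funded, 6,500 of principal repaid before charge-off.
amount, pct = charge_off_metrics(10_000, 6_500)
print(amount, round(pct, 2))
```

Expressing the loss as a percentage of the funded amount makes small and large loans comparable on one scale.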
Analysis – Summary of Categorical variables
✓ After dropping empty columns and setting aside the continuous variables, we are left with the key categorical variables listed below. The ones shown above are just a few for reference.
✓ Term, grade, sub grade, employment length, home ownership, loan status, purpose of loan, state info.
✓ A few character fields, such as dates, interest rate and revolving utilization rate, were converted to appropriate date and numeric types.
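The conversion of character fields might look like the sketch below. It is a Python illustration under stated assumptions: the deck does not show the raw formats, so the percentage format ("10.65%") and the month-year date format ("Dec-11") are guesses, and the helper names are hypothetical.

```python
from datetime import datetime


def parse_percent(text):
    """Convert a string such as '10.65%' to the float 10.65.

    Returns None for blank input. The '%'-suffixed format is an assumption.
    """
    cleaned = text.strip().rstrip("%").strip()
    return float(cleaned) if cleaned else None


def parse_month_year(text):
    """Convert a string such as 'Dec-11' to a date (first day of the month).

    The 'Mon-YY' format is an assumption about the raw data.
    """
    return datetime.strptime(text.strip(), "%b-%y").date()


print(parse_percent("10.65%"))       # interest rate as a number
print(parse_month_year("Dec-11"))    # issue date as a date object
```

Once these fields are numeric or date typed, they can be used in box plots, histograms and correlation calculations.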
Analysis – Key Graphs 1
Analysis – Key Graphs 2
Analysis – Key Graphs 3
Analysis – Key Graphs 4
Analysis – Key Graphs 5
The plot above shows verification status across grades; the plot at top right shows loan status across the grades.
Analysis – Key Graphs 6
Loan interest rate, revolving utilization and debt-to-income ratio (DTI) seem to correlate with grade.
• ‘A’ being the best grade, its interest rate is low; these loans also seem to have the lowest DTI and revolving utilization.
• ‘G’ being the lowest grade, its interest rate is high; revolving utilization and DTI are also higher.
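A comparison like this can be checked numerically by averaging each continuous variable per grade. The sketch below is a Python illustration only; the field names (`grade`, `int_rate`, `dti`) and the four sample loans are made up for the example and are not values from the analyzed dataset.

```python
from collections import defaultdict
from statistics import mean


def mean_by_grade(loans, field):
    """Return {grade: mean of `field`} for a list of loan dictionaries."""
    groups = defaultdict(list)
    for loan in loans:
        groups[loan["grade"]].append(loan[field])
    # Sort so grades come out in A..G order.
    return {grade: mean(values) for grade, values in sorted(groups.items())}


# Made-up records, for illustration only.
sample_loans = [
    {"grade": "A", "int_rate": 6.0, "dti": 10.0},
    {"grade": "A", "int_rate": 8.0, "dti": 12.0},
    {"grade": "G", "int_rate": 20.0, "dti": 18.0},
    {"grade": "G", "int_rate": 22.0, "dti": 20.0},
]

print(mean_by_grade(sample_loans, "int_rate"))
print(mean_by_grade(sample_loans, "dti"))
```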
Analysis – Key Graphs 7
Analysis – Key Graphs 8
Analysis – Key Graphs 9
Analysis – Key Graphs 10
Analysis – Key Graphs 11
Correlation
Checkpoint 6 – plot 2
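For readers who want to reproduce a correlation matrix of this kind, a minimal Python sketch follows. It assumes Pearson correlation, which the deck does not state explicitly; the column names and numbers are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / (dev_x * dev_y)


def correlation_matrix(columns):
    """Return a nested dict: matrix[a][b] = correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up columns, for illustration only.
sample_columns = {
    "dti": [10.0, 15.0, 20.0, 25.0],
    "int_rate": [6.0, 9.0, 12.0, 15.0],
    "revol_util": [80.0, 60.0, 40.0, 20.0],
}
matrix = correlation_matrix(sample_columns)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```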
Summary
✓ Top charge-offs occur at the end of B grade (B3) and the beginning of C grade (C2).
✓ The bulk of the loans are taken for debt consolidation, and that is true across the sub grades.
✓ The bulk of the loans are taken by people living on rent or currently servicing a mortgage.
✓ Borrowers with 10+ years of employment have the most loans disbursed, followed by those with less than one year and two years. Within 1-3 years of employment the plausible risk of default is high.
✓ High counts of loans under 10000 are issued without verification, and these are charging off.
✓ A banding pattern is observed in funding amounts at intervals of 1000, and more prominently at intervals of 5000.
✓ DTI vs revolving utilization: the clustering of charged-off accounts at top right seems logical. As both revolving utilization % and debt-to-income ratio increase, the tendency to charge off increases.
✓ Fund amount vs interest rate: the interest rate seems to show horizontal banding as well. It seems that many professionals of the same type qualify for similar interest rates; there seems to be a correlation with profession.
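The banding of funding amounts noted above can be quantified by measuring what share of loans fall exactly on a round interval. The sketch below is a Python illustration; the helper name and the sample amounts are made up and do not come from the analyzed dataset.

```python
def share_of_multiples(amounts, step):
    """Return the fraction of amounts that are exact multiples of `step`."""
    if not amounts:
        raise ValueError("amounts must not be empty")
    hits = sum(1 for amount in amounts if amount % step == 0)
    return hits / len(amounts)


# Made-up funded amounts, for illustration only.
sample_amounts = [5000, 10000, 12000, 7250, 15000, 9600, 20000, 3000]

print(share_of_multiples(sample_amounts, 1000))
print(share_of_multiples(sample_amounts, 5000))
```

A share well above what evenly spread amounts would give (1 in `step`) suggests that borrowers or lenders favor round figures.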
Summary
5 important driver variables
➢ DTI – debt to income ratio
➢ Interest rate
➢ Revolving utilization ratio
➢ Funding amount
➢ Sub grade
➢ Verification status
[Chart: values 5, 15, 20, 25, 35; labels not recoverable]
Thank You.