Analytical Roadmap: BFS Capstone Project
The Problem Statement
• CredX is a leading credit card provider that receives thousands of credit card applications every year
• It has experienced credit losses over the past few years
• The CEO believes that the best strategy to mitigate credit risk is to ‘acquire the right customers’.
Project Risk Analysis
Assumptions:
• Rejected data is used only for scorecard verification, not for EDA or modeling.
• Missing data, outliers, and invalid data are treated by replacing them with WOE values or limit values.
Constraints:
• The given dataset is highly imbalanced: defaulters make up only a small percentage of the total dataset.
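As an illustration of the "limit values" treatment, the sketch below caps each value at a lower and an upper limit. The slides do not say how the limits were chosen, so the 1st and 99th percentiles, the helper name `cap_outliers`, and the sample ages are all assumptions made for this example.

```python
import statistics


def cap_outliers(values, lower_pct=1, upper_pct=99):
    """Cap values at the given percentiles (assumed limits, not from the slides)."""
    # quantiles(n=100) returns 99 cut points; cuts[0] is the 1st percentile.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    lower, upper = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, lower), upper) for v in values]


# Hypothetical ages with one invalid low value and one extreme high value.
ages = list(range(20, 70)) + [-3, 250]
capped = cap_outliers(ages)
```

Values inside the limits are returned unchanged; only the two extreme entries are pulled in towards the rest of the data.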
Problem Solving Methodology
Stages: Data Cleaning & Manipulation → Weight of Evidence (WOE) & Information Value Analysis → Model Building → Model Evaluation

Data Cleaning & Manipulation
• Read the Demographic and Credit Bureau data CSV files
• Replace the dot ‘.’ in column names with an underscore ‘_’
• View the summary and structure to understand both datasets
• Remove duplicate Application_ID values from the Demographic and Credit Bureau datasets
• Merge the Demographic and Credit Bureau datasets into a master dataset, using Application_ID as the common variable
• Find and remove rows where the target variable, Performance_Tag, is empty or NA; these rows represent rejected credit card applicants. The remaining rows form the new master dataset
• Create a separate dataset of rejected credit card applicants, i.e. all rows where Performance_Tag is empty or NA. There are 1425 such records in total
• Remove the Application_ID column from the master dataset
• Check for outliers and perform outlier treatment on both continuous and categorical variables
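The cleaning steps above can be sketched with the Python standard library alone. This is a minimal illustration, not the project's actual code: the in-memory sample rows stand in for the two CSV files, the `Age` and `Outstanding.Balance` columns and all helper names are hypothetical, and keeping the first row per duplicated Application_ID is an assumed de-duplication rule.

```python
def clean_columns(rows):
    """Replace '.' with '_' in every column name."""
    return [{key.replace(".", "_"): value for key, value in row.items()} for row in rows]


def drop_duplicate_ids(rows, key="Application_ID"):
    """Keep the first row for each Application_ID (assumed rule)."""
    seen, unique = set(), []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def merge_on_id(left, right, key="Application_ID"):
    """Inner-join two row lists on the shared key."""
    right_by_id = {row[key]: row for row in right}
    return [{**row, **right_by_id[row[key]]} for row in left if row[key] in right_by_id]


def split_rejected(rows, target="Performance_Tag"):
    """Rows with an empty or NA target are treated as rejected applicants."""
    missing = (None, "", "NA")
    rejected = [row for row in rows if row.get(target) in missing]
    master = [row for row in rows if row.get(target) not in missing]
    return master, rejected


# Hypothetical sample data; the real inputs are the two CSV files.
demographic = [
    {"Application.ID": 1, "Age": 34, "Performance.Tag": "0"},
    {"Application.ID": 1, "Age": 34, "Performance.Tag": "0"},  # duplicate
    {"Application.ID": 2, "Age": 51, "Performance.Tag": "1"},
    {"Application.ID": 3, "Age": 27, "Performance.Tag": "NA"},  # rejected
]
credit_bureau = [
    {"Application.ID": 1, "Outstanding.Balance": 1200},
    {"Application.ID": 2, "Outstanding.Balance": 5400},
    {"Application.ID": 3, "Outstanding.Balance": 300},
]

demographic = drop_duplicate_ids(clean_columns(demographic))
credit_bureau = drop_duplicate_ids(clean_columns(credit_bureau))
master, rejected = split_rejected(merge_on_id(demographic, credit_bureau))

# Application_ID is only needed for the merge, so drop it from the master data.
master = [{k: v for k, v in row.items() if k != "Application_ID"} for row in master]
```

The rejected rows are kept in their own list rather than discarded, since the slides reserve them for scorecard verification.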
Problem Solving Methodology
EDA, Weight of Evidence (WOE) & Information Value Analysis

• Perform Exploratory Data Analysis (EDA) on the master dataset
• Create an Information Value (IV) table from the master dataset to identify the important variables
• Create a new dataset with WOE values
• Analyze the distribution of each variable's WOE values against its IV
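For readers unfamiliar with WOE and IV, the sketch below computes both for a single categorical (or pre-binned) variable. The slides do not state which convention was used, so this assumes the common one: Performance_Tag 1 marks a defaulter ("bad"), 0 a non-defaulter ("good"), WOE = ln(share of goods / share of bads) per category, and IV is the sum of (share of goods − share of bads) × WOE. The function name `woe_iv` and the sample income bands are hypothetical.

```python
import math
from collections import Counter


def woe_iv(values, targets):
    """Return ({category: WOE}, IV) for one categorical variable."""
    goods = Counter(v for v, t in zip(values, targets) if t == 0)
    bads = Counter(v for v, t in zip(values, targets) if t == 1)
    total_good = sum(goods.values())
    total_bad = sum(bads.values())

    woe, iv = {}, 0.0
    for category in sorted(set(values)):
        # A 0.5 adjustment avoids log(0) when a category has no goods or no bads.
        good_share = (goods[category] or 0.5) / total_good
        bad_share = (bads[category] or 0.5) / total_bad
        woe[category] = math.log(good_share / bad_share)
        iv += (good_share - bad_share) * woe[category]
    return woe, iv


# Hypothetical income bands: "low" has 3 goods and 1 bad, "high" has 1 good and 3 bads.
income_band = ["low"] * 4 + ["high"] * 4
performance_tag = [0, 0, 0, 1, 0, 1, 1, 1]

woe, iv = woe_iv(income_band, performance_tag)

# The "dataset with WOE values" replaces each category by its WOE.
income_band_woe = [woe[v] for v in income_band]
```

Repeating this for every variable and sorting by IV gives the kind of IV table described above; continuous variables would need to be binned first.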
Problem Solving Methodology
Model Building

• Split the master dataset into train and test datasets
• Apply the SMOTE oversampling technique to address the class imbalance before building the model
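The two steps above can be sketched as follows. This is a simplified, standard-library version of the SMOTE idea (interpolating between a minority point and one of its nearest minority neighbours), not the implementation used in the project. The stratified split, the 30% test fraction, the choice to oversample only the training part, and the sample data are assumptions made for this example.

```python
import math
import random


def stratified_split(rows, labels, test_fraction=0.3, seed=42):
    """Split each class separately so both parts contain both classes."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = int(len(idx) * test_fraction)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    train = ([rows[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([rows[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test


def smote(minority, n_new, k=3, seed=42):
    """Create n_new synthetic points between minority points and their neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Up to k nearest minority neighbours of the base point, excluding itself.
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature data: 12 non-defaulters (0) and 4 defaulters (1).
rows = [[float(i), float(i % 3)] for i in range(12)] + [[20.0 + i, 5.0] for i in range(4)]
labels = [0] * 12 + [1] * 4

(train_x, train_y), (test_x, test_y) = stratified_split(rows, labels)

# Oversample only the training data so the test set stays untouched.
minority = [x for x, y in zip(train_x, train_y) if y == 1]
new_points = smote(minority, train_y.count(0) - train_y.count(1))
balanced_x = train_x + new_points
balanced_y = train_y + [1] * len(new_points)
```

After this step the training data has as many defaulters as non-defaulters, while the test set keeps the original imbalance.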
Problem Solving Methodology
Model Evaluation

• Evaluate the model on the test dataset
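The slides do not list the evaluation metrics, so the sketch below assumes three common ones for a binary default model: accuracy, sensitivity (share of defaulters caught), and specificity (share of non-defaulters correctly passed). The labels and predictions are made-up values, and 1 is assumed to mark a defaulter.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives, with 1 as the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, tn, fp, fn


def evaluate(actual, predicted):
    """Return accuracy, sensitivity, and specificity as a dict."""
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    return {
        "accuracy": (tp + tn) / len(actual),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical test-set labels and model predictions.
actual = [1, 0, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 0, 0, 1, 0]
metrics = evaluate(actual, predicted)
```

Because defaulters are a small share of the data, accuracy alone can look good for a model that misses most of them, which is why sensitivity and specificity are reported alongside it here.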


Future Roadmap
• TBD