
Developing A Credit Scoring Model

INTRODUCTION
The dataset used for the project is the German credit dataset, which consists of customers’ financial and credit information and the resulting classification of each customer as a “good” or “bad” credit risk. This well-known, publicly available dataset contains observations on 20 variables for 1,000 past applicants, of whom 700 are classified as “good” credit risks and 300 as “bad” credit risks.
This report details the steps involved in developing a credit scoring model that can be used to determine whether a new applicant is a good or a bad credit risk, based on the applicant’s predictor variables.
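As a reference point for any such model, the class counts quoted above (700 “good”, 300 “bad”) already fix a majority-class baseline. The short Python sketch below works through that arithmetic. It is not part of the original SAS/SPSS workflow. The function name `class_summary` is hypothetical, and the Gini impurity is included only as one common way to describe class mix at the root of a decision tree; the report does not state which split criterion was used.

```python
def class_summary(n_good: int, n_bad: int) -> dict:
    """Summarize a two-class target from its raw counts."""
    total = n_good + n_bad
    p_good = n_good / total
    p_bad = n_bad / total
    return {
        "total": total,
        "p_good": p_good,
        "p_bad": p_bad,
        # Accuracy of a model that always predicts the larger class.
        "majority_baseline_accuracy": max(p_good, p_bad),
        # Gini impurity of the unsplit data (an assumed, illustrative criterion).
        "gini_impurity": 1.0 - p_good**2 - p_bad**2,
    }


# Counts taken from the dataset description: 700 good, 300 bad.
summary = class_summary(n_good=700, n_bad=300)
for name, value in summary.items():
    print(f"{name}: {value:.4g}")
```

A model that labels every applicant “good” is right 70% of the time on this data, so a useful scoring model has to beat that figure, and accuracy alone says little about how well the “bad” risks are caught.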
Tools Used:
SAS Enterprise Miner 4.3
IBM SPSS Statistics 22
Modeling Techniques Used:
Decision Tree
DATA PREPARATION AND EXPLORATION
The modeling process used in this project follows the Enterprise Miner SEMMA methodology, which stands for Sampling, Exploring, Modifying, Modeling, and Assessing data. The goal of the project is to develop a credit scoring model that can be used to predict the credit risk of prospective customers. The next step was therefore to prepare the collected data.
The German credit dataset was provided in comma-separated values (.csv) format. When the dataset was opened in MS Excel, the values of the variables appeared as numeric codes with no indication of what they meant. A screenshot of the data as viewed in Excel is provided in Figure 1.
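Once a description of the codes is available, numeric values like these can be translated into readable labels. The following Python sketch illustrates the idea on a tiny inline sample rather than the real file. Everything specific in it is an assumption made for illustration: the column names, the sample rows, the `CODEBOOK` mapping (including which code means “good” or “bad”), and the helper `decode_rows` do not come from the original project, which used SAS Enterprise Miner and SPSS.

```python
import csv
import io

# Hypothetical codebook: column name -> {raw code: readable label}.
# The real mapping must be taken from the dataset's own description.
CODEBOOK = {
    "checking_status": {"1": "category A", "2": "category B"},
    "credit_risk": {"1": "good", "2": "bad"},
}

# Invented sample standing in for the .csv file; not actual records.
SAMPLE_CSV = """checking_status,duration_months,credit_risk
1,6,1
2,48,2
"""


def decode_rows(csv_text: str, codebook: dict) -> list:
    """Replace coded values with labels; leave unmapped values unchanged."""
    reader = csv.DictReader(io.StringIO(csv_text))
    decoded = []
    for row in reader:
        decoded.append(
            {col: codebook.get(col, {}).get(val, val) for col, val in row.items()}
        )
    return decoded


rows = decode_rows(SAMPLE_CSV, CODEBOOK)
for row in rows:
    print(row)
```

Columns without an entry in the codebook, such as the numeric duration here, pass through untouched, so the same helper can be applied to a file that mixes coded and genuinely numeric variables.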
The description of the data was provided separately (See