
Application of ANN

in Predicting Credit
Card Default

By
Jacob Philip
16EE117
Credit Card Default

Credit card default happens when a customer does not
make the minimum payment by the listed due date.

Problems for the Bank


• Manpower is spent retrieving the payment.
• Time is wasted.
• Money is lost.
Solution → Use an ANN to predict which customers will default on the
next payment.
ANN Approach

Benefits
• ANNs have the ability to learn and model non-linear and
complex relationships between inputs and outputs.
• After learning, ANNs can predict on unseen data, so the
model generalizes well.
• ANNs can work with incomplete knowledge (depending on the
importance of the missing data).
Objective
The objective of this project is to analyse the available data
of the customers with a neural network model and to
classify the customers as defaulting or not in the next
month.

The neural network model used should be able to learn
efficiently and perform with minimum error on the test data.

The dataset used contains customer information such as age,
education, marital status and payment history, which will be
preprocessed and given as input to the model.
Data - Filtering, preprocessing, and splitting the data into training,
validation and test sets.

Model - Creating an MLP network with the appropriate number of input
neurons, number of hidden layers and one output layer.

Training - Training the model with the backpropagation algorithm,
adjusting the training parameters appropriately for efficient learning.

Performance - Using the test data and a confusion matrix to measure
performance.

Optimising - Changing the number of hidden layers and adjusting the
learning parameters.
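The data-splitting step above can be sketched in Python. The deck does not specify a split ratio, so the 70/15/15 proportions and the NumPy-based shuffling below are assumptions:

```python
import numpy as np

def split_dataset(X, y, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle and split arrays into training, validation and test sets.

    The 70/15/15 split is an assumed ratio, not taken from the slides.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))          # shuffle sample indices
    n_test = int(len(X) * test_frac)
    n_val = int(len(X) * val_frac)
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return (X[train], y[train]), (X[val], y[val]), (X[test], y[test])

# Toy stand-in for the preprocessed customer data.
X = np.arange(200, dtype=float).reshape(100, 2)
y = np.arange(100) % 2
train, val, test = split_dataset(X, y)
```

Shuffling before splitting matters here because the raw file may be ordered (e.g. by customer ID), which would otherwise bias the test set.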
About the Dataset
• The dataset is based on credit
payments of customers in
Taiwan, collected over a
period of 6 months from April
to September.
• There are 30,000 samples of
individual customer data in
the dataset. However, the
dataset includes customers
who had not used credit for
the full 6 months; these
samples are discarded.
The following figure shows the
number of samples for each
case after filtering the
dataset.
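The filtering of inactive customers can be sketched with pandas. The column names below follow the public UCI version of this dataset (`BILL_AMT1..6`, `PAY_AMT1..6`) and are an assumption, since the slides do not name the columns:

```python
import pandas as pd

# Assumed column names (UCI "default of credit card clients" convention).
bill_cols = [f"BILL_AMT{i}" for i in range(1, 7)]
pay_cols = [f"PAY_AMT{i}" for i in range(1, 7)]

def drop_inactive(df):
    """Discard customers whose bills and payments are zero in all 6 months."""
    inactive = df[bill_cols].eq(0).all(axis=1) & df[pay_cols].eq(0).all(axis=1)
    return df.loc[~inactive].reset_index(drop=True)

# Tiny illustration: the second customer never used credit and is dropped.
demo = pd.DataFrame(
    {c: [1200, 0] for c in bill_cols} | {c: [300, 0] for c in pay_cols}
)
filtered = drop_inactive(demo)
```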
Dataset Information
1. Amount of given credit - the credit limit given to the
customer.
2. Gender - 1 = male, 2 = female.
3. Education - 1 = graduate school, 2 = university, 3 = high
school, 4 = others.
4. Marital status - 1 = married, 2 = single, 3 = other.
5. Age - in years.

The following attributes have monthly data over 6 months,
from April to September 2005.
6. Status of past payments - range (-2 to 9).
(-1) → paid duly, 1 → payment delay of 1
month, 2 → payment delay of 2 months, and so on.
7. Amount of the credit bill due - (integer).
Amount of credit the customer has transacted
during the particular month.
8. Amount paid by customer - (integer).
Amount paid by the customer each month, according to
the terms and conditions, to maintain credit status.
9. Default next month - (binary).
This is the output variable used to classify
whether the customer will default next month: 0 → not
defaulting, 1 → defaulting.
Dataset Preprocessing
The primary purpose of data preprocessing is to modify
the input variables so that training is faster and more
accurate.
The dataset contains large numerical values in the
payment columns. Feeding these directly to the model will
result in large weight values, and for convergence to
occur faster the learning rate would have to be increased.
The data transformations are categorised into three groups:
• Linear scaling.
• Statistical standardization (using deviation from the mean).
• Various other mathematical functions.
Some of the linear scalings considered are:

1. Min-max scaling:
Xnew = (Xold - Xmin)/(Xmax - Xmin)

2. Dividing by the maximum value:
Xnew = Xold/Xmax

3. Scaling to the interval (0.1-0.9):
Xnew = [0.9(X - Xmin) - 0.1(X - Xmax)]/(Xmax - Xmin)
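The three linear scalings can be written directly in NumPy; the toy payment column below is an illustration, not real data from the dataset:

```python
import numpy as np

def min_max(x):
    # Xnew = (X - Xmin)/(Xmax - Xmin): maps the column onto [0, 1].
    return (x - x.min()) / (x.max() - x.min())

def divide_by_max(x):
    # Xnew = X/Xmax.
    return x / x.max()

def scale_to_interval(x):
    # Xnew = [0.9(X - Xmin) - 0.1(X - Xmax)]/(Xmax - Xmin):
    # maps the column onto [0.1, 0.9].
    return (0.9 * (x - x.min()) - 0.1 * (x - x.max())) / (x.max() - x.min())

bill = np.array([0.0, 50_000.0, 100_000.0])  # toy payment column
```

The (0.1, 0.9) interval keeps inputs away from the flat tails of the logistic sigmoid, where gradients are near zero.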
MLP Classifier
A multilayer perceptron is a
variant of the original
perceptron model. It has
one or more hidden layers
between its input and output
layers; the neurons are
organized in layers, the
connections are always
directed from lower layers to
upper layers, and neurons in
the same layer are not
interconnected.
The MLP model being implemented will have:

• Input layer - this layer has as many neurons as there are
input columns, i.e. 23 inputs.

• Hidden layers - the number of hidden layers and the number of
neurons will be determined through trial and error by
comparing performance.

• Output layer - the dataset has one output label, so
there is one output neuron in this layer.

• The transfer function chosen is the logistic sigmoid.

Training
The MLP is trained using the backpropagation algorithm.
• Output of a neuron - F_i(a_i), where F_i is the transfer
function of the i-th neuron and a_i = ∑_j w_ij F_j(a_j) is its input,
a weighted sum of all neuron outputs from the
previous layer (indexed by j).

• Weight-update rule - ∆w_ij = -η ∂C/∂w_ij, where η is the
learning rate, C is the error/cost function and w_ij is the weight of
the connection from the i-th neuron of the previous layer to the
j-th neuron of the next layer.

The transfer function used is the logistic sigmoid:
f(x) = 1/(1 + e^(-x)), f'(x) = f(x)(1 - f(x)).

The cost function used is the sum squared error:
E_SSE = (1/2) ∑_p ∑_j (targ_j - out_j)^2

The weight update for the MLP simplifies to
∆w_hl = η ∑ δ_l^n · out_h^(n-1)

where δ_l^n = (∑_k δ_k^(n+1) · w_lk^(n+1)) · f'(∑_j out_j^(n-1) · w_jl^n)
is backpropagated from the output layer to the input layers.
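A minimal NumPy sketch of these update rules for a single hidden layer follows; the network sizes, learning rate and toy data are assumptions for illustration:

```python
import numpy as np

def sigmoid(a):
    # Logistic sigmoid f(x) = 1/(1 + e^(-x)).
    return 1.0 / (1.0 + np.exp(-a))

def train_step(X, t, W1, W2, lr):
    """One batch of backpropagation on a 1-hidden-layer MLP with SSE cost."""
    h = sigmoid(X @ W1)                     # hidden-layer outputs
    y = sigmoid(h @ W2)                     # network output
    # Output delta: dE/da = (y - t) * f'(a), with f' = y(1 - y).
    delta_out = (y - t) * y * (1 - y)
    # Hidden delta: backpropagate through W2, multiply by local f'.
    delta_hid = (delta_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ delta_out              # ∆w = -η ∂E/∂w
    W1 -= lr * X.T @ delta_hid
    return 0.5 * np.sum((t - y) ** 2)       # E_SSE before this update

rng = np.random.default_rng(0)
X = rng.random((64, 23))                    # toy batch, 23 inputs
t = (X[:, 0] > 0.5).astype(float).reshape(-1, 1)
W1 = rng.normal(0, 0.1, (23, 8))            # assumed 8 hidden neurons
W2 = rng.normal(0, 0.1, (8, 1))
errs = [train_step(X, t, W1, W2, lr=0.5) for _ in range(200)]
```

The sum-squared error recorded each step should fall as the weights converge.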
Performance
TP - true positives
TN - true negatives
FP - false positives
FN - false negatives
T - total cases = TP + TN + FP + FN

Measurement formulae:

• Accuracy = (TP + TN)/T

• Error = (FP + FN)/T
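The counts and formulae above can be computed directly from the true and predicted labels; the six-sample vectors here are an illustration:

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = defaulting)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

y_true = [1, 0, 1, 1, 0, 0]   # toy labels
y_pred = [1, 0, 0, 1, 1, 0]   # toy model output
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
total = tp + tn + fp + fn
accuracy = (tp + tn) / total  # (TP + TN)/T
error = (fp + fn) / total     # (FP + FN)/T
```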
Optimizing
Through fine tuning of the learning parameters and initial
weights, the neural network can achieve shorter learning
times and better generalization.

• Initializing the weights to different sets of small random
numbers to avoid poor local minima.
• Tuning the momentum factor to overcome flat spots.
• Changing the number of hidden layers.
• Dynamically changing the learning rate during training.
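Dynamic learning-rate adjustment can take many forms; the step-decay schedule sketched below (halving every 10 epochs) is one common scheme and an assumption, as the slides do not specify one:

```python
def decayed_lr(initial_lr, epoch, drop=0.5, every=10):
    """Step decay: multiply the learning rate by `drop` every `every` epochs.

    The halving-every-10-epochs schedule is an assumed example, not a
    scheme prescribed by the slides.
    """
    return initial_lr * (drop ** (epoch // every))

# lr falls from 0.1 to 0.05 after 10 epochs, 0.025 after 20, and so on,
# taking large steps early and smaller steps as training converges.
```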
References
• Hassan Ramchoun, Mohammed Amine Janati Idrissi, Youssef Ghanou,
Mohamed Ettaouil. Multilayer Perceptron: Architecture, Optimization
and Training.
• Sofia Visa, Brian Ramsay, Anca Ralescu, Esther van der Knaap.
Confusion Matrix-based Feature Selection.
• Krystyna Kuźniar, Maciej Zając. Some Methods of Pre-processing
Input Data for Neural Networks.
• Yeh, I. C., & Lien, C. H. The Comparisons of Data Mining Techniques
for the Predictive Accuracy of Probability of Default of Credit Card
Clients.
• Foram S. Panchal, Mahesh Panchal. Review on Methods of Selecting
Number of Hidden Nodes in Artificial Neural Network.
