
Journal of Intelligent & Fuzzy Systems 34 (2018) 1543–1549
DOI: 10.3233/JIFS-169449
IOS Press

Hybrid credit scoring model using neighborhood rough set and multi-layer ensemble classification
Diwakar Tripathi∗, Damodar Reddy Edla and Ramalingaswamy Cheruku
Department of Computer Science and Engineering, National Institute of Technology Goa, Ponda, India

Abstract. Credit scoring is a procedure to estimate the risk associated with credit products, calculated using applicants' credentials and applicants' historical data. However, the data may contain redundant and irrelevant information and features, which lead to lower accuracy of the credit scoring model, so eliminating the redundant features can improve credit scoring on such datasets. In this work, we propose a hybrid credit scoring model based on dimensionality reduction by the Neighborhood Rough Set (NRS) algorithm and on layered ensemble classification with a weighted voting approach to improve the classification performance. For classifier ranking, we propose a novel ranking algorithm as an underlying model for representing the ranks of the classifiers based on classifier accuracy. It is applied to seven heterogeneous classifiers to find their ranks, and the five best-ranking classifiers are then used as base classifiers in the layered ensemble framework. The results of the ensemble frameworks (Majority Voting (MV), Weighted Voting (WV), Layered Majority Voting (LMV) and Layered Weighted Voting (LWV)) with all features and after feature reduction by various existing feature selection algorithms are compared in terms of accuracy, sensitivity, specificity and G-measure. Further, the results of the ensemble frameworks with NRS are also compared in terms of ROC curve analysis. The experimental outcomes reveal the success of the proposed methods on two benchmark credit scoring datasets (Australian credit scoring and German loan approval) obtained from the UCI repository.

Keywords: Weighted voting, classification, feature selection, ensemble learning, credit scoring

1. Introduction

Credit scoring is a way of calculating the risk associated with credit products [1] using applicants' credentials (such as annual income, job status, residential status, etc.), historical data and statistical techniques. It tries to separate the effects of various applicant characteristics on criminal behavior and defaults. The main focus of a credit scoring model is to determine whether a credit consumer belongs to the legitimate or to the suspicious customer group. Credit scoring is not a single-step process; it is done by financial institutions in several phases, such as application scoring, behavioral scoring, collection scoring and fraud detection [2]. When a new application for credit arrives, application scoring is done to evaluate the legitimateness or suspiciousness of the new applicant. That evaluation is done on the basis of social, financial and other data collected at the time of the application. Behavioral scoring is similar to application scoring, but it is for existing customers and analyzes the existing consumer's behavior patterns to support dynamic portfolio management processes. Collection scoring is about separating the customers into different groups (early, middle, late recovery), to put more, moderate or less attention on these groups. Fraud scoring models rank the applicants according to the relative probability that an applicant may be dishonest. In this paper, our focus is on the application scoring problem.

∗ Corresponding author. Diwakar Tripathi, Department of Computer Science and Engineering, National Institute of Technology Goa, Ponda-403401, India. E-mail: diwakartripathi@nitgoa.ac.in.


Many researchers, such as [3–5], have proposed bio-inspired algorithms for feature selection to improve the performance of classifiers. Ping et al. (2011) [5] proposed a hybrid classifier for credit scoring based on a Neighborhood Rough Set and a Support Vector Machine (SVM). Hu et al. [6, 7] have presented a neighborhood rough set model to resolve the problem of heterogeneous feature subset selection. Neighborhood relations are a kind of similarity relation which satisfies the properties of reflexivity and symmetry; they draw objects together by similarity or indistinguishability in terms of distance, and the samples in the same neighborhood granule are close to each other.

Conventional credit scoring models are based on individual classifiers or on a simple combination of such classifiers, which tend to show moderate performance. Many classifiers, such as Naive Bayes (NB), SVM, Decision Tree (DT) and Neural Network based classifiers, have been proposed thus far. However, all of them have their own positive and negative aspects, so they are good only for specific problems, and there is no specific way to recognize which classifier is better for a specific problem. Thus an ensemble classifier is a strong approach to produce a near-optimal classifier for any problem [8]. This method reinforces the ensemble in error-prone subspaces and hence can lead to better classification performance. Generally, the combination of diverse classifiers gives better classification results [9, 10]. Basically, there are two types of ensemble frameworks: homogeneous and heterogeneous [10, 11]. In a homogeneous ensemble framework, base classifiers of the same type are used, whereas a heterogeneous ensemble framework is composed of base classifiers of different types. The most popular ways to combine the base classifiers are majority voting and weighted voting [9, 12]. A multi-layer classifier ensemble framework [10, 11] is used based on the optimal combination of heterogeneous classifiers. The multi-layer model overcomes conventional performance bottlenecks by utilizing an ensemble of five heterogeneous classifiers.

The rest of the paper is organized as follows: Section 2 describes our proposed work flow for credit scoring data classification. Section 3 presents the experimental results obtained from the proposed model and the relative performance of different feature selection techniques, followed by the ensemble classification performance. It is followed by concluding remarks and references.

2. Proposed methodology

Here we propose a hybrid approach which combines the NRS approach for feature selection with layered ensemble classification, as shown in Fig. 1. Our proposed approach for credit scoring works in three phases: the first phase finds the ranks of the classifiers and assigns weights to them, the second is associated with feature selection using the NRS algorithm, and the third phase is the construction of the ensemble framework. Preprocessed data, where preprocessing includes data cleaning, data transformation and data normalization, is used for the rank assignment and feature selection phases. Detailed descriptions of the proposed hybrid approach are given in the following subsections.

Fig. 1. Proposed work flow for credit scoring.
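Before detailing the three phases, the two base combination rules used throughout the paper (majority voting and weighted voting, Section 1) can be illustrated with a minimal NumPy sketch. This is not part of the paper's MATLAB implementation; the weights are the Australian single-layer weights reported later in Table 4, and the decision threshold of half the total weight is an assumption.

```python
import numpy as np

# Binary predictions (0/1) from five hypothetical base classifiers;
# rows are samples, columns are classifiers.
predictions = np.array([
    [1, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 0, 1, 1, 1],
])

# Majority voting: every classifier gets one equal vote.
majority = (predictions.sum(axis=1) >= 3).astype(int)

# Weighted voting: votes are scaled by per-classifier weights (Table 4,
# Australian dataset); the threshold of half the total weight is an assumption.
weights = np.array([0.1644, 0.1381, 0.2589, 0.2365, 0.2020])
weighted = (predictions @ weights >= 0.5 * weights.sum()).astype(int)

print("majority voting:", majority)
print("weighted voting:", weighted)
```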

2.1. Phase-1: Assignment of ranks and weights to classifiers

There is no specific way to recognize which classifier is a better or good classifier for a specific problem; thus ensemble learning is a strong approach to produce a near-optimal classifier for any problem. In this phase, seven classifiers, namely Naive Bayes (NB), Multilayer Feed Forward Neural Network (MLFFNN), Decision Tree (DT), Quadratic Discriminant Analysis (QDA), Time Delay Neural Network (TDNN), Deep Tensor Neural Network (DTNN) and Probabilistic Neural Network (PNN), are initially utilized to find the rank of each classifier. Equation (1) is used as the underlying model for representing the ranking of the classifiers. Further, the five classifiers with the best ranking are arranged as in Fig. 1. In the multi-layer ensemble framework, C1 and C2, with the highest ranking, are at the second layer and the remaining three classifiers are in the first layer.

In this work we use a weighted voting approach with the heterogeneous multi-layer ensemble framework. In weighted voting, a weight is assigned to each base classifier. For weight assignment, classification accuracy is used as the parameter to calculate the weights of the classifiers: the classifier with the highest accuracy is assigned the highest weight and the classifier with the lowest accuracy is assigned the lowest weight. These weights are calculated by Equation (1). Initially, equal weights are assigned to each base classifier, and the dataset is then classified to calculate the accuracy. The weights are updated with Equation (1), this procedure is repeated for n iterations, and the mean of the updated weights over the n iterations is assigned to the respective classifier.

W_U^{i,j} = W_o^{i,j} + \frac{1}{2} \log\left(\frac{Acc_j}{1 - Acc_j}\right)    (1)

where W_U^{i,j} and W_o^{i,j} represent the updated and the old weight at the i-th iteration for the j-th classifier, and Acc_j represents the accuracy obtained by the j-th classifier.

2.2. Phase-2: Feature selection

Feature selection is one of the most basic issues in the field of machine learning. The main aim of feature selection is to determine a minimal subset of features from a set of features. In other words, feature selection is the process of finding a subset of features that ideally is necessary and sufficient to describe the target concept from the original set of features in a given dataset. Here we use the NRS feature selection approach to select the best features from a feature set. The complete procedure is described in Algorithm 1 [6, 7, 13].

Algorithm 1: NRS algorithm for feature selection
Input: Hybrid decision table <U, A_c ∪ A_n ∪ D>, where U is the sample set, called the universe, A_c and A_n are the categorical and numerical attributes respectively, β is the threshold for computing variable precision lower approximations, and k is the size of the k-nearest neighborhood.
Output: One reduct red.
1: ∀a ∈ A_c: compute the equivalence relation R_a;
2: ∀a ∈ A_n: compute the neighborhood relation N_a or N_a^k;
3: red ← ∅ (red is the pool to contain the selected attributes);
4: for each a_i ∈ A − red: compute SIG(a_i, red, D) = \gamma^{\beta}_{red \cup a_i}(D) − \gamma^{\beta}_{red}(D), where \gamma^{\beta}_{\emptyset}(D) = 0; end
5: select the attribute a_k which satisfies SIG(a_k, red, D) = max_i(SIG(a_i, red, D));
6: if SIG(a_k, red, D) > 0 then red ← red ∪ {a_k} and go to step 4; else return red;
7: end

2.3. Phase-3: Ensemble framework

The proposed ensemble framework is shown in Fig. 1, where C1, C2, C3, C4 and C5 are the classifiers chosen as the best among the seven heterogeneous classifiers in Phase-1. Data with the selected features is fed, with the weights assigned to the respective classifiers, for evaluation of the final results against the input samples. Further, the five classifiers with the best ranking are arranged as in Fig. 1. In this framework, C1 and C2, with the highest ranking, are at the second layer and the remaining three classifiers are in the first layer. Combiner-1 aggregates the results obtained by the three classifiers associated with it, and Combiner-2 aggregates the results obtained by the two classifiers associated with it together with the result obtained by Combiner-1. Combiner-1 and Combiner-2 aggregate the outputs predicted by the associated classifiers using Equation (2):

O = W_1 X_1 + W_2 X_2 + \dots + W_n X_n    (2)

where W_i and X_i are the weight and the predicted output of the i-th classifier, respectively.
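The following is a simplified, illustrative Python sketch of the greedy forward selection in Algorithm 1 above. It handles numerical attributes only, uses a fixed neighborhood radius delta and a brute-force neighborhood dependency in place of the variable-precision γ^β, so it is a sketch of the idea rather than the authors' MATLAB implementation; delta, the Euclidean metric and the synthetic data are assumptions.

```python
import numpy as np

def dependency(X, y, attrs, delta=0.2):
    """gamma_B(D): fraction of samples whose delta-neighborhood on the
    attribute subset `attrs` is pure (all neighbors share the sample's class)."""
    if not attrs:
        return 0.0
    Xs = X[:, attrs]
    consistent = 0
    for i in range(len(Xs)):
        dist = np.linalg.norm(Xs - Xs[i], axis=1)
        neighbors = dist <= delta
        if np.all(y[neighbors] == y[i]):
            consistent += 1
    return consistent / len(Xs)

def nrs_forward_selection(X, y, delta=0.2):
    remaining = list(range(X.shape[1]))
    red = []
    current = 0.0                      # gamma of the empty set is 0 (step 4)
    while remaining:
        sig = [(dependency(X, y, red + [a], delta) - current, a) for a in remaining]
        best_gain, best_a = max(sig)
        if best_gain <= 0:             # step 6: stop when no attribute adds dependency
            break
        red.append(best_a)
        remaining.remove(best_a)
        current += best_gain
    return red

# Toy usage with synthetic, min-max normalised data.
rng = np.random.default_rng(0)
X = rng.random((100, 6))
y = (X[:, 0] + 0.2 * X[:, 3] > 0.6).astype(int)
print("selected attribute indices:", nrs_forward_selection(X, y))
```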
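To make Equations (1) and (2) concrete, the sketch below implements the weight update and the two-layer weighted-voting combination in NumPy. The classifier predictions, the number of iterations n, the fixed per-iteration accuracies and the decision threshold of half the total weight are assumptions (in the paper the accuracy is re-estimated each iteration and the implementation is in MATLAB); the layer weights are taken from Table 5 for the Australian dataset.

```python
import numpy as np

def update_weight(w_old, acc):
    """Equation (1): W_U = W_o + 0.5 * log(acc / (1 - acc))."""
    return w_old + 0.5 * np.log(acc / (1.0 - acc))

def iterate_weights(accuracies, n_iter=10):
    """Start from equal weights, update with Eq. (1) each iteration and
    return the mean of the updated weights, as described in Phase-1."""
    k = len(accuracies)
    w = np.full(k, 1.0 / k)
    history = []
    for _ in range(n_iter):
        w = update_weight(w, np.asarray(accuracies))
        history.append(w.copy())
    return np.mean(history, axis=0)

def weighted_vote(preds, weights):
    """Equation (2): O = sum_i W_i * X_i, thresholded to a 0/1 decision."""
    score = preds @ weights
    return (score >= 0.5 * weights.sum()).astype(int)

def layered_ensemble(pred_c1, pred_c2, pred_c3, pred_c4, pred_c5, w_layer1, w_layer2):
    """Combiner-1 fuses the three layer-1 classifiers (C3-C5); Combiner-2 fuses
    Combiner-1's output with C1 and C2, matching the weight order of Table 5."""
    layer1 = weighted_vote(np.column_stack([pred_c3, pred_c4, pred_c5]), w_layer1)
    return weighted_vote(np.column_stack([layer1, pred_c1, pred_c2]), w_layer2)

# Toy usage with random 0/1 predictions and the Australian weights of Table 5.
rng = np.random.default_rng(1)
p = rng.integers(0, 2, size=(8, 5))
w1 = np.array([0.4084, 0.2298, 0.3618])   # W11, W12, W13 -> C3, C4, C5
w2 = np.array([0.2253, 0.3861, 0.3887])   # W21, W22, W23 -> Combiner-1, C1, C2
print(layered_ensemble(p[:, 0], p[:, 1], p[:, 2], p[:, 3], p[:, 4], w1, w2))
```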

3. Experimental results

This section has four subsections, which cover the datasets used in this work, the performance measures used to validate the proposed work, the feature selection results obtained by various feature selection approaches along with the results obtained by the various ensemble frameworks, and the ROC curve analysis.

3.1. Datasets used in experiment

The Australian dataset and the German (categorical) dataset are used in this work. These datasets are acquired from the UCI Machine Learning Repository [14, 15]. Both datasets have a combination of continuous and nominal attribute types. The Australian dataset is related to credit approval and the German dataset is related to loan applications. To protect the confidentiality of the data, the values of some attributes have been replaced by random meaningless symbols. A detailed description of the datasets is given in Table 1.

Table 1. Description of the datasets used
Dataset      Samples   Classes   Features   Class-1/Class-2   Categorical/Numerical
Australian   690       2         14         383/317           6/8
German       1000      2         20         700/300           13/7

3.2. Performance measures

There are several measures for evaluating classification performance commonly available in the literature. Accuracy (Equation 3) is not sufficient as a performance measure if there is significant class imbalance towards one class in the dataset. The datasets used in this experimental work are binary-class datasets with a positive (credit approved) and a negative (credit not approved) class. Sensitivity (Equation 4) represents the prediction accuracy on the positive samples only, and specificity (Equation 5) is the prediction accuracy on the negative samples. G-measure (Equation 6) is a measure of a test's accuracy; it considers both the positive and the negative accuracy of the test and can be interpreted as the geometric mean of sensitivity and specificity. It is 1 in the best case and 0 in the worst case. The four quantities defined with respect to the confusion matrix are True Positive (TP), False Positive (FP), True Negative (TN) and False Negative (FN), where positive corresponds to credit approved and negative corresponds to credit not approved cases: observed positive and actually positive is TP, observed negative and actually negative is TN, observed positive and actually negative is FP, and observed negative and actually positive is FN.

Accuracy = \frac{TP + TN}{TP + TN + FP + FN}    (3)

Sensitivity = \frac{TP}{TP + FN}    (4)

Specificity = \frac{TN}{TN + FP}    (5)

G\text{-}measure = \sqrt{Sensitivity \times Specificity}    (6)
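A minimal sketch of the measures in Equations (3)–(6), assuming a 0/1 label encoding with 1 meaning credit approved:

```python
import numpy as np

def scores(y_true, y_pred):
    """Accuracy, sensitivity, specificity and G-measure from a 0/1 confusion matrix."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    accuracy    = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    g_measure   = np.sqrt(sensitivity * specificity)
    return accuracy, sensitivity, specificity, g_measure

print(scores([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0]))
```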
3.3. Results and analysis

The experiments described in this section are performed on a DELL PC with a 3.60 GHz Intel Core i7 vPro CPU, 8 GB RAM and a 64-bit Windows 7 operating system. The implementation is done using Matlab R2012a. As per the proposed model, the preprocessed dataset is applied to the layered ensemble framework for classification. In this work, preprocessing consists of four main steps: treatment of missing values, transformation, normalization and feature selection. The features selected on the Australian and German datasets by the various approaches, namely Stepwise Regression (STEP), Classification and Regression Tree (CART), Correlations (CORR), Multivariate Adaptive Regression Splines (MARS), T-Test and NRS, are presented in Tables 2 and 3, respectively.

Table 2. Selected features in the Australian dataset
Method   Features                                   No. of Features
STEP     7, 8, 9, 12, 14                            5
CART     5, 7, 8, 9, 10, 14                         6
CORR     2, 3, 4, 7, 8, 9, 10, 14                   8
MARS     3, 5, 7, 8, 10, 14                         6
T-Test   2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14     12
NRS      8, 9, 14                                   3

Table 3. Selected features in the German dataset
Method   Features                                              No. of Features
STEP     1, 2, 3, 7, 8, 15, 20                                 7
CART     1, 2, 3, 4, 5, 9, 10, 18                              8
CORR     1, 2, 3, 5, 6, 7, 12, 13, 14, 15, 20                  11
MARS     1, 2, 3, 4, 5, 6, 9, 15, 16, 17, 20                   11
T-Test   1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 20     15
NRS      1, 2, 3, 4, 6, 7, 8, 9, 11, 12, 13, 14                12

In this work we use the weighted voting approach, so the weights assigned to the classifiers are calculated as described in the earlier section for both cases (the five classifiers are combined with weighted voting in the single-layer and in the multi-layer approach). Table 4 presents the weights assigned to the classifiers in the single-layer approach, where W1, W2, W3, W4 and W5 are the weights assigned to classifiers C1, C2, C3, C4 and C5, respectively, for the respective dataset. Table 5 presents the weights assigned to the classifiers in the multi-layer approach, where W11, W12 and W13 are the weights assigned to C3, C4 and C5 (the classifiers at layer 1), respectively, and W21, W22 and W23 are the weights assigned to the result obtained by Combiner-1, to C1 and to C2, respectively, for the respective dataset.

Table 4. Weights assigned to classifiers in the single-layer ensemble framework
Dataset      W1       W2       W3       W4       W5
Australian   0.1644   0.1381   0.2589   0.2365   0.2020
German       0.2185   0.1358   0.2254   0.2290   0.1912

Table 5. Weights assigned to classifiers in the layered ensemble framework
Dataset      W11      W12      W13      W21      W22      W23
Australian   0.4084   0.2298   0.3618   0.2253   0.3861   0.3887
German       0.5837   0.1084   0.3079   0.1822   0.3728   0.4450

To make the 10-Fold Cross Validation (10-FCV) balanced, we arrange a similar number of samples of each class in every fold, because classification algorithms suffer when the class distribution is imbalanced towards one of the classes [16] and the German dataset is an imbalanced dataset. For the 10-FCV, the whole dataset is first divided into two parts, data-1 and data-2, which contain the samples of class 1 and class 2, respectively. Further, data-1 is divided into 10 parts and data-2 is also divided into 10 parts. Data-1 part-1 and data-2 part-1 together form fold-1, and similarly fold-2 and so on. In this section, we compare the 10-FCV averages on the two credit scoring datasets obtained from our proposed work and from the existing feature selection approaches in terms of classification accuracy, sensitivity, specificity and G-measure. These results are presented in Tables 6 and 7 for the Australian and the German dataset, respectively.

Table 6. Performance comparisons on the Australian dataset
Method   Ensemble   Accuracy   Sensitivity   Specificity   G-measure
ALL      MV         90.92      95.70         85.79         90.61
         WV         90.92      95.70         85.79         90.61
         LMV        90.82      93.57         88.19         90.84
         LWV        93.02      98.80         85.86         92.10
STEP     MV         91.70      92.58         90.73         91.65
         WV         91.70      92.58         90.73         91.65
         LMV        93.59      97.18         89.79         93.41
         LWV        93.92      99.38         88.13         93.59
CART     MV         91.38      95.37         87.13         91.16
         WV         91.38      95.37         87.13         91.16
         LMV        93.04      97.89         87.86         92.74
         LWV        93.80      99.51         87.52         93.32
CORR     MV         92.81      95.96         89.46         92.65
         WV         92.81      95.96         89.46         92.65
         LMV        93.20      95.73         90.46         93.06
         LWV        94.74      98.78         90.46         94.53
MARS     MV         91.52      95.34         87.46         91.32
         WV         91.52      95.34         87.46         91.32
         LMV        92.83      96.90         88.52         92.62
         LWV        93.45      99.06         87.52         93.11
T-Test   MV         91.22      94.12         88.13         91.08
         WV         91.22      94.12         88.13         91.08
         LMV        92.65      94.06         91.19         92.61
         LWV        93.29      98.44         87.86         93.00
NRS      MV         92.37      94.45         90.13         92.26
         WV         92.37      94.46         90.13         92.27
         LMV        94.93      97.84         91.86         94.80
         LWV        95.39      99.69         90.86         95.17

Starting with the Australian dataset, regarding accuracy, sensitivity, specificity and G-measure, the proposed approach achieves 95.39%, 99.69%, 90.86% and 95.17%, respectively. On the German dataset, regarding accuracy, sensitivity, specificity and G-measure, the proposed approach achieves 86.47%, 98.78%, 72.33% and 84.53%, respectively. As per the results in Tables 6 and 7, the results obtained by the proposed approach are the best in terms of accuracy, sensitivity and G-measure on both datasets.
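The balanced fold construction described in this subsection can be sketched as follows; the random seed and the use of NumPy are assumptions, and the example label vector mimics the 700/300 split of the German dataset.

```python
import numpy as np

def balanced_folds(y, n_folds=10, seed=0):
    """Split the indices of each class into n_folds parts and combine part i
    of every class to form fold i, keeping the class ratio similar per fold."""
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(n_folds)]
    for label in np.unique(y):
        idx = np.where(np.asarray(y) == label)[0]
        rng.shuffle(idx)
        for part, chunk in enumerate(np.array_split(idx, n_folds)):
            folds[part].extend(chunk.tolist())
    return folds

# Toy usage: an imbalanced 700/300 label vector like the German dataset.
y = np.array([1] * 700 + [0] * 300)
folds = balanced_folds(y)
print([len(f) for f in folds])   # each fold holds roughly 70 class-1 and 30 class-2 samples
```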

Table 7. Performance comparisons on the German dataset
Method   Ensemble   Accuracy   Sensitivity   Specificity   G-measure
ALL      MV         79.36      92.34         51.67         69.07
         WV         79.37      73.97         69.24         71.56
         LMV        80.53      80.47         80.67         80.56
         LWV        84.26      93.62         71.00         81.53
STEP     MV         80.10      92.97         52.67         69.98
         WV         80.11      74.88         69.75         72.27
         LMV        82.98      83.91         81.00         82.44
         LWV        85.68      94.38         74.00         83.57
CART     MV         78.08      95.31         41.33         62.76
         WV         78.08      95.31         41.33         62.76
         LMV        85.85      86.72         84.00         85.35
         LWV        83.83      96.72         56.33         73.81
CORR     MV         78.62      91.88         50.33         68.00
         WV         78.62      91.88         50.33         68.00
         LMV        76.81      77.34         75.67         76.50
         LWV        82.55      94.69         56.67         73.25
MARS     MV         79.89      94.37         49.00         68.00
         WV         79.89      94.38         49.00         68.00
         LMV        82.77      83.28         81.67         82.47
         LWV        84.04      96.09         58.33         74.87
T-Test   MV         80.53      93.28         53.33         70.53
         WV         80.53      93.28         53.33         70.53
         LMV        83.94      83.75         84.33         84.04
         LWV        84.64      97.50         60.33         76.70
NRS      MV         77.98      92.66         46.67         65.76
         WV         77.98      92.66         46.67         65.76
         LMV        82.98      83.28         82.33         82.80
         LWV        86.47      98.78         72.33         84.53

3.4. ROC curve analysis

To validate the separation and discrimination ability of the proposed models and to measure their performance from a different perspective, as well as to measure their sensitivity and specificity over various thresholds, ROC curves are depicted for the MV, WV, LMV and LWV approaches. Figures 2 and 3 display the ROC curves for all the above-mentioned ensemble approaches with the features selected by NRS on the aforementioned datasets for fold-1 (the first 90% as training data and the rest as the test data).

In case of the Australian dataset, the curves of all other ensemble classifiers lie below the LWV ROC curve for all threshold values. This shows that, for the Australian dataset, the LWV method is the best for all required values of sensitivity and specificity. In case of the German dataset, the conclusions are the same as for the Australian dataset.

Fig. 2. Fold-1 ROC on Australian dataset.
Fig. 3. Fold-1 ROC on German dataset.
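A small sketch of how such fold-1 ROC curves can be produced from the ensemble's continuous weighted score before thresholding; the use of scikit-learn's roc_curve and the synthetic labels and scores are assumptions, since the paper's analysis was done in MATLAB.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

rng = np.random.default_rng(2)
y_test = rng.integers(0, 2, size=200)
# Continuous ensemble scores, e.g. O = sum_i W_i * X_i before thresholding.
scores = np.clip(y_test * 0.6 + rng.normal(0.3, 0.25, size=200), 0, 1)

# Each point of the curve is (1 - specificity, sensitivity) at one threshold.
fpr, tpr, thresholds = roc_curve(y_test, scores)
print("AUC:", auc(fpr, tpr))
```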

4. Conclusion

In this paper, a hybrid approach based on NRS for feature selection and multi-layer ensemble classification aggregated with a weighted voting approach is proposed for credit scoring. Further, the results of the proposed approach are compared with MV, WV and LMV, with all features and with the features selected by STEP, CART, CORR, MARS and T-Test, in terms of accuracy, sensitivity, specificity and G-measure on two benchmark datasets, the Australian and the German dataset. The results obtained by the proposed approach outperform all other approaches in terms of accuracy, sensitivity and G-measure on the Australian and German datasets. The proposed approach also outperforms the other approaches, such as MV, WV and LMV, in terms of ROC curve analysis on the Australian and German datasets.

References

[1] L.J. Mester, et al., What's the point of credit scoring? Business Review 3(Sep/Oct) (1997), 3–16.
[2] G. Paleologo, A. Elisseeff and G. Antonini, Subagging for credit scoring models, European Journal of Operational Research 201(2) (2010), 490–499.

[3] B.-W. Chi and C.-C. Hsu, A hybrid approach to integrate genetic algorithm into dual scoring model in enhancing the performance of credit scoring model, Expert Systems with Applications 39(3) (2012), 2650–2661.
[4] C.-L. Huang and J.-F. Dun, A distributed PSO–SVM hybrid system with feature selection and parameter optimization, Applied Soft Computing 8(4) (2008), 1381–1391.
[5] S. Oreski and G. Oreski, Genetic algorithm-based heuristic for feature selection in credit risk assessment, Expert Systems with Applications 41(4) (2014), 2052–2064.
[6] Q. Hu, D. Yu, J. Liu and C. Wu, Neighborhood rough set based heterogeneous feature subset selection, Information Sciences 178(18) (2008), 3577–3594.
[7] Q. Hu, J. Liu and D. Yu, Mixed feature selection based on granulation and approximation, Knowledge-Based Systems 21(4) (2008), 294–304.
[8] H. Parvin, M. MirnabiBaboli and H. Alinejad-Rokny, Proposing a classifier ensemble framework based on classifier selection and decision tree, Engineering Applications of Artificial Intelligence 37 (2015), 34–42.
[9] M. Ala'raj and M.F. Abbod, Classifiers consensus system approach for credit scoring, Knowledge-Based Systems 104 (2016), 89–105.
[10] S. Bashir, U. Qamar and F.H. Khan, IntelliHealth: A medical decision support application using a novel weighted multilayer classifier ensemble framework, Journal of Biomedical Informatics 59 (2016), 185–200.
[11] S. Bashir, U. Qamar, F.H. Khan and L. Naseem, HMV: A medical decision support framework using multi-layer classifiers for disease prediction, Journal of Computational Science 13 (2016), 10–25.
[12] M. Ala'raj and M.F. Abbod, A new hybrid ensemble credit scoring model based on classifiers consensus system approach, Expert Systems with Applications 64 (2016), 36–55.
[13] Y. Ping and L. Yongheng, Neighborhood rough set and SVM based hybrid credit scoring classifier, Expert Systems with Applications 38(9) (2011), 11300–11304.
[14] "Australian dataset." [Online]. Available: https://archive.ics.uci.edu/ml/machine-learning-databases/statlog/australian/
[15] "Statlog (German credit data) data set." [Online]. Available: https://archive.ics.uci.edu/ml/datasets/Statlog+(German+Credit+Data)
[16] J. Van Hulse, T.M. Khoshgoftaar and A. Napolitano, Experimental perspectives on learning from imbalanced data, in Proceedings of the 24th International Conference on Machine Learning, ACM, 2007, pp. 935–942.
