
Modeling to predict Churn

Submitted By: Group B-9


Arpit Singh
Nishant Sinha
Pratish Sharman
Pratiyush Rai
Rajatpreet Chhabra
Tanvi Nakra
Modeling Steps
• First, we converted all categorical data into binary (dummy) variables.
• There were 5 categorical variables:
1. Marital Status
2. Occupational group
3. LTV_Group
4. Customer Type
5. Gender
Assumption: In Occupational Group, we merged “Constructionn” into “Construction” and “Others” into “Other”.
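The recoding described above could be sketched in plain Python as follows. This is a minimal illustration, not the group's actual procedure (the slides do not say which tool was used for this step): the helper names `clean_occupation` and `dummy_code` and the sample rows are hypothetical; only the two label merges and the idea of 0/1 indicators with an omitted base category come from the slides.

```python
# Hypothetical sketch: merge misspelled labels, then dummy-code a categorical column.
MERGE_MAP = {"Constructionn": "Construction", "Others": "Other"}  # merges stated on the slide


def clean_occupation(label):
    """Map known label variants onto one canonical label; leave others unchanged."""
    return MERGE_MAP.get(label, label)


def dummy_code(values, base):
    """Return one 0/1 indicator list per category, omitting the base category."""
    categories = sorted(set(values) - {base})
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}


# Made-up sample rows, for illustration only.
occupation = ["Constructionn", "Construction", "Others", "Medical"]
cleaned = [clean_occupation(v) for v in occupation]
indicators = dummy_code(cleaned, base="Medical")
```

Omitting the base category keeps the indicators from being perfectly collinear with the constant term, which is why each categorical variable needs a reference level.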
Truth Table
• For Marital Status: “Other” as the base
• For LTV_Group: “Other” as the base
• For Gender: “Female” as the base
• “Occupational Group” and “Customer Type” had 11 and 9 levels respectively, so we ran a CHAID analysis to classify the levels into groups.
(For Customer Type, DEMOS and ORDRY, i.e. Node 3, form the base; for Occupational Group, Medical, i.e. Node 7, is the base.)
CHAID
Logistic Regression
Dependent Variable: Churn_CAT
Continuous Independent Variables
• Average_Daily_Balance
• Interest_Paid
• Cash_Advances
• Balance_Transferred
• Age_of_Account__Months
• Age_Group
• Bill_Cycle
• Customer_Value
• Credit_Limit
Categorical Independent Variables
• Gender_Cat
• LTV_good
• LTV_vbad
• Married_new
• UnMarried_new
• CT1 (HMTLE)
• CT2 (LMCOA, SMCOA, EMCOA)
• CT3 (DROAM, RENTS)
• CT4 (IROAM)
• OCC1 (Services, Construction, Finance, Trade, Transport, Government, Export)
• OCC2 (Others, Computers)
• OCC3 (Manufacturing)
(For Customer Type, DEMOS and ORDRY, i.e. Node 3, form the base; for Occupational Group, Medical, i.e. Node 7, is the base.)
Initial Results-Logistic Regression
Assumptions
• We used a significance level of 10%.
• So we removed the variables with a Sig. value greater than .10.
• Hence, we removed Average_Daily_Balance, Cash_Advances, Gender_Cat, CT4 and OCC3.
Final Model Result
Classification Table

With a cut-off value of 0.5, the model correctly predicted 99.5% of the customers who did not churn, but only 3.6% of the customers who churned. The overall success rate is 93.3%.
(Predicted probability < 0.5 = 0 (not churned); > 0.5 = 1 (churned))
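To make the classification-table percentages concrete, the sketch below applies a 0.5 cut-off and computes the per-class and overall hit rates. The function name `classification_rates` and the toy probabilities are assumptions for illustration; they are not the study's data.

```python
def classification_rates(probabilities, actual, cutoff=0.5):
    """Return (non-churn hit rate, churn hit rate, overall hit rate)."""
    predicted = [1 if p > cutoff else 0 for p in probabilities]
    pairs = list(zip(actual, predicted))
    correct_0 = sum(1 for a, p in pairs if a == 0 and p == 0)
    correct_1 = sum(1 for a, p in pairs if a == 1 and p == 1)
    n0 = sum(1 for a in actual if a == 0)
    n1 = sum(1 for a in actual if a == 1)
    return correct_0 / n0, correct_1 / n1, (correct_0 + correct_1) / len(actual)


# Toy data (hypothetical): 8 non-churners followed by 2 churners.
probs = [0.05, 0.10, 0.20, 0.30, 0.15, 0.40, 0.25, 0.60, 0.45, 0.70]
actual = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
rates = classification_rates(probs, actual)
```

When one class is much larger than the other, the overall rate is dominated by the larger class, which is how a high overall success rate can coexist with a low hit rate on churners.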
Odds Ratio
• P / (1 - P) is the odds of success
• When the probability of success P is ½ (50-50), the odds of success equal .5 / (1 - .5) = 1.0. This means that success is as likely as failure
• Thus, a predicted probability of .5, odds of 1.0 and, for a predictor, an odds ratio of 1.0 are our points of comparison when making inferences
• The exponentiated beta value, Exp(B), in the SPSS output is the odds ratio: the factor by which the odds change for each unit increase in the independent variable. It can be converted into a percentage change in odds by 100(exp(b) - 1)
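The two formulas on this slide can be written as small helper functions. This is a minimal sketch; the function names `odds` and `percent_change_in_odds` are chosen here for illustration and do not come from the SPSS output.

```python
import math


def odds(p):
    """Odds of success for a probability p, with 0 <= p < 1."""
    return p / (1.0 - p)


def percent_change_in_odds(beta):
    """Percentage change in odds per one-unit increase: 100 * (exp(beta) - 1)."""
    return 100.0 * (math.exp(beta) - 1.0)


even_odds = odds(0.5)                      # probability .5 gives odds of 1.0
no_effect = percent_change_in_odds(0.0)    # beta = 0 leaves the odds unchanged
```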
Odds ratios in logistic regression
• Can be thought of as the impact of each predictor in the model, expressed in terms of the odds of success
• Interval predictors: the ratio of the odds of success for cases that are one unit apart in X, net of other predictors
• Every unit increase in X multiplies the odds of success by $e^{\beta}$, so an odds ratio can be > 1 (when β > 0) or < 1 (when β < 0)
• $P = \dfrac{e^{\alpha + \beta X}}{1 + e^{\alpha + \beta X}} = \dfrac{\text{odds}}{1 + \text{odds}}$
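The probability formula follows from the log-odds form of the model in three steps, written out here with the same symbols ($\alpha$, $\beta$, $X$, $P$):

```latex
\begin{align*}
\ln\frac{P}{1-P} &= \alpha + \beta X \\
\frac{P}{1-P} &= e^{\alpha + \beta X} \\
P &= \frac{e^{\alpha + \beta X}}{1 + e^{\alpha + \beta X}} = \frac{\text{odds}}{1 + \text{odds}}
\end{align*}
```

Replacing $X$ by $X + 1$ in the second line multiplies the odds by $e^{\beta}$, which is why $e^{\beta}$ is read as the odds ratio for a one-unit increase.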
Slope in logistic regression models (FYI)
• When β = 0, P does not change as X increases (X has no bearing on the probability or odds of success), so the curve is a flat, horizontal line
• For β > 0, P increases as X increases (probability of
success increases thus curve increases)
• For β < 0, P decreases as X increases (probability
of success decreases thus curve decreases)
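The three cases can be checked numerically with a small sketch. The function name `logistic` and the chosen values of α, β and X are illustrative assumptions, not estimates from the churn model.

```python
import math


def logistic(alpha, beta, x):
    """P = e^(alpha + beta*x) / (1 + e^(alpha + beta*x))."""
    z = alpha + beta * x
    return 1.0 / (1.0 + math.exp(-z))


xs = [0, 1, 2, 3]
flat = [logistic(0.0, 0.0, x) for x in xs]      # beta = 0: P stays at 0.5
rising = [logistic(0.0, 1.0, x) for x in xs]    # beta > 0: P increases with X
falling = [logistic(0.0, -1.0, x) for x in xs]  # beta < 0: P decreases with X
```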
Odds ratio and % change in odds by Interest Paid
• β = .00 to two decimals (0.00006 in the decision support table). Thus, the log odds of churn barely change with a one-unit change in Interest Paid
• Exp(β) = 1.00:
  $e^{\beta} = \dfrac{\text{Odds}(\text{Interest Paid} + 1)}{\text{Odds}(\text{Interest Paid})} = 1.00$
• % change in odds of churning for each additional unit of Interest Paid is
  $100(e^{\beta} - 1) = 100(1.00 - 1) = 0$
Hence, the odds are practically unchanged per unit increase in Interest Paid
Odds ratio and % change in odds by Balance Transferred
• β = .00 to two decimals (0.00003 in the decision support table). Thus, the log odds of churn barely change with a one-unit change in Balance Transferred
• Exp(β) = 1.00:
  $e^{\beta} = \dfrac{\text{Odds}(\text{Balance Transferred} + 1)}{\text{Odds}(\text{Balance Transferred})} = 1.00$
• % change in odds of churning for each additional unit of Balance Transferred is
  $100(e^{\beta} - 1) = 100(1.00 - 1) = 0$
Hence, the odds are practically unchanged per unit increase in Balance Transferred
Odds ratio and % change in odds by Age of Account Months
• β = -.051 (beta negative). Thus, the log odds of churn decrease as Age of Account (months) increases
• Exp(β) = .950:
  $e^{\beta} = \dfrac{\text{Odds}(\text{Age of account months} + 1)}{\text{Odds}(\text{Age of account months})} = 0.950$
• % change (in this case a reduction) in odds of churning for each additional month of account age is
  $100(e^{\beta} - 1) = 100(0.950 - 1) = -5\%$
Thus, a customer is less likely to churn as the account ages
Odds ratio and % change in odds by Age Group
• β = .038. Thus, the log odds of churn increase as Age Group increases
• Exp(β) = 1.039:
  $e^{\beta} = \dfrac{\text{Odds}(\text{Age group} + 1)}{\text{Odds}(\text{Age group})} = 1.039$
• % change in odds of churning for each additional unit of Age Group is
  $100(e^{\beta} - 1) = 100(1.039 - 1) = 3.9\%$
Thus, a customer is more likely to churn with each unit increase in Age Group
Odds ratio and % change in odds by Bill_Cycle
• β = .020. Thus, the log odds of churn increase as Bill Cycle increases
• Exp(β) = 1.021:
  $e^{\beta} = \dfrac{\text{Odds}(\text{Bill Cycle} + 1)}{\text{Odds}(\text{Bill Cycle})} = 1.021$
• % change in odds of churning for each additional unit of Bill Cycle is
  $100(e^{\beta} - 1) = 100(1.021 - 1) = 2.1\%$
Thus, a customer is more likely to churn with each unit increase in Bill Cycle
Odds ratio and % change in odds by Customer Value
• β = .00 to two decimals (-0.00003 in the decision support table). Thus, the log odds of churn barely change with a one-unit change in Customer Value
• Exp(β) = 1.00:
  $e^{\beta} = \dfrac{\text{Odds}(\text{Customer Value} + 1)}{\text{Odds}(\text{Customer Value})} = 1.00$
• % change in odds of churning for each additional unit of Customer Value is
  $100(e^{\beta} - 1) = 100(1.00 - 1) = 0$
Hence, the odds are practically unchanged per unit increase in Customer Value
Odds ratio and % change in odds by Credit Limit
• β = .00 (0.00000 to five decimals in the decision support table). Thus, the log odds of churn barely change with a one-unit change in Credit Limit
• Exp(β) = 1.00:
  $e^{\beta} = \dfrac{\text{Odds}(\text{Credit Limit} + 1)}{\text{Odds}(\text{Credit Limit})} = 1.00$
• % change in odds of churning for each additional unit of Credit Limit is
  $100(e^{\beta} - 1) = 100(1.00 - 1) = 0$
Hence, the odds are practically unchanged per unit increase in Credit Limit
Odds ratio and % change in odds by LTV_Group (Good)
• β = -1.004 (beta negative). Thus, the odds of churning are lower for the Good category of LTV_Group than for the reference category (Other)
• Exp(β) = .367: the odds of churning for the Good category are 0.367 times the odds for Other
  $e^{\beta} = \dfrac{\text{Odds of churn (Good)}}{\text{Odds of churn (Other)}} = 0.367$
• % change (in this case a decrease) in odds of churning when a person is in the Good category is
  $100(e^{\beta} - 1) = 100(0.367 - 1) = -63.3\%$
• Hence, a person is less likely to churn if he is in the Good category
Odds ratio and % change in odds by LTV_Group (Very_Bad)
• β = .070 (beta positive). Thus, the odds of churning are greater for the Very_Bad category of LTV_Group than for the reference category (Other)
• Exp(β) = 1.072: the odds of churning for the Very_Bad category are 1.072 times the odds for Other
  $e^{\beta} = \dfrac{\text{Odds of churn (Very Bad)}}{\text{Odds of churn (Other)}} = 1.072$
• % change in odds of churning when a person is in the Very_Bad category is
  $100(e^{\beta} - 1) = 100(1.072 - 1) = 7.2\%$
• Hence, a person is more likely to churn if he is in the Very_Bad category
Odds ratio and % change in odds by Marital Status (Married)
• β = .887 (beta positive). Thus, the odds of churning are higher for the Married category of Marital Status than for the reference category (Other)
• Exp(β) = 2.429: the odds of churning for the Married category are 2.429 times the odds for Other
  $e^{\beta} = \dfrac{\text{Odds of churn (Married)}}{\text{Odds of churn (Other)}} = 2.429$
• % change (in this case an increase) in odds of churning when a person is married is
  $100(e^{\beta} - 1) = 100(2.429 - 1) = 142.9\%$
• Hence, a person is more likely to churn if he is married
Odds ratio and % change in odds by Marital Status (Unmarried)
• β = 1.397 (beta positive). Thus, the odds of churning are greater for the Unmarried category of Marital Status than for the reference category (Other)
• Exp(β) = 4.042: the odds of churning for the Unmarried category are 4.042 times the odds for Other
  $e^{\beta} = \dfrac{\text{Odds of churn (Unmarried)}}{\text{Odds of churn (Other)}} = 4.042$
• % change (in this case an increase) in odds of churning when a person is unmarried is
  $100(e^{\beta} - 1) = 100(4.042 - 1) = 304.2\%$
• Hence, a person is more likely to churn if he is unmarried
• An unmarried person is more likely to churn than a married person (odds ratio 4.042 versus 2.429, both relative to Other)
Odds ratio and % change in odds by Customer Type (HMTLE) CT1
• β = .467 (beta positive). Thus, the odds of churning are higher for the HMTLE category of Customer Type than for the reference category (DEMOS and ORDRY)
• Exp(β) = 1.595: the odds of churning for the HMTLE category are 1.595 times the odds for DEMOS and ORDRY
  $e^{\beta} = \dfrac{\text{Odds of churn (HMTLE)}}{\text{Odds of churn (DEMOS and ORDRY)}} = 1.595$
• % change (in this case an increase) in odds of churning when a person is of HMTLE type is
  $100(e^{\beta} - 1) = 100(1.595 - 1) = 59.5\%$
• Hence, a person is more likely to churn if he is of HMTLE type
Odds ratio and % change in odds by Customer Type (LMCOA, SMCOA, EMCOA) CT2
• β = -.391 (beta negative). Thus, the odds of churning are lower for the CT2 category of Customer Type than for the reference category (DEMOS and ORDRY)
• Exp(β) = .676: the odds of churning for the CT2 category are 0.676 times the odds for DEMOS and ORDRY
  $e^{\beta} = \dfrac{\text{Odds of churn (CT2)}}{\text{Odds of churn (DEMOS and ORDRY)}} = 0.676$
• % change (in this case a decrease) in odds of churning when a person is from the CT2 category is
  $100(e^{\beta} - 1) = 100(0.676 - 1) = -32.4\%$
• Hence, a person is less likely to churn if he is from the CT2 category, i.e. LMCOA, SMCOA, EMCOA
Odds ratio and % change in odds by Customer Type (DROAM, RENTS) CT3
• β = -1.349 (beta negative). Thus, the odds of churning are lower for the CT3 category of Customer Type than for the reference category (DEMOS and ORDRY)
• Exp(β) = .260: the odds of churning for the CT3 category are 0.260 times the odds for DEMOS and ORDRY
  $e^{\beta} = \dfrac{\text{Odds of churn (CT3)}}{\text{Odds of churn (DEMOS and ORDRY)}} = 0.260$
• % change (in this case a decrease) in odds of churning when a person is from the CT3 category is
  $100(e^{\beta} - 1) = 100(0.260 - 1) = -74\%$
• Hence, a person is less likely to churn if he is from the CT3 category, i.e. DROAM, RENTS
Odds ratio and % change in odds by Occupational Group (Services, Construction, Finance, Trade, Transport, Government, Export) OCC1
• β = .406 (beta positive). Thus, the odds of churning are higher for the OCC1 category of Occupational Group than for the reference category (Medical)
• Exp(β) = 1.5: the odds of churning for the OCC1 category are 1.5 times the odds for Medical
  $e^{\beta} = \dfrac{\text{Odds of churn (OCC1)}}{\text{Odds of churn (Medical)}} = 1.5$
• % change (in this case an increase) in odds of churning when a person is from the OCC1 category is
  $100(e^{\beta} - 1) = 100(1.5 - 1) = 50\%$
• Hence, a person is more likely to churn if he is from OCC1
Odds ratio and % change in odds by Occupational Group (Others, Computers) OCC2
• β = .432 (beta positive). Thus, the odds of churning are higher for the OCC2 category of Occupational Group than for the reference category (Medical)
• Exp(β) = 1.541: the odds of churning for the OCC2 category are 1.541 times the odds for Medical
  $e^{\beta} = \dfrac{\text{Odds of churn (OCC2)}}{\text{Odds of churn (Medical)}} = 1.541$
• % change (in this case an increase) in odds of churning when a person is from the OCC2 category is
  $100(e^{\beta} - 1) = 100(1.541 - 1) = 54.1\%$
• Hence, a person is more likely to churn if he is from the OCC2 category (Others, Computers)
• A person in the OCC2 category is more likely to churn than a person in the OCC1 category (odds ratio 1.541 versus 1.5, both relative to Medical)
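The odds ratios and percentage changes reported on the preceding slides can be recomputed from the five-decimal coefficients listed in the decision support table of this deck. The sketch below is only a cross-check; the names `COEFFICIENTS`, `odds_ratio`, `percent_change` and `summary` are chosen here for illustration.

```python
import math

# Coefficients as listed in the decision support table of this deck.
COEFFICIENTS = {
    "LTV_good": -1.00354,
    "LTV_vbad": 0.06956,
    "Married_new": 0.88731,
    "UnMarried_new": 1.39683,
    "CT1": 0.46708,
    "CT2": -0.39086,
    "CT3": -1.34888,
    "OCC1": 0.40554,
    "OCC2": 0.43241,
    "Age_of_Account__Months": -0.05116,
    "Age_Group": 0.03840,
    "Bill_Cycle": 0.02048,
}


def odds_ratio(beta):
    """Exp(B): factor by which the odds change per one-unit increase."""
    return math.exp(beta)


def percent_change(beta):
    """Percentage change in odds per one-unit increase."""
    return 100.0 * (math.exp(beta) - 1.0)


# (odds ratio, % change in odds) for each variable, rounded for display.
summary = {
    name: (round(odds_ratio(b), 3), round(percent_change(b), 1))
    for name, b in COEFFICIENTS.items()
}
```

For a categorical indicator the "one-unit increase" is the move from the reference category to the named category.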
KS max

KS max = 46%
Significance of KS max
• The Kolmogorov-Smirnov test (KS test) checks whether two distributions differ significantly. Here it compares churned and non-churned customers; KS max is the largest gap between their cumulative percentages.
• KS max = 46% implies that by targeting 36% of the customers (70,000) we can capture 81% of all churned customers (10,000).
• Hence, the organization can reduce its cost drastically by targeting only 36% of customers to reach 81% of the total churners.
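A KS max figure of this kind can be computed by ranking customers by predicted churn probability and tracking the gap between the cumulative share of churners and of non-churners. The sketch below is a hypothetical illustration: the function name `ks_max`, the scores and the outcomes are made up, and tied scores are not handled specially.

```python
def ks_max(scores, churned):
    """Return (largest cumulative gap, share of customers targeted at that gap).

    Customers are scanned from the highest to the lowest score.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_pos = sum(churned)
    total_neg = len(churned) - total_pos
    cum_pos = cum_neg = 0
    best_gap, best_depth = 0.0, 0.0
    for rank, i in enumerate(order, start=1):
        if churned[i]:
            cum_pos += 1
        else:
            cum_neg += 1
        gap = cum_pos / total_pos - cum_neg / total_neg
        if gap > best_gap:
            best_gap, best_depth = gap, rank / len(scores)
    return best_gap, best_depth


# Toy example (hypothetical scores and outcomes): 3 churners among 10 customers.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
churned = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
gap, depth = ks_max(scores, churned)
```

In this toy data the largest gap is reached after targeting the top 40% of customers, by which point all churners have been captured.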
Decision Support System
Variable                  Coefficient   Input value   Coefficient × value
Constant                  -3.24338                    -3.2434
LTV_good                  -1.00354      0              0.0000
LTV_vbad                   0.06956      1              0.0696
Married_new                0.88731      0              0.0000
UnMarried_new              1.39683      1              1.3968
CT1                        0.46708      1              0.4671
CT2                       -0.39086      0              0.0000
CT3                       -1.34888      0              0.0000
OCC1                       0.40554      0              0.0000
OCC2                       0.43241      1              0.4324
Interest_Paid              0.00006      1090           0.0654
Balance_Transferred        0.00003      4356           0.1168
Age_of_Account__Months    -0.05116      63            -3.2232
Age_Group                  0.03840      79             3.0332
Bill_Cycle                 0.02048      17.14          0.3510
Customer_Value            -0.00003      2700          -0.0699
Credit_Limit               0.00000      10764          0.0338
Sum                                                   -0.5704

Probability                                            0.3612
Result                                                 will not churn

Note: the continuous variables (Interest_Paid through Credit_Limit) are set to their mean values here, but one can enter any value.

We have created a decision support system in Excel, where one can change the input values for a particular customer to predict whether that customer will churn or not.
Refer to the attached Excel file.
Decision Support System
Variable                  Coefficient   Input value   Coefficient × value
Constant                  -3.24338                    -3.2434
LTV_good                  -1.00354      0              0.0000
LTV_vbad                   0.06956      1              0.0696
Married_new                0.88731      0              0.0000
UnMarried_new              1.39683      1              1.3968
CT1                        0.46708      1              0.4671
CT2                       -0.39086      0              0.0000
CT3                       -1.34888      0              0.0000
OCC1                       0.40554      0              0.0000
OCC2                       0.43241      1              0.4324
Interest_Paid              0.00006      1090           0.0654
Balance_Transferred        0.00003      4356           0.1168
Age_of_Account__Months    -0.05116      63            -3.2232
Age_Group                  0.03840      85             3.2636
Bill_Cycle                 0.02048      45             0.9215
Customer_Value            -0.00003      2700          -0.0699
Credit_Limit               0.00000      10764          0.0338
Sum                                                    0.2305

Probability                                            0.5574
Result                                                 will churn

Note: one can enter any value for the continuous variables.
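The spreadsheet logic can be mirrored in a few lines of Python. This is a sketch under stated assumptions rather than the attached Excel file: the names `SCENARIO_1`, `SCENARIO_2`, `churn_probability` and `decision` are hypothetical, and the inputs are the "coefficient × value" contributions copied from the two tables, because the displayed coefficients are rounded to five decimals and would not reproduce the sums exactly.

```python
import math

# Contribution column (coefficient x input value) from the first scenario.
SCENARIO_1 = {
    "Constant": -3.2434,
    "LTV_vbad": 0.0696,
    "UnMarried_new": 1.3968,
    "CT1": 0.4671,
    "OCC2": 0.4324,
    "Interest_Paid": 0.0654,
    "Balance_Transferred": 0.1168,
    "Age_of_Account__Months": -3.2232,
    "Age_Group": 3.0332,
    "Bill_Cycle": 0.3510,
    "Customer_Value": -0.0699,
    "Credit_Limit": 0.0338,
}

# Second scenario: only the Age_Group and Bill_Cycle contributions differ.
SCENARIO_2 = dict(SCENARIO_1, Age_Group=3.2636, Bill_Cycle=0.9215)


def churn_probability(contributions):
    """Sum the contributions into a logit and convert it to a probability."""
    logit = sum(contributions.values())
    return 1.0 / (1.0 + math.exp(-logit))


def decision(contributions, cutoff=0.5):
    """Apply the 0.5 cut-off used in the classification table."""
    if churn_probability(contributions) > cutoff:
        return "will churn"
    return "will not churn"


result_1 = decision(SCENARIO_1)
result_2 = decision(SCENARIO_2)
```

Indicator variables set to 0 contribute nothing to the sum, so they are left out of the dictionaries.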
Thank You