
EMPLOYEE ATTRITION ANALYSIS

GROUP MEMBERS
Nida Karim k18-0867
Muhammad Zain ul Haq k18-0887
Motivation

 The workforce is a critical element of an organization’s success.


 Happy and motivated employees contribute positively towards the company’s goal.
 A high number of resignations and retirements costs an organization heavily.
 Organizations may face:
 Project delays
 Recruitment costs
 Training costs for new employees

 Analyzing employee data can help to:


 Identify factors leading to attrition
 Redesign policies that might cause employee attrition
About The Data
 The data comprised three CSV files:
 General Employee Data
 Employee Survey Data
 Manager Survey Data

 General Employee Data contained general employee details:


 Age, Education, Experience, Marital Status, Department, Income, Last Promotion Year, Job
Level, Job Role etc.

 Employee Survey Data contained responses filled in by employees about their jobs:
 Environment Satisfaction, Job Satisfaction, Work Life Balance

 Manager Survey Data contained responses filled in by managers about their employees:
 Job Involvement, Performance Rating

 Apart from a few columns in General Employee Data, all the columns were continuous
in nature.
Methodology
The following methodology was followed to perform this analysis:
Data Cleaning → Data Manipulation → Exploratory Data Analysis → Encoding Schemes → Dimensionality Reduction → Class Imbalance → Supervised Learning Algorithms
Data Cleaning
 Due to the continuous nature of the data, the only fault found was null values.
 Nulls were replaced with 0s
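A minimal sketch of this cleaning step, assuming pandas and illustrative file names (general_data.csv etc. are assumptions, not the presenters' actual paths):

```python
import pandas as pd

# Load the three CSV files (file names are illustrative assumptions)
general_df = pd.read_csv("general_data.csv")
employee_survey_df = pd.read_csv("employee_survey_data.csv")
manager_survey_df = pd.read_csv("manager_survey_data.csv")

# Replace null values with 0s, as described above
for df in (general_df, employee_survey_df, manager_survey_df):
    df.fillna(0, inplace=True)
```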
Data Manipulation
 Exploratory Data Analysis
 Performed basic analysis on the data
 Checked data types, plotted simple graphs, and plotted a correlation matrix

 Joining Data
 Joined General Employee Data with Manager and Employee Survey data
 Formed a denormalized table
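A sketch of the join, assuming all three tables share an EmployeeID key (the key name is an assumption based on the dataset's conventions):

```python
# Join the three tables into one denormalized table.
# "EmployeeID" as the join key is an assumption.
merged_df = (
    general_df
    .merge(employee_survey_df, on="EmployeeID", how="left")
    .merge(manager_survey_df, on="EmployeeID", how="left")
)
```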
Exploratory Data Analysis
Correlation Matrix
After dropping unnecessary columns, we obtained the following correlation matrix:
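A sketch of how such a heatmap can be produced, assuming seaborn and matplotlib are available (the slides do not state which plotting library was used):

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Heatmap of pairwise correlations over the numeric columns
corr = merged_df.select_dtypes("number").corr()
sns.heatmap(corr, cmap="coolwarm", annot=False)
plt.title("Correlation Matrix")
plt.tight_layout()
plt.show()
```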
Encoding Schemes
 Two smaller dataframes were made for encoding purposes
1. Categorical Attributes Dataframe
2. Continuous Attributes Dataframe
 Categorical attributes were encoded using pd.get_dummies()
 One-hot encoding
Encoding Schemes
 Our Label “Attrition” contained two classes
 Yes

 No

 This column was label encoded, converting Yes to 1 and No to 0

 Making it a binary classification problem

 The two smaller dataframes were then concatenated
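A sketch of the encoding steps described above; splitting the frames by dtype is an assumption about how the two smaller dataframes were built:

```python
import pandas as pd

# Split the merged table into categorical and continuous attribute frames
categorical_df = merged_df.select_dtypes(include="object").drop(
    columns=["Attrition"], errors="ignore"
)
continuous_df = merged_df.select_dtypes(exclude="object")

# One-hot encode the categorical attributes
categorical_encoded = pd.get_dummies(categorical_df)

# Label encode the target: Yes -> 1, No -> 0
y = merged_df["Attrition"].map({"Yes": 1, "No": 0})

# Concatenate the two smaller frames into one feature matrix
X = pd.concat([continuous_df, categorical_encoded], axis=1)
```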


Dimensionality Reduction
 A filter-based feature selection technique was used
 The p-value was used as the statistical criterion
 Features were selected using the following procedure:
1. Define the attributes and the label.
2. Perform logistic regression on the entire data.
3. Evaluate the results summary.
4. Remove features with high p-values.
5. Repeat until the p-values are on a similar scale.
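A minimal sketch of this backward-elimination loop using statsmodels; the 0.05 cutoff and the function name are assumptions, since the slides only say the loop runs until p-values are on a similar scale:

```python
import statsmodels.api as sm

# Backward elimination by p-value: refit, drop the worst feature, repeat.
def filter_by_pvalue(X, y, threshold=0.05):  # threshold is an assumption
    features = list(X.columns)
    while True:
        model = sm.Logit(y, sm.add_constant(X[features].astype(float))).fit(disp=0)
        pvalues = model.pvalues.drop("const")
        worst = pvalues.idxmax()
        if pvalues[worst] <= threshold:
            return features
        features.remove(worst)  # drop the feature with the highest p-value

selected_features = filter_by_pvalue(X, y)
```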
Dimensionality Reduction
After Dimensionality Reduction
Class Imbalance
 We identified that our dataset contained class imbalance when we applied our first algorithm, i.e. Logistic Regression.

 The spread of our label “Attrition” was as follows:
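A quick check of the label distribution, assuming the encoded label y from the earlier encoding step:

```python
# Class distribution of the encoded label (1 = Yes, 0 = No)
print(y.value_counts())
print(y.value_counts(normalize=True).round(3))
```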


Class Imbalance

 Applied K-means SMOTE to counter the class imbalance


 K-means SMOTE works in three steps:
 Cluster the entire input space using k-means.
 Distribute the number of samples to generate across clusters:
 Filter out clusters which have a high number of majority class samples.
 Assign more synthetic samples to clusters where minority class samples are sparsely
distributed.
 Oversample each filtered cluster using SMOTE
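A sketch using the imbalanced-learn implementation of the three steps above; the parameter values are illustrative assumptions, not the presenters' settings:

```python
import pandas as pd
from imblearn.over_sampling import KMeansSMOTE

# Oversample the minority class with K-means SMOTE
sampler = KMeansSMOTE(random_state=42)
X_resampled, y_resampled = sampler.fit_resample(X[selected_features], y)

# The classes should now be balanced
print(pd.Series(y_resampled).value_counts())
```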
 Result: a balanced spread of the “Attrition” classes
Supervised Learning Algorithms
 Finally, we applied the following classification algorithms:
1. Logistic Regression
2. SVM Linear Kernel
3. SVM Polynomial Kernel
4. KNN with N=3
5. KNN with N=5
6. KNN with N=7
 We avoided tree-based classifiers due to the continuous nature of the data
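A sketch of the model comparison with scikit-learn; the 80/20 split ratio and any hyperparameters beyond the stated kernels and N values are assumptions:

```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Train/test split of the resampled data (the 80/20 ratio is an assumption)
X_train, X_test, y_train, y_test = train_test_split(
    X_resampled, y_resampled, test_size=0.2, random_state=42
)

# The six classifiers listed above; settings not stated on the slides
# (e.g. max_iter, polynomial degree) are left at illustrative defaults.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM Linear Kernel": SVC(kernel="linear"),
    "SVM Polynomial Kernel": SVC(kernel="poly"),
    "KNN (N=3)": KNeighborsClassifier(n_neighbors=3),
    "KNN (N=5)": KNeighborsClassifier(n_neighbors=5),
    "KNN (N=7)": KNeighborsClassifier(n_neighbors=7),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: accuracy = {accuracy:.3f}")
```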
Results
 The following results were obtained after applying the machine learning algorithms:
Future Work
 In the future, other techniques for dimensionality reduction can be used, such as
 PCA, SVD, or wrapper-based feature selection
 A neural network based approach can also be applied to predict Attrition
 A KPI-based dashboard can be provided to higher management for basic analysis.
