GROUP MEMBERS
Nida Karim k18-0867
Muhammad Zain ul Haq k18-0887
Motivation
Employee Survey Data contained responses filled in by employees about their jobs:
Environment Satisfaction, Job Satisfaction, Work Life Balance
Manager Survey Data contained ratings filled in by managers about their employees:
Job Involvement, Performance Rating
Apart from a few columns in General Employee Data, all the columns were continuous in nature.
Methodology
The following methodology was used to perform this analysis:
Class Imbalance
Data Cleaning
Because the data was continuous in nature, the only fault found in it was null values.
Nulls were replaced with 0s
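A minimal sketch of the null replacement described above, assuming the data is held in a pandas DataFrame. The rows below are made up for illustration; only the column names come from the survey fields listed earlier.

```python
import numpy as np
import pandas as pd

# Hypothetical survey rows: column names follow the slides, values are invented.
survey = pd.DataFrame({
    "EnvironmentSatisfaction": [3.0, np.nan, 4.0],
    "JobSatisfaction": [4.0, 2.0, np.nan],
    "WorkLifeBalance": [np.nan, 3.0, 1.0],
})

# Count the nulls in each column before cleaning.
nulls_before = survey.isna().sum()

# Replace every null with 0, as described in the Data Cleaning step.
cleaned = survey.fillna(0)
```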
Data Manipulation
Exploratory Data Analysis
Performed basic analysis on the data
Checked data types, plotted simple graphs, plotted the correlation matrix
Joining Data
Joined General Employee Data with Manager and Employee Survey data
Formed a denormalized table
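The join could look roughly like the sketch below. The key column name "EmployeeID", the left-join strategy, and all values are assumptions made for illustration; the slides do not state how the tables were matched.

```python
import pandas as pd

# Hypothetical miniature tables; "EmployeeID" is an assumed join key.
general = pd.DataFrame({
    "EmployeeID": [1, 2, 3],
    "Age": [51, 31, 32],
    "Attrition": ["No", "Yes", "No"],
})
employee_survey = pd.DataFrame({
    "EmployeeID": [1, 2, 3],
    "JobSatisfaction": [4, 2, 2],
})
manager_survey = pd.DataFrame({
    "EmployeeID": [1, 2, 3],
    "JobInvolvement": [3, 2, 3],
    "PerformanceRating": [3, 4, 3],
})

# Left-join both survey tables onto the general employee data,
# producing one wide (denormalized) table with one row per employee.
denormalized = (
    general
    .merge(employee_survey, on="EmployeeID", how="left")
    .merge(manager_survey, on="EmployeeID", how="left")
)
```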
Exploratory Data Analysis
Correlation Matrix
After dropping unnecessary columns we obtained the following correlation matrix
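As a rough sketch of this step, the snippet below drops a column and computes pairwise Pearson correlations with pandas. The data is invented, and "EmployeeCount" is a hypothetical stand-in for an unnecessary column; the slides do not list which columns were actually dropped.

```python
import pandas as pd

# Hypothetical data; "EmployeeCount" stands in for a constant, uninformative column.
df = pd.DataFrame({
    "EmployeeCount": [1, 1, 1, 1],
    "Age": [25, 35, 45, 55],
    "MonthlyIncome": [2000, 3000, 4000, 5000],
    "YearsAtCompany": [1, 8, 3, 10],
})

# Drop the column that carries no information, then compute
# the pairwise Pearson correlation between the remaining columns.
reduced = df.drop(columns=["EmployeeCount"])
corr = reduced.corr()
```

The resulting `corr` DataFrame is square, with one row and one column per remaining attribute, and can be passed to a heatmap plotting function.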
Encoding Schemes
Two smaller dataframes were made for encoding purposes
1. Categorical Attributes Dataframe
2. Continuous Attributes Dataframe
Categorical Attributes were encoded using pd.get_dummies()
One Hot Encoding
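The split-and-encode step might be sketched as follows. Which columns count as categorical is an assumption here ("Department" and "Gender" are illustrative), as is the final concatenation of the two dataframes.

```python
import pandas as pd

# Hypothetical rows; the choice of categorical columns is an assumption.
df = pd.DataFrame({
    "Department": ["Sales", "Research", "Sales"],
    "Gender": ["Female", "Male", "Male"],
    "Age": [41, 29, 35],
})

# Split into the two smaller dataframes.
categorical = df[["Department", "Gender"]]
continuous = df[["Age"]]

# One-hot encode: each category value becomes its own indicator column.
encoded = pd.get_dummies(categorical)

# Recombine the continuous attributes with the encoded categorical ones.
features = pd.concat([continuous, encoded], axis=1)
```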
Our label, “Attrition”, contained two classes:
Yes
No
This column was label encoded, converting Yes to 1 and No to 0.
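One simple way to get this mapping is an explicit dictionary lookup, sketched below on made-up values. Whether the authors used a mapping like this or a library encoder is not stated in the slides.

```python
import pandas as pd

# Hypothetical label column with the two classes named above.
attrition = pd.Series(["Yes", "No", "No", "Yes"], name="Attrition")

# Map each class to its integer code: Yes -> 1, No -> 0.
attrition_encoded = attrition.map({"Yes": 1, "No": 0})
```

An explicit mapping keeps the direction of the encoding visible in the code, which helps when reading model outputs later.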