
Business Research Methods

Discriminant Analysis

Groups 3 & 6
26th August 2015

Agenda

What is Discriminant Analysis?


Calculation of Cut-off Discriminant Score
Assumptions of Discriminant Analysis
Why Discriminant Analysis?
Limitations of Discriminant Analysis
Graphical Explanation
Example
Comparison with Other Methods
Hands-on: Model using SPSS

26/8/2015

What is Discriminant Analysis?

Discriminant Analysis is a multivariate technique that predicts a categorical dependent variable based on a linear combination of independent variables

The major goal of discriminant analysis is to classify observations into groups

The linear combination of independent variables is known as a discriminant function

$Z_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_n X_{ni}$

where
$Z_i$ = i-th applicant's discriminant score
$b_0$ = constant
$b_n$ = discriminant coefficient for the n-th variable
$X_{ni}$ = i-th applicant's value on the n-th independent variable
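
As a minimal sketch of the definitions above, the discriminant score is a constant plus a weighted sum of the applicant's values on the independent variables. The function name `discriminant_score` and all numbers below are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def discriminant_score(constant, coefficients, values):
    """Return Z = b0 + b1*X1 + ... + bn*Xn for one applicant."""
    if len(coefficients) != len(values):
        raise ValueError("need one coefficient per independent variable")
    return constant + sum(b * x for b, x in zip(coefficients, values))


# Hypothetical constant, coefficients and applicant values (illustration only).
b0 = -2.0
b = [0.05, 0.03, 0.01]
applicant = [30, 40, 50]

z = discriminant_score(b0, b, applicant)
print(f"discriminant score: {z:.2f}")
```

The score on its own has no meaning; it becomes useful once it is compared with a cut-off score that separates the groups.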

Calculation of Cut-off Discriminant Score

Group          # Observations   Probability   Mean of Z
High Income    45               0.45          1.5
Low Income     55               0.55          0.4
Total          100

Cut-off Score = (0.45 x 1.5) + (0.55 x 0.4)
              = 0.895
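
The arithmetic above can be checked with a short sketch. It follows the slide's formula, weighting each group's mean Z by its share of observations. The `classify` helper is an assumption added for illustration: it sends scores above the cut-off to the group with the larger mean Z.

```python
def cutoff_score(groups):
    """Weight each group's mean Z by its share of the observations."""
    total = sum(n for n, _ in groups.values())
    return sum((n / total) * mean_z for n, mean_z in groups.values())


# (number of observations, mean of Z) per group, taken from the table.
groups = {"High Income": (45, 1.5), "Low Income": (55, 0.4)}
cutoff = cutoff_score(groups)  # 0.45 * 1.5 + 0.55 * 0.4


def classify(z, cutoff):
    # Assumption: High Income has the larger mean Z, so scores above
    # the cut-off are assigned to it and all others to Low Income.
    return "High Income" if z > cutoff else "Low Income"


print(f"cut-off score: {cutoff:.3f}")
print(classify(1.2, cutoff), "/", classify(0.5, cutoff))
```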


Assumptions of Discriminant Analysis

The observations are a random sample
Each predictor variable is normally distributed
There must be at least two groups or categories, with each observation belonging to only one group, so that the groups are mutually exclusive and collectively exhaustive
The initial classification assigns every observation to its correct dependent category
The attribute(s) used to separate the groups should discriminate clearly between them, so that group or category overlap is non-existent or minimal


Why Discriminant Analysis?

To investigate differences between groups on the basis of the attributes of the cases, indicating which attributes contribute most to group separation

Addresses the question of how to assign new observations to groups

The most parsimonious way to distinguish between groups


Limitations of Discriminant Analysis

Extremely sensitive to the presence of outliers

If the independent variables in the discriminant function are highly correlated, the standardized discriminant function coefficients will not reliably assess the relative importance of the independent variables


Points to Note

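Weights are assigned to the variables to maximize the ratio of the difference between the means of the two groups to the standard deviation within groups. So, for a dependent variable with two groups, the weights are chosen to maximize

$\dfrac{|\bar{Z}_1 - \bar{Z}_2|}{s_Z}$

where $\bar{Z}_1$ and $\bar{Z}_2$ are the mean discriminant scores of the two groups and $s_Z$ is the within-group standard deviation of the scores

Number of discriminant functions: $\min(g - 1, p)$, where $g$ is the number of groups and $p$ is the number of independent variables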
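
One standard way to obtain weights that maximize this ratio for two groups is Fisher's linear discriminant, where the weights are the inverse of the pooled within-group covariance matrix applied to the difference in group means. The sketch below assumes exactly two predictors so that the 2x2 inverse can be written out by hand. The data and function names are hypothetical, and the slides do not state which estimation procedure the authors used.

```python
def mean_vector(rows):
    """Column means of a list of (x1, x2) observations."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, mean):
    """2x2 within-group sums of squares and cross-products."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s


def fisher_weights(group_a, group_b):
    """Weights w = S_w^{-1} (mean_a - mean_b) for two predictors."""
    m_a, m_b = mean_vector(group_a), mean_vector(group_b)
    s_a, s_b = scatter(group_a, m_a), scatter(group_b, m_b)
    dof = len(group_a) + len(group_b) - 2
    # Pooled within-group covariance matrix.
    sw = [[(s_a[i][j] + s_b[i][j]) / dof for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("pooled covariance matrix is singular")
    diff = [m_a[0] - m_b[0], m_a[1] - m_b[1]]
    # Explicit 2x2 inverse applied to the mean difference.
    w1 = (sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det
    w2 = (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det
    return [w1, w2]


# Hypothetical observations on two predictors (illustration only).
group_a = [(6, 5), (7, 7), (8, 6), (9, 8)]
group_b = [(2, 3), (3, 5), (4, 4), (5, 6)]

weights = fisher_weights(group_a, group_b)  # approximately [4.0, -2.0]
print("weights:", [round(w, 3) for w in weights])
```

The weights are only determined up to a constant multiple, since rescaling them rescales the mean difference and the within-group standard deviation by the same factor.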


Graphical Explanation of Discriminant Analysis

Discriminant Distributions

Two distributions overlap too much: there will be many misclassification cases

Very little overlap between the two distributions: misclassification cases will be minimal

Graphical Explanation of Discriminant Analysis

Scatter graph displaying distributions by axis

New axis creating greater discrimination


Example

Suppose a personnel manager for an electrical wholesaler has been keeping records on successful versus unsuccessful sales employees
The personnel manager believes it is possible to predict whether an applicant will succeed on the basis of age, sales aptitude test scores, and mechanical ability scores
The problem is to find a linear function of the independent variables that shows large differences in group means
Therefore, the first task is to estimate the coefficients of the discriminant function


Example (contd.)

The personnel manager estimates standardized weights for the equation

$Z = b_1 X_1 + b_2 X_2 + b_3 X_3$

where
$X_1$ = Age
$X_2$ = Sales aptitude test score
$X_3$ = Mechanical ability score

Observations from the equation:
Age has the largest weight, hence it is more important than the other two variables

Validation of the Discriminant Function

In the example, current employees with known characteristics are used in constructing the model
Each observation (current employee) is placed into one of the groups based on the independent variables
The results can be summarized in a confusion matrix, as below

                  Predicted Group
Actual Group      Successful   Unsuccessful   Total
Successful        34           6              40
Unsuccessful      7            38             45

The matrix shows that about 85% of employees (72 out of 85) are correctly classified
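
The hit ratio quoted above can be reproduced with a few lines. This sketch assumes that 34 and 38 are the correctly classified counts and that 40 and 45 are the actual-group totals, which is the reading consistent with "72 out of 85". The variable and function names are hypothetical.

```python
# Correctly classified employees per actual group (diagonal of the matrix).
correct = {"Successful": 34, "Unsuccessful": 38}
# Assumption: 40 and 45 are the number of employees in each actual group.
row_totals = {"Successful": 40, "Unsuccessful": 45}


def hit_ratio(correct, row_totals):
    """Share of all observations that were assigned to their actual group."""
    return sum(correct.values()) / sum(row_totals.values())


overall = hit_ratio(correct, row_totals)  # 72 / 85
per_group = {g: correct[g] / row_totals[g] for g in correct}

print(f"overall hit ratio: {overall:.1%}")
for group, ratio in per_group.items():
    print(f"{group}: {ratio:.1%}")
```

Because the same employees are used both to build and to check the model, a hit ratio computed this way tends to be optimistic compared with one measured on a holdout sample.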

Comparison

Discriminant Analysis vs Logistic Regression

The dependent variable is categorical in both cases
In binary logistic regression, the dependent variable has only two groups
In discriminant analysis, the dependent variable can have more than two groups

Discriminant Analysis vs Cluster Analysis

In Discriminant Analysis, groups are known a priori; i.e., all the observations are assumed to be correctly classified at the outset. The objective of the analysis is to predict that classification from the predictor variables
Cluster Analysis is used when the natural clusters are not known. The objective is to discover whether there are any natural groups
In cluster analysis, one begins with undifferentiated observations and tries to form groups and subgroups

What Next?
