INTERDEPENDENCE
Learning Objectives
After reading this chapter, students should be able to:
Know the interdependence multivariate techniques that can be used in
data analysis
Conduct an exploratory factor analysis to reduce a set of items to factors
Interpret results from an exploratory factor analysis
Understand how cluster analysis can be used to cluster objects and
individuals
Understand how multidimensional scaling can be used in research
Introduction
As discussed in Chapter 13, when the research does not distinguish
between independent and dependent variables, the interdependence
techniques can be used. The techniques in this group are factor
analysis, cluster analysis and multidimensional scaling.
Factor Analysis
Factor analysis is typically known as a data reduction technique. This is a
technique that tries to statistically identify a reduced number of factors from
a larger number of items which are typically called the measured variables.
The factors identified are called latent variables as they are not measured
directly. To do this analysis the researcher does not have to distinguish
between dependent and independent variables.
Method
The principal focus of a factor analysis is to reduce a large number of
variables to a manageable number of factors in order to simplify the
subsequent analysis. The technique relies on the correlations and
intercorrelations among the items.
There are several extraction approaches that can be used to reduce the
number of items, such as unweighted least squares, generalized least
squares, maximum likelihood, principal axis factoring, alpha factoring,
image factoring and principal component analysis. Principal component
analysis is the most popular of these.
There are several questions that the researcher needs to answer before the
analysis is started, as follows:
A factor analysis can be run to see how many factors can be derived before
proceeding further to test the relationships in a model. The process of
setting up the analysis in SPSS is illustrated next.
Once the analysis is done we will get a long list of output, which is
interpreted next.
Interpretation
The first table that we will be interested in is the table called “KMO and
Bartlett’s Test”. The Kaiser-Meyer-Olkin (KMO) measure indicates whether
the correlations among the items are sufficient for a factor analysis. Its
value ranges from 0 to 1, with a value of 0.5 and above deemed acceptable.
Bartlett’s test examines the hypothesis that there is a lack of sufficient
correlation for a factor analysis to be carried out. The KMO value of 0.847
and the Bartlett’s test, which is significant (p < 0.01), indicate that the
data are suitable for a factor analysis.
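The two statistics in this table can be sketched in a few lines. The block below is a minimal illustration, assuming NumPy is available and that we already hold the item correlation matrix; the 3-item matrix and the sample size of 100 are invented for the example and are not the chapter's data. It computes the overall KMO from the correlations and partial correlations, and the chi-square statistic and degrees of freedom of Bartlett's test. The p-value is left out because it needs a chi-square distribution function.

```python
import numpy as np

def kmo_overall(corr):
    """Overall KMO: squared correlations relative to squared
    correlations plus squared partial correlations (off-diagonal only)."""
    corr = np.asarray(corr, dtype=float)
    inv = np.linalg.inv(corr)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)          # partial correlations
    off = ~np.eye(corr.shape[0], dtype=bool)  # mask for off-diagonal cells
    r2 = (corr[off] ** 2).sum()
    p2 = (partial[off] ** 2).sum()
    return r2 / (r2 + p2)

def bartlett_statistic(corr, n_obs):
    """Chi-square statistic and degrees of freedom for Bartlett's test."""
    corr = np.asarray(corr, dtype=float)
    p = corr.shape[0]
    _, logdet = np.linalg.slogdet(corr)       # log of the determinant
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return chi2, df

# Hypothetical correlation matrix for three items (not the chapter's data).
R = [[1.0, 0.5, 0.5],
     [0.5, 1.0, 0.5],
     [0.5, 0.5, 1.0]]
kmo = kmo_overall(R)
chi2, df = bartlett_statistic(R, n_obs=100)   # n_obs is an assumed sample size
```

A KMO of 0.5 or above, together with a chi-square value that is significant at the reported degrees of freedom, corresponds to the reading of the table described above.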
Once this is cleared, the next issue is to look at the individual measures
of sampling adequacy (MSA). These are given in the table labeled
“Anti-image Matrices”. In the anti-image correlation part of the table, the
diagonal values carry a superscript a. These values should be greater than
0.5. If any value is less than 0.5, we should consider deleting those items
one at a time, starting with the item with the lowest value. Only once that
is done do we move on to the next table, which is the communalities.
Anti-image Matrices
Anti-image Correlation
        Reward1 Reward2 Reward3  Recip1  Recip2  Recip3  Recip4  Recip5     Sw1     Sw2     Sw3     Sw4     Sw5
Reward1    .806a  -.482   -.451   -.023   -.148    .152    .017    .071   -.036   -.024    .089    .026   -.100
Reward2   -.482    .749a  -.546   -.187    .021    .011    .097   -.049    .152   -.007   -.212    .219   -.085
Reward3   -.451   -.546    .741a   .230    .110   -.196   -.105    .000   -.100    .033    .124   -.273    .171
Recip1    -.023   -.187    .230    .831a  -.195   -.277   -.136   -.031    .172   -.097    .082   -.158   -.152
Recip2    -.148    .021    .110   -.195    .881a  -.155   -.083   -.248   -.163   -.077    .036   -.037    .164
Recip3     .152    .011   -.196   -.277   -.155    .870a  -.144   -.179    .008   -.111    .068   -.044    .068
Recip4     .017    .097   -.105   -.136   -.083   -.144    .893a  -.356   -.022   -.137   -.089    .135   -.058
Recip5     .071   -.049    .000   -.031   -.248   -.179   -.356    .843a   .012    .128   -.109    .017   -.021
Sw1       -.036    .152   -.100    .172   -.163    .008   -.022    .012    .858a  -.434    .050   -.235   -.469
Sw2       -.024   -.007    .033   -.097   -.077   -.111   -.137    .128   -.434    .891a  -.390   -.046    .125
Sw3        .089   -.212    .124    .082    .036    .068   -.089   -.109    .050   -.390    .867a  -.446   -.293
Sw4        .026    .219   -.273   -.158   -.037   -.044    .135    .017   -.235   -.046   -.446    .884a  -.162
Sw5       -.100   -.085    .171   -.152    .164    .068   -.058   -.021   -.469    .125   -.293   -.162    .873a
a. Measure of sampling adequacy (MSA)
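The rule of deleting items with an MSA below 0.5 one at a time can be sketched as a small loop. This is only an illustration under stated assumptions: it uses NumPy, the function names item_msa and drop_low_msa are hypothetical, and the 3-item correlation matrix is invented rather than taken from the chapter's data. After each deletion the MSA values are recomputed, because removing one item changes the partial correlations of the others.

```python
import numpy as np

def item_msa(corr):
    """Per-item MSA, i.e. the diagonal of the anti-image correlation table."""
    corr = np.asarray(corr, dtype=float)
    inv = np.linalg.inv(corr)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)   # partial correlations
    r2 = corr ** 2
    p2 = partial ** 2
    np.fill_diagonal(r2, 0.0)         # keep only off-diagonal terms
    np.fill_diagonal(p2, 0.0)
    return r2.sum(axis=1) / (r2.sum(axis=1) + p2.sum(axis=1))

def drop_low_msa(corr, names, cutoff=0.5):
    """Remove the lowest-MSA item, one at a time, until all reach the cutoff."""
    corr = np.asarray(corr, dtype=float)
    names = list(names)
    while len(names) > 2:
        msa = item_msa(corr)
        worst = int(np.argmin(msa))
        if msa[worst] >= cutoff:
            break
        keep = [i for i in range(len(names)) if i != worst]
        corr = corr[np.ix_(keep, keep)]   # drop the item's row and column
        del names[worst]
    return names

# Hypothetical correlation matrix for three items (not the chapter's data).
R = [[1.0, 0.5, 0.5],
     [0.5, 1.0, 0.5],
     [0.5, 0.5, 1.0]]
msa_values = item_msa(R)
retained = drop_low_msa(R, ["Item1", "Item2", "Item3"])
```

In the table above all thirteen diagonal values exceed 0.5, so a loop of this kind would retain every item.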
Communalities
Initial Extraction
Reward1 1.000 .979
Reward2 1.000 .981
Reward3 1.000 .982
Recip1 1.000 .521
Recip2 1.000 .577
Recip3 1.000 .629
Recip4 1.000 .625
Recip5 1.000 .620
Sw1 1.000 .865
Sw2 1.000 .826
Sw3 1.000 .865
Sw4 1.000 .851
Sw5 1.000 .826
Extraction Method: Principal Component Analysis.
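The relation between the Initial and Extraction columns can be illustrated with a short sketch. It assumes NumPy and a hypothetical 3-item correlation matrix (not the chapter's data). With principal component analysis, an item's communality is the sum of its squared loadings on the retained components, so it equals 1 when every component is kept and falls below 1 once only some components are retained.

```python
import numpy as np

def pca_loadings(corr, n_components):
    """Unrotated component loadings: eigenvectors scaled by sqrt(eigenvalue)."""
    eigvals, eigvecs = np.linalg.eigh(np.asarray(corr, dtype=float))
    order = np.argsort(eigvals)[::-1][:n_components]   # largest first
    return eigvecs[:, order] * np.sqrt(eigvals[order])

def communalities(loadings):
    """Share of each item's variance reproduced by the retained components."""
    return (loadings ** 2).sum(axis=1)

# Hypothetical correlation matrix for three items (not the chapter's data).
R = [[1.0, 0.5, 0.5],
     [0.5, 1.0, 0.5],
     [0.5, 0.5, 1.0]]
initial = communalities(pca_loadings(R, n_components=3))    # all components kept
extracted = communalities(pca_loadings(R, n_components=1))  # one component kept
```

Here initial is 1 for every item, matching the 1.000 entries in the Initial column, while extracted shows how much of each item's variance survives the reduction.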
Next we will look at the table labeled “Total Variance Explained” to assess
how much of the variance has been explained by the extracted factors and
how many factors have been extracted.
Total Variance Explained
            Initial Eigenvalues                   Extraction Sums of Squared Loadings   Rotation Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %  Total   % of Variance   Cumulative %  Total   % of Variance   Cumulative %
1           5.718          43.985         43.985  5.718          43.985         43.985  4.167          32.055         32.055
2           2.805          21.575         65.560  2.805          21.575         65.560  2.999          23.067         55.121
3           1.624          12.494         78.054  1.624          12.494         78.054  2.981          22.933         78.054
4            .652           5.017         83.071
5            .544           4.183         87.253
6            .486           3.741         90.994
7            .412           3.168         94.162
8            .246           1.892         96.054
9            .207           1.593         97.647
10           .167           1.284         98.930
11           .097            .742         99.673
12           .023            .178         99.850
13           .019            .150        100.000
Extraction Method: Principal Component Analysis.
From the table above we can see that, based on the criterion of an
eigenvalue greater than 1, 3 factors can be extracted. In the column of
initial eigenvalues, three eigenvalues (5.718, 2.805 and 1.624) are greater
than 1 and the next one (0.652) is less than 1, so the program stops
extracting more factors.
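The eigenvalue rule can be checked directly against the initial eigenvalues reported in the table. The sketch below uses plain Python; the eigenvalues are copied from the Total Variance Explained table, while the function names are only illustrative. It also lists the drop between successive eigenvalues, which is the information a scree plot shows graphically.

```python
# Initial eigenvalues copied from the Total Variance Explained table.
EIGENVALUES = [5.718, 2.805, 1.624, 0.652, 0.544, 0.486, 0.412,
               0.246, 0.207, 0.167, 0.097, 0.023, 0.019]

def kaiser_count(eigenvalues, cutoff=1.0):
    """Number of components whose eigenvalue exceeds the cutoff."""
    return sum(1 for value in eigenvalues if value > cutoff)

def successive_drops(eigenvalues):
    """Difference between each eigenvalue and the next one."""
    return [a - b for a, b in zip(eigenvalues, eigenvalues[1:])]

n_factors = kaiser_count(EIGENVALUES)
drops = successive_drops(EIGENVALUES)
total = sum(EIGENVALUES)   # with 13 standardized items this is about 13
```

The eigenvalues sum to the number of items because each standardized item contributes one unit of variance, which is why an eigenvalue above 1 means a factor explains more than a single item does.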
The scree plot below can also be used to decide on the number of factors
that can be derived. The scree plot follows the phenomenon of rock falls:
the bigger rocks fall first while the smaller rocks fall later. The bigger
rocks are the important factors, while the smaller ones, which explain very
small amounts of variance, can be ignored. Based on the eigenvalues only 3
factors will be extracted, but based on the scree plot it may be plausible
to go up to 4 factors. Since we started with 3 factors, we will stop at the
3 factor solution.
The total variance explained by the 3 factor solution is 78.054%, which can
be considered high. Generally we are looking for at least 50% variance
explained. Each of the 3 factors explains a portion of the total variance
(see the rotation sums of squared loadings), which can be broken down as
Reward (23.067%), Reciprocal (22.933%) and Self-worth (32.055%).
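The percentages follow from dividing each sum of squared loadings by the number of items. A minimal sketch in plain Python, using the rotation sums of squared loadings from the table (the variable and function names are only illustrative):

```python
N_ITEMS = 13  # number of items entered into the analysis

# Rotation sums of squared loadings copied from the table.
ROTATION_SS = {"Self-worth": 4.167, "Reward": 2.999, "Reciprocal": 2.981}

def percent_variance(sum_of_squares, n_items=N_ITEMS):
    """Percentage of total variance explained by one factor."""
    return 100.0 * sum_of_squares / n_items

shares = {name: percent_variance(ss) for name, ss in ROTATION_SS.items()}
total_explained = sum(shares.values())
```

The results agree with the table to within rounding, because the printed sums of squared loadings are themselves rounded to three decimals. Rotation redistributes the variance among the 3 factors but leaves the total of 78.054% unchanged.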
The next big task is to assign items to factors based on an assessment of
the loadings and cross-loadings. Items loading high on one factor and low
on the other factors can be uniquely assigned to that factor. From the
table we can see that the first three items all load high on the second
factor, the next five items all load high on the third factor and the last
five items load high on the first factor. Thus we can name the first
factor self-worth, the second factor reward and the third factor
reciprocal.
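The assignment rule can be sketched as follows. The rotated loadings are not reproduced in the text, so the values below are invented for illustration, as are the item name ItemX and the two thresholds (a primary loading of at least 0.5 and cross-loadings of at most 0.35). Treat them as assumptions rather than recommended cut-offs.

```python
# Hypothetical rotated loadings (invented, not the chapter's output).
LOADINGS = {
    "Reward1": {"F1": 0.12, "F2": 0.97, "F3": 0.08},
    "Recip1":  {"F1": 0.20, "F2": 0.05, "F3": 0.69},
    "Sw1":     {"F1": 0.91, "F2": 0.10, "F3": 0.15},
    "ItemX":   {"F1": 0.55, "F2": 0.48, "F3": 0.05},  # cross-loads on F1 and F2
}

def assign_items(loadings, primary_min=0.5, cross_max=0.35):
    """Assign each item to the factor it loads on most strongly,
    flagging items with a weak primary loading or a high cross-loading."""
    assigned, flagged = {}, []
    for item, row in loadings.items():
        ranked = sorted(row, key=lambda factor: abs(row[factor]), reverse=True)
        top, runner_up = ranked[0], ranked[1]
        if abs(row[top]) >= primary_min and abs(row[runner_up]) <= cross_max:
            assigned.setdefault(top, []).append(item)
        else:
            flagged.append(item)
    return assigned, flagged

assigned, flagged = assign_items(LOADINGS)
```

Flagged items are candidates for deletion or for a judgment call by the researcher, since they cannot be uniquely assigned to one factor.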
The next step is to compute the factors that will be used in subsequent
analysis. Three ways are suggested in the literature: 1) surrogate
variable, 2) summated scale and 3) factor scores. Each has its advantages
and drawbacks. The most commonly used method for computing factors is the
summated scale, which lends itself to generalizability and transferability.
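A summated scale can be sketched in a few lines of plain Python. The item names follow the tables above, but the respondent's answers are invented, and averaging the items (rather than adding them up) is an assumption made here so that each scale stays on the original response range.

```python
from statistics import mean

# Items belonging to each factor, following the assignment described above.
SCALES = {
    "Reward":     ["Reward1", "Reward2", "Reward3"],
    "Reciprocal": ["Recip1", "Recip2", "Recip3", "Recip4", "Recip5"],
    "Self-worth": ["Sw1", "Sw2", "Sw3", "Sw4", "Sw5"],
}

def summated_scales(response, scales=SCALES):
    """Average the items of each factor for one respondent."""
    return {name: mean(response[item] for item in items)
            for name, items in scales.items()}

# Hypothetical answers from one respondent (invented for illustration).
respondent = {
    "Reward1": 4, "Reward2": 5, "Reward3": 3,
    "Recip1": 3, "Recip2": 4, "Recip3": 4, "Recip4": 5, "Recip5": 4,
    "Sw1": 2, "Sw2": 3, "Sw3": 2, "Sw4": 3, "Sw5": 2,
}
scores = summated_scales(respondent)
```

Because the scale is built only from item responses and not from sample-specific weights, the same computation can be repeated on a new sample, which is the sense in which the method is transferable.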
Cluster Analysis
Cluster analysis is a multivariate approach for identifying objects or
individuals that are similar to one another based on some criteria or
characteristics. This analysis classifies individuals or objects into a
small number of mutually exclusive and collectively exhaustive groups based
on a set of variables called the cluster variate. The focus of cluster
analysis is not to estimate the variate but to compare objects based on the
variate. In a sense it is similar to factor analysis, except that factor
analysis groups variables whereas cluster analysis groups objects and
individuals.
Method
Most cluster analyses follow the 5 basic steps described below, although
the details may differ.
1. Selection of the sample to be clustered
2. Definition of the variables that will be used to measure the objects or
individuals
3. Computation of similarities among the entities through correlation,
Euclidean distances, and other techniques
4. Selection of mutually exclusive clusters or hierarchically arranged
clusters
5. Cluster comparison and validation.
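The five steps above can be sketched with a small agglomerative procedure in plain Python. Everything specific here is an assumption: the 12 cases and their scores on two variables are invented, and Euclidean distance with average linkage is only one of several possible choices, not necessarily the one behind the output shown in this chapter.

```python
from itertools import combinations
from math import dist

# Steps 1-2: hypothetical sample of 12 cases measured on two variables
# (attitude, price sensitivity). The numbers are invented for illustration.
CASES = {
    1: (6.5, 1.5), 2: (6.0, 2.0), 3: (6.8, 1.8), 4: (6.2, 1.2),
    5: (4.0, 4.0), 6: (4.2, 3.8), 7: (3.8, 4.3), 8: (4.1, 4.1),
    9: (1.5, 6.5), 10: (2.0, 6.0), 11: (1.2, 6.8), 12: (1.8, 6.2),
}

def average_linkage(cluster_a, cluster_b, cases):
    """Step 3: mean Euclidean distance over all cross-cluster pairs."""
    total = sum(dist(cases[i], cases[j]) for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))

def agglomerate(cases, n_clusters):
    """Step 4: merge the two closest clusters until n_clusters remain."""
    clusters = [[case] for case in cases]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(
                clusters[pair[0]], clusters[pair[1]], cases),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(cluster) for cluster in clusters]

def cluster_means(clusters, cases):
    """Step 5: profile each cluster by its mean on every variable."""
    return [
        tuple(sum(cases[i][k] for i in cluster) / len(cluster) for k in range(2))
        for cluster in clusters
    ]

clusters = agglomerate(CASES, n_clusters=3)
profiles = cluster_means(clusters, CASES)
```

Running the merge loop down to 4, 3 and 2 clusters and recording each case's group at every stage gives the kind of membership listing that statistical packages report.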
Cluster Membership
Case   4 Clusters   3 Clusters   2 Clusters
1          1            1            1
2          1            1            1
3          1            1            1
4          1            1            1
5          2            2            1
6          2            2            1
7          2            2            1
8          2            2            1
9          3            3            2
10         3            3            2
11         4            3            2
12         4            3            2
We should also look at the dendrogram, which suggests that a 3 cluster
solution is the best. Cluster 1 (cases 1, 2, 3, 4) can be classified as
power users, Cluster 2 (cases 5, 6, 7, 8) as casual users and Cluster 3
(cases 9, 10, 11, 12) as starters.
* * * H I E R A R C H I C A L   C L U S T E R   A N A L Y S I S * * *
 C A S E     0         5        10        15        20        25
 Label  Num  +---------+---------+---------+---------+---------+
          6  ─┐
          8  ─┤
          5  ─┼─────────────┐
          7  ─┘             │
          3  ─┐             ├─────────────────────────────────┐
          4  ─┤             │                                 │
          2  ─┼─────────────┘                                 │
          1  ─┘                                               │
         11  ─┐                                               │
         12  ─┼───────────────────────────────────────────────┘
          9  ─┤
         10  ─┘
To validate the solution we can use a one-way ANOVA to test the 2
clustering variables (attitude and price sensitivity) across the 3
clusters, and the results confirm our findings. The power users are low on
price sensitivity and high on attitude, whereas the starters are low on
attitude and high on price sensitivity. The casual users fall in between
the two extreme clusters.
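The validation step can be sketched with a hand-computed F statistic in plain Python. The attitude scores per cluster are invented for illustration. The p-value is not computed here, since it requires the F distribution; in practice the statistic is compared with a critical value or taken from a statistical package.

```python
from statistics import mean

def one_way_anova(groups):
    """Return the F statistic with its between and within degrees of freedom."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical attitude scores for the three clusters (invented data).
attitude = {
    "power users":  [6.5, 6.0, 6.8, 6.2],
    "casual users": [4.0, 4.2, 3.8, 4.1],
    "starters":     [1.5, 2.0, 1.2, 1.8],
}
f_stat, df_between, df_within = one_way_anova(list(attitude.values()))
```

A large F value means the cluster means differ far more than the scores vary within clusters, which is what we expect if the clusters are well separated on that variable. The same test is then repeated for price sensitivity.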
Multidimensional Scaling
Multidimensional scaling refers to a series of techniques that help the
researcher identify key dimensions underlying respondents’ evaluations of
objects and then position these objects in this dimensional space,
sometimes called a perceptual map. The analysis works on the respondents’
judgments of the similarity of the objects, and these similarities are
reflected in the relative distances among the objects in the
multidimensional space. This is commonly used in marketing studies to
identify key dimensions underlying customer evaluations of products,
services, or companies. For example, a customer may be asked to rate the
similarity of several cars pair by pair, which gives us many paired
comparisons. A plot is then generated to explain the differences.
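How dissimilarity judgments become positions on a map can be sketched with classical (Torgerson) scaling. This is an illustrative substitute and not necessarily the algorithm that SPSS runs. It assumes NumPy, and the dissimilarity ratings for four cars A, B, C and D are invented.

```python
import numpy as np

def classical_mds(dissimilarities, n_dims=2):
    """Place objects in n_dims dimensions so that distances between
    points approximate the given dissimilarities."""
    d = np.asarray(dissimilarities, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering      # double-centered matrix
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_dims]       # largest eigenvalues
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Hypothetical dissimilarity ratings among four cars A, B, C, D.
D = [[0, 3, 5, 4],
     [3, 0, 4, 5],
     [5, 4, 0, 3],
     [4, 5, 3, 0]]
coords = classical_mds(D)   # one row of (Dimension 1, Dimension 2) per car
```

Each row of coords can be plotted as a point on the perceptual map. The orientation of the axes is arbitrary, so the map may appear rotated or mirrored; only the relative distances between the objects carry meaning.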
Method
When respondents assess the objects they may use different types of
measures, which can be classified into objective dimensions and subjective
dimensions. Objective dimensions are quantifiable (physical or observable)
while subjective dimensions are not easily quantifiable (perceptions). A
subjective dimension may or may not be based on an objective dimension.
Objects with the same physical characteristics (objective dimension) may
still be viewed differently by respondents in terms of quality (perceived
dimension). The researcher needs to understand how the objective and
subjective dimensions relate to the axes of the multidimensional space
used in the perceptual map. The perceptual map can be visually depicted as
follows:
[Figure: perceptual map with axes Dimension 1 and Dimension 2, showing the positions of objects B, C, D, F, G and H]
Next we will analyze the matrix in SPSS using the Analyze, Scale,
Multidimensional Scaling menu. The next screen shows some options that we
need to choose. Once the analysis is executed we will get the output that
can be interpreted.
Once we have the solution we need to identify the 2 dimensions of
attributes that the respondents have used in evaluating similarity and
dissimilarity. From experience we may label the first dimension Quality
and the second dimension Prestige. Then we can see whether the grouping in
the multidimensional space makes sense.
Summary
In this chapter we have looked at the multivariate techniques called
interdependence techniques, namely factor analysis, cluster analysis and
multidimensional scaling. Exploratory factor analysis is a technique that
tries to statistically identify a reduced number of factors from a larger
number of items, which are typically called the measured variables. The
factors identified are called latent variables as they are not measured
directly. The pattern of loadings is used to assign items to particular
factors. Next we looked at cluster analysis, which is an approach for
identifying objects or individuals that are similar to one another based on
some criteria or characteristics. This analysis classifies individuals or
objects into a small number of mutually exclusive and collectively
exhaustive groups based on a set of variables called the cluster variate.
Lastly we looked at multidimensional scaling, a technique that helps the
researcher identify key dimensions underlying respondents’ evaluations of
objects and then position these objects in this dimensional space,
sometimes called a perceptual map.
Review Questions
1. What is the purpose of doing an exploratory factor analysis?
2. How can we decide on the number of factors to be extracted?
3. There are 2 main rotation techniques in factor analysis. How do we
decide which one to choose?
4. Explain what you understand about factor loadings and communality.
5. Explain your understanding of the term percentage variance
explained. What is the use of this measure?
6. Discuss the three ways in which factors can be computed after the factor
analysis.
7. What is the difference between cluster analysis and factor analysis?
8. Explain how the cluster analysis works.
9. Explain the principle on which multidimensional scaling analysis
works.
10. The following question is based on the data “Data Knowledge
Sharing”. The task is to do a factor analysis of these 13 items, which
measure 3 factors: Attitude (Att), Subjective norm (Sn) and Perceived
behavioral control (Pbc). The items are as listed below: