1. The purpose of factor analysis is to reduce the initial number of variables into a smaller and therefore
more manageable (easier to analyze and interpret) set of underlying dimensions, called factors.
2. There should be no dependent or independent variables in factor analysis. Mixing dependent and
independent variables in a single factor analysis and then examining the dependence relationships is
inappropriate.
3. R Factor Analysis vs. Q Factor Analysis
a) R Factor Analysis (most common type) analyzes the variables (usually the columns of the input
matrix)
b) Q Factor Analysis analyzes the respondents (not very popular; for the analysis of subgroups of
respondents, Cluster Analysis is generally used instead)
4. Assumptions of factor analysis:
a) The variables should be metric (interval or ratio scale).
b) Sample size n = at least 5 (ideally 10 or even 20) times the number of variables.
c) The sample should be homogeneous with respect to the underlying factor structure. For example,
if you know that some variables differ because of gender, it is wrong to apply factor analysis to a
sample of males and females together. In such cases, you should perform two factor analyses, one
on a sample of males, and the other on a sample of females.
d) The assumptions of normality, homoscedasticity, and linearity that are typical for other techniques
are not very important in factor analysis: they are, as always, welcome, but they are not
crucial unless one wants to apply statistical tests (rarely used) of the significance of the factors.
Some degree of multicollinearity is even desirable, in that the correlation matrix should reveal a
substantial number of correlations greater than 0.30 in absolute value (ideally, however, most of the
partial correlations should be less than 0.30; SPSS displays them in the anti-image correlation
matrix).
e) Always remember to check for and remove the outliers.
f) The correlation matrix should also be examined with the Bartlett test of sphericity, which tests for
the presence of correlations among the variables.
g) Another measure to quantify the degree of intercorrelations among the variables and hence the
appropriateness of factor analysis is KMO MSA (Kaiser-Meyer-Olkin Measure of Sampling
Adequacy; 0 ≤ KMO ≤ 1):
KMO > 0.8 (meritorious*)
0.7 < KMO < 0.8 (middling)
0.6 < KMO < 0.7 (mediocre)
0.5 < KMO < 0.6 (miserable)
KMO < 0.5 (unacceptable)
* Original terms used by Kaiser
Check Correlation Matrix: is there a sufficient number of correlations greater than 0.30 or less
than –0.30? Are they significant? YES. There are 6 such correlations (+ the 7th very close to
0.30) among the 15 correlation coefficients.
KMO = 0.660 (mediocre): OK (in general, KMO should be greater than 0.50)
Bartlett’s Test of Sphericity: H0: The variables are uncorrelated in the population. Because Sig.
= 0.000 (i.e. p < 0.001), reject H0, which is the desired result.
Based on KMO and Bartlett’s Test of Sphericity FA is appropriate for analyzing the
correlation matrix.
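The three checks above (counting correlations beyond ±0.30, Bartlett's test, and the overall KMO) can be sketched in plain Python. This is only an illustration under stated assumptions: `R_demo` and the sample size of 30 are made-up values, not the data of the example, and the function names are hypothetical. The Bartlett statistic uses the standard formula chi2 = -(n - 1 - (2p + 5)/6) * ln|R| with p(p - 1)/2 degrees of freedom; the significance level would then be read from a chi-square table or a statistics package.

```python
import math

def invert_with_det(matrix):
    """Gauss-Jordan inverse and determinant of a small square matrix."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det  # a row swap flips the sign of the determinant
        p = aug[col][col]
        det *= p
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug], det

def count_salient(R, threshold=0.30):
    """Number of distinct correlations with |r| above the threshold."""
    p = len(R)
    return sum(abs(R[i][j]) > threshold
               for i in range(p) for j in range(i + 1, p))

def bartlett_sphericity(R, n_obs):
    """Chi-square statistic and degrees of freedom for H0: R is an identity matrix."""
    p = len(R)
    _, det = invert_with_det(R)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    return chi2, p * (p - 1) // 2

def kmo_overall(R):
    """Overall KMO: squared correlations relative to squared correlations plus squared partials."""
    p = len(R)
    inv, _ = invert_with_det(R)
    r2 = a2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += R[i][j] ** 2
                a2 += partial ** 2
    return r2 / (r2 + a2)

# Hypothetical 3-variable correlation matrix, for illustration only.
R_demo = [[1.0, 0.6, 0.5],
          [0.6, 1.0, 0.4],
          [0.5, 0.4, 1.0]]
chi2, df = bartlett_sphericity(R_demo, n_obs=30)
print(count_salient(R_demo), round(chi2, 2), df, round(kmo_overall(R_demo), 3))
```

A large chi-square relative to its degrees of freedom leads to rejecting H0, and the KMO value is then read against Kaiser's scale given earlier.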
Anti-image Correlation Matrix
The elements on the main diagonal are the individual variables’ MSAs: they should be greater
than 0.5. A variable with MSA < 0.5 should be removed from further analysis; however, do
not eliminate all such variables at once. First, remove the variable with the lowest MSA,
repeat the FA, then remove the variable with the new lowest MSA, and so on until all the MSAs
are greater than 0.50.
Communalities:
There are several methods of analyzing the data in FA; the most popular ones are
(i) Principal Components Analysis (PCA) – the method used in the current example;
it is used to extract the minimum number of factors accounting for maximum variance in
the data for use in subsequent multivariate analysis (e.g. cluster analysis)
PCA analyzes the original correlation matrix, where the main diagonal has all 1’s, i.e.
it is based on the total variance
(ii) Common Factor Analysis (CFA) – also known as Principal Axis Factoring (SPSS)
used to determine all the underlying dimensions
CFA analyzes the correlation matrix, where the 1’s on the main diagonal have been
replaced with the communalities, i.e. it is based on the common variance
Communality = Sum of squares of the variable loadings (elements of
Component Matrix described below) across all factors:
Ex. Variable 1: Prevents Cavities: Communality = 0.926 ≈ (0.928)^2 +
(0.253)^2 (up to rounding).
CFA has several problems, such as the lack of a unique solution. Therefore, PCA is more
widely used than CFA.
Let’s repeat the analysis with varimax rotation: under Rotation, choose Varimax.
[SPSS output not reproduced; surviving variable labels: Shiny Teeth, Freshens Breath, Attractive Teeth]
Assign labels to these factors. For example, the first factor could be named the “Health Benefit” factor, and
the second the “Social Image” factor.
The results of varimax rotation look good; there is no need for oblique rotation, which might help if, for
example, one variable loaded highly on both factors.
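To show what varimax does to a two-factor solution, here is a minimal sketch of the raw varimax rotation for exactly two factors, using the closed-form rotation angle. It rests on several assumptions: the loadings are made up (a clean two-factor pattern deliberately rotated by 30 degrees), the function names are hypothetical, and no Kaiser normalization is applied, so the numbers need not match SPSS output.

```python
import math

def varimax_two_factors(loadings):
    """Rotate a list of (factor 1, factor 2) loadings to maximize the raw varimax criterion."""
    p = len(loadings)
    u = [x * x - y * y for x, y in loadings]
    v = [2 * x * y for x, y in loadings]
    A, B = sum(u), sum(v)
    C = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
    D = 2 * sum(ui * vi for ui, vi in zip(u, v))
    # Closed-form angle: tan(4 * phi) = (D - 2AB/p) / (C - (A^2 - B^2)/p).
    phi = 0.25 * math.atan2(D - 2 * A * B / p, C - (A * A - B * B) / p)
    c, s = math.cos(phi), math.sin(phi)
    return [(x * c + y * s, -x * s + y * c) for x, y in loadings]

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    p = len(loadings)
    total = 0.0
    for k in range(2):
        squared = [row[k] ** 2 for row in loadings]
        total += sum(q * q for q in squared) / p - (sum(squared) / p) ** 2
    return total

# Hypothetical loadings: a clean pattern rotated by 30 degrees to mimic an unrotated solution.
simple = [(0.9, 0.0), (0.8, 0.0), (0.0, 0.7), (0.0, 0.6)]
angle = math.radians(30)
unrotated = [(x * math.cos(angle) - y * math.sin(angle),
              x * math.sin(angle) + y * math.cos(angle)) for x, y in simple]

rotated = varimax_two_factors(unrotated)
for before, after in zip(unrotated, rotated):
    print([round(b, 3) for b in before], "->", [round(a, 3) for a in after])
```

After rotation each variable loads on essentially one factor, which is what makes labeling the factors straightforward. The communalities are unchanged, because an orthogonal rotation only redistributes variance between the factors.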