A correspondence table is any two-way table whose cells contain some
measurement of correspondence between the rows and the columns. The measure
of correspondence can be any indication of the similarity, affinity, confusion,
association, or interaction between the row and column variables. A very common
type of correspondence table is a crosstabulation, where the cells contain frequency
counts.
Such tables can be obtained easily with the Crosstabs procedure. However, a
crosstabulation does not always provide a clear picture of the nature of the
relationship between the two variables. This is particularly true if the variables of
interest are nominal (with no inherent order or rank) and contain numerous
categories. Crosstabulation may tell you that the observed cell frequencies differ
significantly from the expected values in a 10x9 crosstabulation of occupation and
breakfast cereal, but it may be difficult to discern which occupational groups have
similar tastes or what those tastes are.
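To illustrate what a crosstabulation alone can and cannot tell you, here is a minimal Python sketch that computes the expected counts, the chi-square statistic, and the standardized residuals of a small table. The function names and the counts are hypothetical and made up for illustration; they are not part of the Crosstabs procedure.

```python
from math import sqrt

def expected_counts(table):
    """Expected cell counts if rows and columns were independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

def chi_square(table):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))

def standardized_residuals(table):
    """Per-cell (observed - expected) / sqrt(expected)."""
    expected = expected_counts(table)
    return [[(o - e) / sqrt(e) for o, e in zip(obs_row, exp_row)]
            for obs_row, exp_row in zip(table, expected)]

# Hypothetical counts: 3 occupations (rows) by 3 cereals (columns).
counts = [[20, 10, 5], [10, 20, 10], [5, 10, 30]]
print("chi-square:", round(chi_square(counts), 3))
for row in standardized_residuals(counts):
    print([round(x, 2) for x in row])
```

The single chi-square value says only that the table departs from independence. The residuals show where, but in a 10x9 table that is 90 numbers to scan, which is the gap a graphical method fills.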
Correspondence Analysis allows you to examine the relationship between two
nominal variables graphically in a multidimensional space. It computes row and
column scores and produces plots based on the scores. Categories that are similar
to each other appear close to each other in the plots. In this way, it is easy to see
which categories of a variable are similar to each other or which categories of the
two variables are related. The Correspondence Analysis procedure also allows you
to fit supplementary points into the space defined by the active points.
If the ordering of the categories according to their scores is undesirable or
counterintuitive, order restrictions can be imposed by constraining the scores for
some categories to be equal. For example, suppose that you expect the variable
smoking behavior, with categories none, light, medium, and heavy, to have scores
that correspond to this ordering. However, if the analysis orders the
categories none, light, heavy, and medium, constraining the scores for heavy and
medium to be equal preserves the ordering of the categories in their scores.
The interpretation of correspondence analysis in terms of distances depends on the
normalization method used. The Correspondence Analysis procedure can be used to
analyze either the differences between categories of a variable or the differences
between variables. With the default normalization, it analyzes the differences
between the row and column variables.
The correspondence analysis algorithm is capable of many kinds of analyses.
Centering the rows and columns and using chi-square distances corresponds to
standard correspondence analysis. However, using alternative centering options
combined with Euclidean distances allows for an alternative representation of a
matrix in a low-dimensional space.
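The standard case described above (centering rows and columns, chi-square distances) can be sketched with a singular value decomposition. This is a textbook formulation written with NumPy, not the procedure's actual implementation; the function names and the example counts are assumptions made for illustration.

```python
import numpy as np

def correspondence_svd(table):
    """SVD of the standardized residuals of a two-way table of counts."""
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                      # correspondence matrix (proportions)
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    # Center by rows and columns, then scale so that Euclidean distances
    # in the result correspond to chi-square distances between profiles.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    return P, r, c, U, sv, Vt.T

def chi_square_distance(P, r, c, i, j):
    """Chi-square distance between the profiles of rows i and j."""
    diff = P[i] / r[i] - P[j] / r[j]
    return np.sqrt(np.sum(diff ** 2 / c))

# Hypothetical counts, for illustration only.
counts = [[20, 10, 5], [10, 20, 10], [5, 10, 30]]
P, r, c, U, sv, V = correspondence_svd(counts)

# Row scores whose Euclidean distances reproduce the chi-square distances.
row_scores = (U * sv) / np.sqrt(r)[:, None]
print("inertia per dimension:", np.round(sv ** 2, 4))
print("distance between rows 0 and 1 in score space:",
      round(float(np.linalg.norm(row_scores[0] - row_scores[1])), 4))
print("chi-square distance between rows 0 and 1:",
      round(float(chi_square_distance(P, r, c, 0, 1)), 4))
```

The squared singular values sum to the total inertia, which equals the chi-square statistic divided by the table total. Keeping only the first one or two dimensions gives the low-dimensional plot.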

Normalization
Normalization is used to distribute the inertia over the row scores and column scores. Some aspects of
the correspondence analysis solution, such as the singular values, the inertia per dimension, and the
contributions, do not change under the various normalizations. The row and column scores and their
variances are affected. Correspondence analysis has several ways to spread the inertia. The three most
common include spreading the inertia over the row scores only, spreading the inertia over the column
scores only, or spreading the inertia symmetrically over both the row scores and the column scores.
Row principal. In row principal normalization, the Euclidean distances between the row points
approximate chi-square distances between the rows of the correspondence table. The row scores are the
weighted average of the column scores. The column scores are standardized to have a weighted sum of
squared distances to the centroid of 1. Since this method maximizes the distances between row
categories, you should use row principal normalization if you are primarily interested in seeing how
categories of the row variable differ from each other.
Column principal. On the other hand, you might want to approximate the chi-square distances between
the columns of the correspondence table. In that case, the column scores should be the weighted
average of the row scores. The row scores are standardized to have a weighted sum of squared
distances to the centroid of 1. This method maximizes the distances between column categories and
should be used if you are primarily concerned with how categories of the column variable differ from each
other.
Symmetrical. You can also treat the rows and columns symmetrically. This normalization spreads inertia
equally over the row and column scores. Note that neither the distances between the row points nor the
distances between the column points are approximations of chi-square distances in this case. Use this
method if you are primarily interested in the differences or similarities between the two variables. Usually,
this is the preferred method to make biplots.
Principal. A fourth option is called principal normalization, in which the inertia is spread twice in the
solution, once over the row scores and once over the column scores. You should use this method if you
are interested in the distances between the row points and the distances between the column points
separately but not in how the row and column points are related to each other. Biplots are not appropriate
for this normalization option and are therefore not available if you have specified the principal
normalization method.
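The four normalizations can be compared side by side in a short Python sketch. It assumes the common textbook formulation in which the standardized scores are multiplied by a power of the singular values; the function name ca_scores, the option names, and the counts are hypothetical, and the procedure's own implementation may differ in detail.

```python
import numpy as np

# Power of the singular values applied to (row scores, column scores).
POWERS = {
    "row_principal": (1.0, 0.0),     # inertia on the rows only
    "column_principal": (0.0, 1.0),  # inertia on the columns only
    "symmetrical": (0.5, 0.5),       # inertia split equally
    "principal": (1.0, 1.0),         # inertia spread twice
}

def ca_scores(table, normalization="symmetrical"):
    """Row and column scores of a two-way table under one normalization."""
    N = np.asarray(table, dtype=float)
    P = N / N.sum()
    r, c = P.sum(axis=1), P.sum(axis=0)          # row and column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    k = min(N.shape) - 1                         # number of nontrivial dimensions
    U, sv, V = U[:, :k], sv[:k], Vt.T[:, :k]
    std_rows = U / np.sqrt(r)[:, None]           # weighted sum of squares = 1
    std_cols = V / np.sqrt(c)[:, None]           # weighted sum of squares = 1
    row_power, col_power = POWERS[normalization]
    return std_rows * sv ** row_power, std_cols * sv ** col_power, sv, P, r, c

# Hypothetical counts, for illustration only.
counts = [[20, 10, 5], [10, 20, 10], [5, 10, 30]]
for name in POWERS:
    rows, cols, sv, P, r, c = ca_scores(counts, name)
    row_var = (r[:, None] * rows ** 2).sum(axis=0)   # weighted variance per dimension
    col_var = (c[:, None] * cols ** 2).sum(axis=0)
    print(name, np.round(row_var, 4), np.round(col_var, 4))
```

The singular values are the same in every case; only the weighted variances of the row and column scores change, which matches the statement that normalization redistributes the inertia without altering it.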
