
Notes on Cluster Analysis

Brian T. Ratchford
September 2005

In this note I will discuss the use of clustering in segmentation studies using a simplified
example to impart a basic understanding of widely-used procedures. I will also provide
an introduction to the ConneCtor PDA case, and the use of the Marketing Engineering
software in solving that case.

Cluster analysis refers to a set of procedures for grouping observations (people, stores)
into relatively homogeneous subsets based on a set of variables that describe them or
their responses. For segmentation these variables are the bases for segmentation. Cluster
analysis techniques are primarily exploratory, and aimed at providing useful descriptions
rather than statistical tests. Differing clustering methods may not give the same results,
and a degree of judgment is required in choosing a method, in deciding on the number of
clusters to work with, and in interpreting the meaning of the clusters. Unlike regression,
in which least squares provides optimal estimates under certain conditions, there is no
clearly optimal method of clustering.

Clustering procedures begin by calculating a matrix of distances between pairs of observations, or between observations and cluster means, across the variables. The grouping method
uses this distance matrix as the basis for clustering. The two traditional approaches to
clustering are:
• Hierarchical methods, which form clusters sequentially in a tree structure, either starting from a number of clusters equal to the number of observations and joining cases into progressively fewer clusters until only one big cluster remains (agglomerative methods), or starting from one big cluster and splitting it until the number of clusters equals the number of observations (divisive methods).
• Partitioning methods, which assign cases to some predetermined number of clusters. The partitioning method implemented in SPSS and SAS is K-Means, which will be described later in this note.

Hierarchical procedures have the advantage of directly incorporating the distance between each pair of observations into the analysis, while K-Means and other partitioning
procedures consider only distances between observations and cluster means. However,
the hierarchical methods become unwieldy when applied to large samples, making K-
Means the method used most commonly on datasets with more than 200 or so cases.

Numerical Example of Clustering

Consider the 15 data points in the table below, which represent the ages and incomes of 15 persons. The object is to group the 15 persons into a parsimonious set of clusters that are as homogeneous as possible. The grouping will be based on the age and income measures alone. While clustering is most useful when there are many grouping variables, this 2-dimensional example allows a visual representation of the data points.

Original Units Standardized
Subject Age Income Age Income
John 60 $67,187.22 0.5715 1.0078
Mary 43 $87,748.31 -0.3567 1.9020
Sue 62 $61,240.25 0.6807 0.7492
Peggy 70 $72,999.80 1.1175 1.2606
Bob 59 $74,183.57 0.5169 1.3121
Paul 19 $42,510.30 -1.6672 -0.0653
Mary 21 $32,964.65 -1.5580 -0.4804
Jenny 19 $35,627.11 -1.6672 -0.3646
Julie 37 $44,036.92 -0.6844 0.0011
Bill 39 $33,434.88 -0.5752 -0.4599
Bernie 60 $24,700.72 0.5715 -0.8398
Bonnie 66 $15,005.66 0.8991 -1.2614
Judy 64 $25,253.94 0.7899 -0.8157
Dan 66 $17,635.69 0.8991 -1.1470
Mark 58 $25,642.92 0.4623 -0.7988

The data points are listed in the table above and plotted in the graph below. From the graph it can be seen that there appear to be 3-5 groups. There is at least an older, low-income group (Bernie, Bonnie, Judy, Dan, Mark); an older, high-income group (Bob, John, Sue, Peggy, and possibly Mary); and a younger, moderate-income group (Paul, the other Mary, Jenny, Julie and Bill). But one can see that it might be better to split Mary from the older, high-income group, and to split Julie and Bill from their younger counterparts. The use of formal clustering procedures will help in that decision. When, as is usual, there are many variables, it is very difficult to visualize clusters and formal procedures must be used to form groups.

In order to form clusters we need a measure of similarity or distance between every pair of subjects (or objects). If the data are continuously scaled, this is usually taken to be the squared Euclidean distance between each pair. If we compute this distance on the raw data, much more weight will be given to income simply because it is measured in much bigger units than age. To give equal weight to each variable input to clustering, we must standardize the variables when the units of measurement differ. This is done by subtracting the mean from each value (age = 49.5, income = $44,011) and dividing the result by the standard deviation (age = 18.31, income = $22,996). The standardized measures are presented in the final two columns of the above table (note the analogy to using Beta weights in a regression).
A distance matrix for the 15 cases that was computed on the standardized measures is
presented in the table labeled “Proximity Matrix.” From the matrix we can see, for
example, that John is close to Sue and Bob, distant from Jenny, etc. Therefore John, Sue
and Bob are much more likely to be in the same cluster than John and Jenny. This matrix
is the input to clustering.
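
For readers who want to reproduce these two steps outside SPSS, the following is a minimal sketch in Python (assuming numpy and scipy are installed); it illustrates the same calculations and is not part of the original analysis.

import numpy as np
from scipy.spatial.distance import pdist, squareform

# Ages and incomes of the 15 subjects, in the order listed in the table above.
age = np.array([60, 43, 62, 70, 59, 19, 21, 19, 37, 39, 60, 66, 64, 66, 58], dtype=float)
income = np.array([67187.22, 87748.31, 61240.25, 72999.80, 74183.57, 42510.30,
                   32964.65, 35627.11, 44036.92, 33434.88, 24700.72, 15005.66,
                   25253.94, 17635.69, 25642.92])

X = np.column_stack([age, income])
# Standardize each variable: subtract its mean and divide by its (sample) standard deviation.
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Squared Euclidean distance between every pair of subjects -- the proximity matrix.
D = squareform(pdist(X_std, metric='sqeuclidean'))
print(np.round(D, 2))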

Plot of Data for Clustering

[Scatter plot of the 15 subjects, with AGE (roughly 10 to 80) on the horizontal axis and INCOME (roughly $0 to $100,000) on the vertical axis; each point is labeled with the subject's name.]
Proximity Matrix

Squared Euclidean Distance


Case        1:John  2:Mary  3:Sue  4:Peggy  5:Bob  6:Paul  7:Mary  8:Jenny  9:Julie  10:Bill  11:Bernie  12:Bonnie  13:Judy  14:Dan  15:Mark
1:John .00 1.66 .08 .36 .10 6.16 6.75 6.90 2.59 3.47 3.41 5.26 3.37 4.75 3.28
2:Mary 1.66 .00 2.41 2.58 1.11 5.59 7.12 6.85 3.72 5.63 8.38 11.58 8.70 10.87 7.96
3:Sue .08 2.41 .00 .45 .34 6.18 6.52 6.75 2.42 3.04 2.54 4.09 2.46 3.64 2.44
4:Peggy .36 2.58 .45 .00 .36 9.51 10.19 10.40 4.83 5.83 4.71 6.41 4.42 5.84 4.67
5:Bob .10 1.11 .34 .36 .00 6.67 7.52 7.58 3.16 4.33 4.63 6.77 4.60 6.19 4.46
6:Paul 6.16 5.59 6.18 9.51 6.67 .00 .18 .09 .97 1.35 5.61 8.02 6.60 7.76 5.07
7:Mary 6.75 7.12 6.52 10.19 7.52 .18 .00 .03 1.00 .97 4.66 6.65 5.63 6.48 4.18
8:Jenny 6.90 6.85 6.75 10.40 7.58 .09 .03 .00 1.10 1.20 5.24 7.39 6.24 7.20 4.72
9:Julie 2.59 3.72 2.42 4.83 3.16 .97 1.00 1.10 .00 .22 2.28 4.10 2.84 3.83 1.95
10:Bill 3.47 5.63 3.04 5.83 4.33 1.35 .97 1.20 .22 .00 1.46 2.82 1.99 2.65 1.19
11:Bernie 3.41 8.38 2.54 4.71 4.63 5.61 4.66 5.24 2.28 1.46 .00 .29 .05 .20 .01
12:Bonnie 5.26 11.58 4.09 6.41 6.77 8.02 6.65 7.39 4.10 2.82 .29 .00 .21 .01 .40
13:Judy 3.37 8.70 2.46 4.42 4.60 6.60 5.63 6.24 2.84 1.99 .05 .21 .00 .12 .11
14:Dan 4.75 10.87 3.64 5.84 6.19 7.76 6.48 7.20 3.83 2.65 .20 .01 .12 .00 .31
15:Mark 3.28 7.96 2.44 4.67 4.46 5.07 4.18 4.72 1.95 1.19 .01 .40 .11 .31 .00
This is a dissimilarity matrix

* * * * * * H I E R A R C H I C A L C L U S T E R A N A L Y S I S * * * * * *

Dendrogram using Ward Method

Rescaled Distance Cluster Combine (0 to 25)

[Dendrogram; cases are listed from top to bottom in the order Bonnie (12), Dan (14), Bernie (11), Mark (15), Judy (13), Mary (7), Jenny (8), Paul (6), Julie (9), Bill (10), John (1), Sue (3), Bob (5), Peggy (4), Mary (2), with the closest cases joined first.]

Hierarchical Clustering of Example Data
Agglomerative hierarchical clusters are formed starting with a number of clusters equal to the number of objects to be clustered. Ward's method is a commonly used hierarchical procedure. It joins the closest pair of clusters, then the next closest, and so on until there is one big cluster, at each step minimizing the increase in the sum of squared distances between objects and their cluster centers. This sum equals 0 when the number of clusters equals the number of cases, the starting point, and equals the total sum of squared distances from the overall mean when there is one cluster, the ending point. The sum of squared distances between objects and the centers of their clusters, the within-cluster distance, is the error due to grouping the objects. Choosing a number of clusters is a tradeoff between this error and parsimony in explaining the data.
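
As an illustration only, a Ward clustering of the example data could be sketched in Python with scipy, assuming the standardized matrix X_std from the earlier sketch; the label list and plotting details below are my own assumptions, not part of the note.

from scipy.cluster.hierarchy import linkage, dendrogram, fcluster
import matplotlib.pyplot as plt

# The second Mary is labeled "Mary2" only to keep the plot labels distinct.
names = ["John", "Mary", "Sue", "Peggy", "Bob", "Paul", "Mary2", "Jenny",
         "Julie", "Bill", "Bernie", "Bonnie", "Judy", "Dan", "Mark"]

# Ward's method: each merge is chosen to minimize the increase in the
# within-cluster sum of squared distances.
Z = linkage(X_std, method='ward')

dendrogram(Z, labels=names)   # tree analogous to the SPSS dendrogram
plt.show()

# Cut the tree to obtain 3-, 4-, and 5-cluster memberships.
for k in (3, 4, 5):
    print(k, fcluster(Z, t=k, criterion='maxclust'))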

Results for the example are summarized in the dendrogram above. The scale at the top of the dendrogram gives the total distance (total error) between the members and the centers of their clusters (for some reason SPSS rescales this to a 0-25 scale). If there are 15 clusters, one for each case, this distance is 0. With 5 clusters the distance goes to about one, so the error from moving to 5 clusters is relatively small, about 1/25 = 4%. Thus one could say that the R2 for the 5-cluster solution is 1 - .04 = .96. Moving to 3 clusters increases the error to about 3/25 = 12%, or decreases R2 to about .88. Whether one works with 3-5 clusters depends on how much error one is willing to accept compared to the benefits of a parsimonious grouping. Since moving to 2 clusters increases the error to about 20/25 = 80%, it is unlikely that 2 clusters will be deemed satisfactory. Moving to one cluster increases the error to 100%. Cluster membership for the 3-, 4-, and 5-cluster solutions is summarized below. Referring to the scatter-plot of the data you can see that people who are closest to one another are grouped into the same cluster.
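
To make the error/R-squared tradeoff concrete, here is a hedged sketch (again assuming X_std and the linkage Z from the earlier sketches) that computes the within-cluster error and the corresponding R-squared for several numbers of clusters.

import numpy as np
from scipy.cluster.hierarchy import fcluster

def r_squared(X, labels):
    # Total sum of squared distances from the overall mean (the one-cluster error).
    total = ((X - X.mean(axis=0)) ** 2).sum()
    # Sum of squared distances from each cluster's own mean (the within-cluster error).
    within = sum(((X[labels == c] - X[labels == c].mean(axis=0)) ** 2).sum()
                 for c in np.unique(labels))
    return 1.0 - within / total

for k in (2, 3, 4, 5):
    labels = fcluster(Z, t=k, criterion='maxclust')
    print(k, "clusters: R-squared =", round(r_squared(X_std, labels), 3))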

Cluster Membership

Case 5 Clusters 4 Clusters 3 Clusters


1:John 1 1 1
2:Mary 2 2 1
3:Sue 1 1 1
4:Peggy 1 1 1
5:Bob 1 1 1
6:Paul 3 3 2
7:Mary 3 3 2
8:Jenny 3 3 2
9:Julie 4 3 2
10:Bill 4 3 2
11:Bernie 5 4 3
12:Bonnie 5 4 3
13:Judy 5 4 3
14:Dan 5 4 3
15:Mark 5 4 3

K-Means Clustering

K-Means clustering (Quick Cluster in SPSS, Fastclus in SAS) is the best option when there is a large number of subjects or objects to cluster (more than 200 or so). K-Means is a top-down approach, in which one must specify the number of clusters to be formed at the outset (of course, solutions with different numbers of clusters can be compared).
For this predetermined number of clusters, the procedure starts with a trial solution consisting of initial cluster centers (means on each standardized variable). Given the initial centers, the program assigns each case to the cluster with the nearest center. Given this assignment, the program recomputes the center of each cluster. Then it again assigns cases to the cluster with the nearest center. Means of each cluster are then recomputed, and the procedure continues until it converges and the assignments no longer change. A weakness of the procedure is that results can be sensitive to the chosen starting values – these can be supplied by the user or chosen automatically by the program.
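
As a rough illustration (not the SPSS Quick Cluster or SAS Fastclus procedure itself), a comparable 3-cluster K-Means run could be sketched in Python with scikit-learn, again assuming the standardized matrix X_std from the earlier sketch.

from sklearn.cluster import KMeans

# n_init > 1 reruns the algorithm from several starting values, which guards
# against the sensitivity to starting values noted above.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = km.fit_predict(X_std)     # cluster assignment for each of the 15 subjects

print(km.cluster_centers_)         # means of Agestd and Incomestd in each cluster
print(labels)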

Output for a 3-cluster solution using K-Means is presented below. The variable Agestd is the standardized value of age, and Incomestd is the standardized value of income. The starting values chosen by the program are presented in the first table, "Initial Cluster Centers." The next table shows that convergence is obtained in 2 iterations – at this point cluster centers no longer change with the reassignment of cases. Cluster membership is presented next. While this will not always be the case, it can be seen that the membership is the same as for the Ward clustering presented above. The distances in this table are the distances between each person and the center of his or her cluster. Mary is relatively far from the center of Cluster 2, as can be verified from the scatter-plot presented earlier.

The “Final Cluster Centers” table gives the mean of each variable for members in each
cluster, and is critical for interpreting the results. Cluster 1 is seen to be higher than
average in age (.7244 > 0) and lower than average in income (-.9725 <0); Cluster 2 is
seen to be above average in both age and income, though not as high in age as Cluster 1;
Cluster 3 is below average in age and slightly below in income. One can use this
information to name the clusters. For example, Cluster 1 might be the senior citizens,
Cluster 2 the well-to-do empty nesters, Cluster 3 the young working people.

The table “Distances between Final Cluster Centers” shows how distant the clusters are
from one another. One would like the distances between a given respondent and their
cluster center to be small relative to these distances, which is the case in this example.
The table indicates that Clusters 2 and 3 are farthest from one another, while 1 and 3 are
closest.

The ANOVA table presents tests for whether the averages for each variable differ
significantly across clusters. The column labeled Sig. indicates that they do. The final
table indicates that there are 5 people in each cluster.
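
If you want to reproduce these F tests outside SPSS, a minimal sketch with scipy (assuming X_std and the K-Means labels from the earlier sketches) might look like the following; as the output below cautions, the tests are descriptive only.

from scipy.stats import f_oneway

for j, name in enumerate(["Agestd", "Incomestd"]):
    # Split the j-th standardized variable into one group per cluster.
    groups = [X_std[labels == c, j] for c in set(labels)]
    F, p = f_oneway(*groups)
    print(name, "F =", round(F, 2), "Sig. =", round(p, 4))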

K Means Clustering Output
Initial Cluster Centers

Cluster
1 2 3
Agestd .8991 -.3567 -1.5580
Incomestd -1.2614 1.9020 -.4804

Iteration History (a)

Change in Cluster Centers


Iteration 1 2 3
1 .338 1.084 .387
2 .000 .000 .000
a. Convergence achieved due to no or small change in
cluster centers. The maximum absolute coordinate
change for any center is .000. The current iteration is 2.
The minimum distance between initial centers is 2.578.

Cluster Membership

Case Number Subject Cluster Distance


1 John 2 .247
2 Mary 2 1.084
3 Sue 2 .527
4 Peggy 2 .612
5 Bob 2 .067
6 Paul 3 .484
7 Mary 3 .387
8 Jenny 3 .446
9 Julie 3 .611
10 Bill 3 .681
11 Bernie 1 .202
12 Bonnie 1 .338
13 Judy 1 .170
14 Dan 1 .247
15 Mark 1 .314

Final Cluster Centers

Cluster
1 2 3
Agestd .7244 .5060 -1.2304
Incomestd -.9725 1.2463 -.2738

Distances between Final Cluster Centers

Cluster 1 2 3
1 2.230 2.076
2 2.230 2.308
3 2.076 2.308

ANOVA

Cluster Error
Mean Square df Mean Square df F Sig.
Agestd 5.736 2 .211 12 27.240 .000
Incomestd 6.435 2 .094 12 68.383 .000
The F tests should be used only for descriptive purposes because the clusters have been
chosen to maximize the differences among cases in different clusters. The observed significance
levels are not corrected for this and thus cannot be interpreted as tests of the hypothesis that the
cluster means are equal.

Number of Cases in each Cluster


Cluster 1 5.000
2 5.000
3 5.000
Valid 15.000
Missing .000

For comparison, a 4-cluster K-Means solution for the example is presented below. The
result shows that the initial Cluster 2 has become 2 clusters, with Mary having her own
cluster. This is consistent with the scatter-plot presented earlier. The 4th cluster now has a
much cleaner interpretation as old and rich, and distances to the center of this cluster have
been reduced considerably. Also we now see that Clusters 1 and 2 are the most distant
from one another.

K Means Clustering Output


Initial Cluster Centers

Cluster
1 2 3 4
Agestd .8991 -.3567 -1.5580 1.1175
Incomestd -1.2614 1.9020 -.4804 1.2606

Iteration History (a)

Change in Cluster Centers


Iteration 1 2 3 4
1 .338 .000 .387 .434
2 .000 .000 .000 .000
a. Convergence achieved due to no or small change in
cluster centers. The maximum absolute coordinate
change for any center is .000. The current iteration is 2.
The minimum distance between initial centers is 1.608.

Cluster Membership

Case Number Subject Cluster Distance


1 John 4 .168
2 Mary 2 .000
3 Sue 4 .336
4 Peggy 4 .434
5 Bob 4 .308
6 Paul 3 .484
7 Mary 3 .387
8 Jenny 3 .446
9 Julie 3 .611
10 Bill 3 .681
11 Bernie 1 .202
12 Bonnie 1 .338
13 Judy 1 .170
14 Dan 1 .247
15 Mark 1 .314

Final Cluster Centers

Cluster
1 2 3 4
Agestd .7244 -.3567 -1.2304 .7217
Incomestd -.9725 1.9020 -.2738 1.0824

Distances between Final Cluster Centers

Cluster 1 2 3 4
1 3.071 2.076 2.055
2 3.071 2.345 1.354
3 2.076 2.345 2.377
4 2.055 1.354 2.377

ANOVA

Cluster Error
Mean Square df Mean Square df F Sig.
Agestd 4.134 3 .145 11 28.483 .000
Incomestd 4.469 3 .054 11 83.049 .000
The F tests should be used only for descriptive purposes because the clusters have been
chosen to maximize the differences among cases in different clusters. The observed significance
levels are not corrected for this and thus cannot be interpreted as tests of the hypothesis that the
cluster means are equal.

Number of Cases in each Cluster


Cluster 1 5.000
2 1.000
3 5.000
4 4.000
Valid 15.000
Missing .000

An Introduction to the ConneCtor PDA Case

This case provides a good example of the format of a typical segmentation study, and of the data that result from such a study. The datasets for this case contain 15 variables that describe possible bases for segmentation, and 17 potential segment descriptors. The analysis can easily be run using the Marketing Engineering software that accompanies the text (there is also a copy of the software on the network portal under the heading of marketing applications).

To run the analysis using the Marketing Engineering software, open the ConneCtor PDA 2001 data file in Excel. Click on segmentation and classification in the ME – XL window (which appears when the add-in software is properly installed). Click on run segmentation. A menu asking for the number of clusters and other details will come up. Enter the desired number of clusters, click on hierarchical or K-Means depending on which type you want, click on "standardize data" to run the analysis on standardized data, and click on enable discriminant analysis if you want an analysis of descriptors to run. Click on next when all the boxes have been checked. The cell range for the segmentation analysis should appear automatically. Click OK, and the cell range for the discriminant analysis will appear automatically. Click OK, and the analysis will run in Excel. There will be output windows for segmentation and discrimination. A dendrogram window also appears, but my version of Excel didn't fill it in correctly (fortunately it isn't that important).

I will briefly summarize some of the Marketing Engineering program output for the PDA case. I will present some results for the 3-cluster solution, which is probably fewer clusters than you would want to work with in doing the case. A dendrogram for the Ward solution is presented below. We see that the within-cluster distance drops considerably between 3 and 4 clusters (1.11 to .34), and does not drop much thereafter. The summary in Table 1 indicates that a 3-cluster solution explains about 65% of the distance in the data, and a 4-cluster solution about 89%. Though one might still choose to work with 3 clusters for reasons of parsimony, this would be throwing away a lot of information relative to 4 or more clusters.

[Dendrogram of the Ward solution for the PDA case]
Table 1
Summary of Error Associated with Different Numbers of Clusters – PDA Case
Clusters   Within-Cluster Distance   Improvement on One Cluster*   R-Squared**
1 3.13
2 1.45 1.68 0.54
3 1.11 2.02 0.65
4 0.34 2.79 0.89
5 0.28 2.85 0.91
6 0.27 2.86 0.91
* Equals 3.13 minus the within-cluster distance in that row
** Equals the improvement divided by the total distance (3.13)

Means are presented in Table 2. Cluster 1, which comprises about 58 percent of the
sample, is above average on the use of all devices, such as cell phone and web. Cluster 2,
which comprises about 32 percent of the sample, uses only pager extensively, and has the
highest demand for remote access and a large display. This cluster is willing to pay the
least. Cluster 3, which comprises about 10 percent of the sample, places a high premium
on sharing information rapidly and is willing to pay the most.

Table 2
Cluster Means – PDA Case
Variable Overall CL1 CL2 CL3
x1 - innovator scale 3.44 4.22 2.43 2.19
x2 - freq use pager 5.20 5.31 5.63 3.19
x3 - freq use cell phone 5.62 5.95 5.43 4.31
x4 - freq use personal info tools 3.98 5.04 2.33 3.06
x5 - freq others send time-sensitive info 4.45 4.47 3.88 6.12
x6 - freq use time sensitive info 4.47 4.47 3.90 6.25
x7 - freq need remote access 4.00 3.20 5.04 5.31
x8 - importance of sharing rapidly 3.75 3.35 3.73 6.12
x9 - importance view on large display 4.79 4.34 5.55 5.00
x10 - importance access to e-mail 4.73 5.83 3.31 2.88
x11 - importance web access 4.46 5.76 3.04 1.44
x12 - importance multimedia 3.98 5.17 2.45 1.94
x13 - importance communication device 4.63 4.74 4.16 5.50
x14 - monthly price 28.70 27.70 25.30 45.30
x15 - invoice price 331.00 335.00 273.00 488.00
Proportion 0.58 0.32 0.10

While the cluster means provide the potential bases for segmentation, the descriptor variables help us to flesh out the identity of the cluster members. Because the bases for segmentation are not observed for people outside the survey used to determine the clusters, the descriptors can also be used to classify people who were not surveyed into clusters. This can be very useful for targeting, for example sending different mailings to people with different characteristics.

Unless your goal is to predict cluster membership based on descriptors, looking at variation in descriptor means by cluster is normally sufficient to reveal associations
between descriptors and cluster membership, and there is no need to go any further. The
means in Table 3 indicate that Cluster 1 is associated with higher education, income,
professional, sales; Cluster 2 is associated with being younger, sales, service, middle
income, time away from home; Cluster 3 is construction, emergency, lower income,
education, less use of tech devices. Based on knowledge of the cluster means on both the
bases and descriptors, the clusters need to be named. I will leave it to you to supply the
names.

If you do wish to predict cluster membership based on knowledge of the descriptors only,
discriminant analysis can be done. Discriminant analysis can also lead to useful
insights. Discriminant analysis can be thought of as similar to regression, except that
the dependent variable is membership in a group rather than a continuously scaled
variable like sales volume. So the dependent variable is cluster membership =
whether a person is a member of Cluster 1, Cluster 2, or Cluster 3. As in regression, the
object of discriminant analysis is to provide a set of weights for the descriptor
variables that will best explain group membership. Unlike regression, more than one
set of weights can be obtained. The first function (set of weights) is chosen to
maximize variation between groups; the second function is chosen to do the same
subject to being uncorrelated with the first function; and so on. Scores for each
respondent can be obtained on each function, and used to classify each respondent
into a particular group.
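
If you do want to carry out such a discriminant analysis outside the Marketing Engineering add-in, one option is sketched below with scikit-learn. The arrays descriptors and cluster are placeholders standing in for the 17 descriptor columns and the segment labels from the PDA data file; random data are used only so the sketch runs, so the names and numbers are assumptions, not the case results.

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
descriptors = rng.normal(size=(160, 17))      # placeholder for the z1-z17 descriptors
cluster = rng.integers(1, 4, size=160)        # placeholder for cluster membership (1, 2, 3)

lda = LinearDiscriminantAnalysis()
scores = lda.fit_transform(descriptors, cluster)   # each respondent's score on each function
print(lda.explained_variance_ratio_)               # separation captured by each function
print(lda.predict(descriptors)[:10])               # classify respondents into clusters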

I have placed the discriminant function weights (correlations between each descriptor
variable and the corresponding function) in the final two columns of Table 3. These show
the extent to which each descriptor is associated with each discriminant function, i.e., the
extent to which each descriptor separates the clusters. We see that the first function is
positively associated with education, income, owning a PDA, owning a PC, and reading Business Week, and negatively associated with time away from the office and reading Field and Stream. These are the variables that best discriminate between the clusters. It
appears that this describes a continuum from white collar in the home office, to blue
collar in the field. From looking at the cluster means it is easy to see how this function
discriminates between the clusters. The second function is harder to interpret. It is
associated with owning a PDA and with not being in maintenance and service.

Table 3
Descriptor Results for 3 Cluster Solution – PDA Case
                                        Means                     Correlations with Discriminant Functions
Variable                    Overall   CL1     CL2     CL3      Func1    Func2
z2 – education 2.506 2.774 2.196 1.938 0.558 0.257
z13 - time away from office 4.206 3.656 4.843 5.375 -0.525 -0.242
z10 - own pda 0.438 0.624 0.176 0.188 0.506 0.423
z12 - own pc 0.981 1.000 1.000 0.813 0.474 -0.399
z16 - read field & stream 0.125 0.032 0.235 0.313 -0.430 -0.217
z3 - income 66.894 72.871 60.529 52.438 0.424 0.146
z14 - read business week 0.275 0.376 0.176 0.000 0.391 0.093
z4 - construction 0.081 0.054 0.059 0.313 -0.329 0.260
z6 - sales 0.300 0.376 0.235 0.063 0.305 0.033
z5 - emergency 0.038 0.022 0.020 0.188 -0.297 0.260
z11 - own cell phone 0.875 0.903 0.902 0.625 0.289 -0.240
z8 - professional 0.162 0.194 0.157 0.000 0.202 -0.077
z17 - read modern gourmet 0.019 0.032 0.000 0.000 0.136 0.110
z7 - maintenance and service 0.181 0.075 0.392 0.125 -0.231 -0.581
z1 - age 40.006 41.409 36.765 42.188 0.058 0.272
z9 - computer 0.231 0.269 0.137 0.313 0.035 0.264
z15 - read pc magazine 0.244 0.215 0.294 0.250 -0.070 -0.115
Cluster Proportion 0.58 0.32 0.10

