
Mineração de Dados Aplicada

Clustering

Loïc Cerf
September 1st, 2018
DCC – ICEx – UFMG
Example of an applicative problem

Student profiles
Given the marks students received for different courses, group the students so that two students in the same group received about the same marks for each course and two students in different groups have different profiles.

Outline

1 Definition

2 Classical Algorithms

3 Assessing a Clustering

4 Case study

5 Clustering in KNIME

Definition

An optimization problem

Definition
Partitioning the objects into clusters so that each cluster contains similar objects and objects in different clusters are dissimilar; equivalently, partitioning the objects so that the intra-cluster similarities are maximized and the inter-cluster similarities are minimized.

Input:

     a1    a2    ...  an
o1   d1,1  d1,2  ...  d1,n
o2   d2,1  d2,2  ...  d2,n
...  ...   ...   ...  ...
om   dm,1  dm,2  ...  dm,n

Output:

     a1    a2    ...  an    cluster
o1   d1,1  d1,2  ...  d1,n  c1
o2   d2,1  d2,2  ...  d2,n  c2
...  ...   ...   ...  ...   ...
om   dm,1  dm,2  ...  dm,n  cm

Illustration
Clustering objects, described with two interval-scaled attributes,
using the Euclidean distance.

      x    y
o1    91   70
o2    129  91
o3    359  243
o4    322  254
o5    100  104
o6    464  113
o7    342  297
o8    410  65
o9    334  329
...   ...  ...

Inductive database formalism

Querying patterns:

{X ∈ P | Q(X, D)}
where:
D is the dataset,
P is the pattern space,
Q is an inductive query.

Querying a clustering:

{X ∈ P | Q(X, D)}
where:
D is a set of objects O associated with a similarity measure,
P is {(C1, …, Ck) ∈ (2^O)^k | ∀ℓ ∈ {1, …, k}, Cℓ ≠ ∅; ∀m ≠ ℓ, Cℓ ∩ Cm = ∅; ∪ℓ=1..k Cℓ = O},
Q is a function to optimize: it quantifies how similar the pairs of objects in the same cluster are and/or how dissimilar those in two different clusters are.

Variants exist, e. g., allowing some overlap between the clusters.


Inexactness

Every object influences the clustering, and the number of ways to partition |O| objects into k ∈ ℕ clusters is huge: O(k^|O| / k!).

That is why clustering is usually solved in an approximate way.

Domain decomposition consists of using a cheap clustering method to get coarse clusters that other clustering algorithms can independently process in a second step.

Classical Algorithms

Hierarchical agglomeration: illustration


(Agglomerative) hierarchical clustering of objects, described with
two interval-scaled attributes, using the Euclidean distance.

(Same example objects as in the illustration above.)

Dendrogram

(Figure: the dendrogram of the agglomeration; the vertical axis gives the distance at which the clusters are merged.)

Hierarchical agglomeration: algorithm

A greedy algorithm:
1 Initialize the clusters with the individual objects;
2 At each iteration, agglomerate the two closest clusters;
3 Stop when the desired number of clusters is reached.

No guarantee whatsoever on the optimality of the solution;
Choice of a similarity between clusters;
Quadratic or cubic time complexity in the number of objects;
The process should stop before agglomerating dissimilar clusters (no need to guess the number of clusters beforehand) and the clusters are hierarchically organized;
Outlier detection (cluster containing one single object).

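For concreteness, a minimal sketch of this greedy scheme with SciPy (an assumption of these notes, not part of the original slides; it reuses the toy x/y objects of the illustration):

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# The toy objects of the illustration (two interval-scaled attributes).
X = np.array([[91, 70], [129, 91], [359, 243], [322, 254], [100, 104],
              [464, 113], [342, 297], [410, 65], [334, 329]])

# Greedily agglomerate the two closest clusters under the Euclidean distance.
Z = linkage(X, method="complete", metric="euclidean")

# Cut the dendrogram at the desired number of clusters, here k = 3.
labels = fcluster(Z, t=3, criterion="maxclust")

The method argument ("single", "complete", "average", ...) selects one of the linkage criteria detailed next.
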
Linkage criteria

The similarity between two clusters can be defined as:
Complete linkage the worst similarity between any pair of objects taken from the two clusters;
Single linkage the best similarity between any pair of objects taken from the two clusters;
Group average linkage the average similarity over all pairs of objects taken from the two clusters;
to provide:
Complete linkage spherical clusters of approximately equal diameters (clustering computed in O(|O|²) time);
Single linkage “chains” of similar objects (clustering computed in O(|O|²) time);
Group average linkage the most natural linkage (clustering computed in O(|O|³) time).

Divisive hierarchical clustering

All objects are initially in one single cluster. Every cluster is recursively split into two until every object is alone in a cluster.

Considering all possible splits to find the best one takes exponential time. That is why a split is usually found in an approximate way, e. g., using 2-means.

k-means: illustration
3-means clustering of objects, described with two interval-scaled
attributes, using the Euclidean distance.

(Same example objects as in the illustration above.)

k-means: algorithm

Seeking the centers of k clusters by expectation-maximization:
1 Randomly choose k centers µ1, …, µk in the object space;
2 Until convergence or a specified maximal number of iterations:
E Assign each object to the cluster Cℓ with the closest center µℓ;
M Update the center µℓ of each cluster to the mean of the objects assigned to it.

The number of clusters must be guessed beforehand;
Convergence to a local minimum of ∑ℓ=1..k ∑o∈Cℓ ‖o − µℓ‖²;
Spherical clusters of approximately equal diameters;
Sensitive to outliers, which should be removed beforehand (k-medoids uses the Manhattan distance and the median);
Linear time complexity in the number of objects, attributes, clusters and iterations (small in practice).

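A hedged scikit-learn counterpart (an assumption, not the KNIME node; KMeans defaults to the smarter k-means++ initialization rather than the purely random centers of step 1):

import numpy as np
from sklearn.cluster import KMeans

X = np.array([[91, 70], [129, 91], [359, 243], [322, 254], [100, 104],
              [464, 113], [342, 297], [410, 65], [334, 329]])

km = KMeans(n_clusters=3, n_init=10, max_iter=300, random_state=0).fit(X)
print(km.labels_)           # cluster of each object
print(km.cluster_centers_)  # the centers µ1, ..., µk
print(km.inertia_)          # the locally minimized sum of squared distances
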
The elbow method

Plot, as a function of k, a measure of the quality of the clustering, e. g., ∑ℓ=1..k ∑o∈Cℓ ‖o − µℓ‖², which k-means locally minimizes. Choose the k just after a large drop.

More principled methods exist, e. g., finding the best trade-off between quality and compression.

No such method is implemented in KNIME. If the time complexity of a hierarchical agglomeration (using a complete linkage for similarly-shaped clusters) is not prohibitive, the number of clusters can be chosen from the dendrogram. Outliers can be identified (and removed) in this way too.

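A minimal elbow-plot sketch under the same assumptions, reusing the X of the k-means sketch above (inertia_ is the within-cluster sum of squares that k-means locally minimizes):

import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

ks = range(1, 10)
inertias = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
            for k in ks]
plt.plot(ks, inertias, marker="o")
plt.xlabel("k")
plt.ylabel("within-cluster sum of squares")
plt.show()  # choose the k just after the large drop (the "elbow")
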
Tendency to produce equi-sized clusters

(Figure: three panels showing the dataset, the k-means clustering and the EM clustering of the same data.)

EM

The dataset D is seen as a random sample from an |A|-dimensional random variable O (i. e., independent and identically distributed) whose probability density function is given as a mixture model of the k clusters.

EM searches, by expectation-maximization, for a parametrization of the model that locally maximizes the likelihood that D is indeed a random sample of O.

The distribution of a cluster is usually assumed multivariate normal, thus parametrized with a location (the center of the cluster) and a covariance matrix.

EM with |A| = 1 and k = 2: illustration

EM clustering of objects in a one-dimensional space.

(Figures: the dataset, then the fitted mixture of two Gaussians after iterations 1 and 5.)

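A hedged sketch of such an EM clustering with scikit-learn's GaussianMixture (an assumption; as noted later, EM is absent from KNIME, and the two-Gaussian sample below merely mimics the illustration):

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# A one-dimensional sample drawn from a mixture of two Gaussians.
X1 = np.concatenate([rng.normal(0, 1, 200), rng.normal(5, 2, 300)]).reshape(-1, 1)

gm = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(X1)
print(gm.means_, gm.covariances_)  # location and covariance of each cluster
print(gm.predict_proba(X1[:3]))    # soft memberships P(cluster | object)

Setting covariance_type="diag" would fix the covariances to 0, as discussed below.
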
k-means specializes EM

k-means specializes EM:
EM the probability of a cluster given an object weights the contribution of the object to the cluster;
k-means that probability is 1 for the cluster with the closest center, 0 for the other clusters.

Like k-means, EM:
requires the number of clusters to be guessed beforehand;
converges to a local optimum of the objective function (the likelihood, i. e., the probability of the data given the mixture model);
is sensitive to outliers, which should be removed beforehand.

k-means vs. EM

k-means produces spherical clusters of approximately equal diameters, whereas EM produces ellipsoidal clusters of any size.

k-means is faster than EM: computing the probabilities of every cluster given an object takes O(k|A|²) time for a Gaussian mixture (O(k|A|) for k-means); the convergence is slower because EM must learn the O(k|A|²) real parameters of a Gaussian mixture (O(k|A|) for k-means).

The covariances can be fixed to 0 (diagonal covariance matrices) so that EM only learns and uses 2k|A| real parameters, hence a reduced running time... and quality.

EM with full vs. diagonal covariance matrices

(Figures: the clusters found with full covariance matrices, then with diagonal ones.)

Fuzzy c-means

Fuzzy c-means is k-means with a fuzzy (rather than crisp) membership of every object to every cluster: every object is associated with k normalized weights. The weights increase with the similarity between the object and the center of the cluster. A hyper-parameter controls how fast they increase. The maximization step becomes the computation of a weighted mean.

Like EM, fuzzy c-means associates every object with degrees of membership to every cluster. Besides that, it has the same advantages and drawbacks as k-means.

EM is not included in KNIME. Fuzzy c-means is.

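A minimal NumPy sketch of those updates (an assumption of these notes, not KNIME's "Fuzzy c-Means" node; m > 1 is the hyper-parameter controlling how fast the weights increase):

import numpy as np

def fuzzy_c_means(X, c=3, m=2.0, iterations=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=c, replace=False)].astype(float)
    for _ in range(iterations):
        # Distance of every object to every center (one row per object).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        # Normalized weights, increasing with the similarity to the center.
        u = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0)), axis=2)
        # Maximization step: every center becomes a weighted mean of the objects.
        w = u ** m
        centers = (w.T @ X) / w.sum(axis=0)[:, None]
    return u, centers
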
Non-convex clusters

(Figures: the same non-convex dataset clustered by kernel 3-means and by plain 3-means.)

Kernel k-means and spectral clustering

Problem
k-means, EM and fuzzy c-means only find convex clusters.

Ideas
Kernel methods Using a nonlinear function, the objects are mapped to a higher-dimensional space where the clusters hopefully become convex;
Spectral methods Mapping the objects to a space whose basis is the eigenvectors of an affinity matrix.

Both families of methods are related.

Kernel k-means

Given, conceptually, a nonlinear mapping Φ of the objects to a higher-dimensional space, kernel k-means is k-means in this space.

Knowing the scalar product κ(o, µℓ) = Φ(o) · Φ(µℓ) is enough to compute ‖Φ(o) − Φ(µℓ)‖², i. e., Φ need not be applied. κ is a continuous, symmetric and positive semi-definite function:
Polynomial kernel κ(o, µℓ) = (o · µℓ + a)^b;
Gaussian kernel κ(o, µℓ) = e^(−‖o − µℓ‖² / (2σ²));
Sigmoid kernel κ(o, µℓ) = tanh(a (o · µℓ) + θ);
...

Normalized cut (a spectral clustering)

Definition
Removing from the (non-negative and symmetric) similarity graph the edges with a small total weight so that k “reasonably large” connected components are obtained.

Formally, approximately compute the partitioning (C1, …, Ck) ∈ (2^O)^k that minimizes ∑ℓ=1..k (∑oi∈Cℓ, oj∈O∖Cℓ s(oi, oj)) / (∑oi∈Cℓ, oj∈O s(oi, oj)).

Method
Extract the k smallest eigenvectors of an affinity matrix, e. g., the normalized Laplacian of the similarity matrix. Cluster (e. g., with k-means) the objects rewritten w.r.t. these k attributes.

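A hedged sketch with scikit-learn's SpectralClustering, which solves a normalized-cut-style relaxation (the RBF affinity and its gamma are assumptions):

from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_moons

# Two non-convex "moon" clusters that plain k-means cannot separate.
X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

sc = SpectralClustering(n_clusters=2, affinity="rbf", gamma=20.0,
                        assign_labels="kmeans", random_state=0)
labels = sc.fit_predict(X)  # eigenvectors of the affinity, then k-means
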
Assignment of new objects to clusters

(Kernel) k-means, EM and fuzzy c-means explicitly model every cluster with a center (and a covariance matrix for EM).

As a consequence, a new object can be assigned to the most probable cluster. In KNIME, “Cluster Assigner” does so.

The completed clustering is not guaranteed to be a local extremum of the objective function.

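With scikit-learn, the analogous completion can reuse the model fitted in the k-means sketch above (a usage note under that assumption, not the KNIME node):

import numpy as np

# km is the KMeans model fitted earlier.
new_objects = np.array([[120, 80], [350, 260]])
print(km.predict(new_objects))  # index of the closest learned center
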
DBSCAN: illustration
DBSCAN clustering of objects, described with two interval-scaled
attributes, using the Euclidean distance.

(Same example objects as in the illustration above.)

DBSCAN: algorithm

A density-based algorithm:
1 At each iteration, choose an unlabeled object;
2 List the sufficiently similar objects;
3 If there are too few of them, label the object as an outlier;
4 Otherwise cluster these objects as well as those listed by the same recursive process applied to the newly clustered objects.

One single user-defined density for all clusters (OPTICS addresses this problem);
Choice of a similarity (shape of the clusters);
O(|O| log |O|) average time complexity using an appropriate index structure (O(|O|²) worst case);
Outlier detection;
Single linkage.

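A hedged scikit-learn sketch (an assumption, not KNIME's DBSCAN node; eps and min_samples together encode the single user-defined density):

import numpy as np
from sklearn.cluster import DBSCAN

X = np.array([[91, 70], [129, 91], [359, 243], [322, 254], [100, 104],
              [464, 113], [342, 297], [410, 65], [334, 329]])

# eps bounds "sufficiently similar"; min_samples is the "too few of them" threshold.
db = DBSCAN(eps=60.0, min_samples=3, metric="euclidean").fit(X)
print(db.labels_)  # -1 marks the outliers
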
Configuration

Configuring data mining algorithms is hard. It often relies on sampling in the hyper-parameter space and keeping the best output. Metaheuristics can be used too.

Nevertheless, understanding the various algorithms and the effect of their hyper-parameters helps.

Assessing a Clustering

An unsupervised task

Clustering is an unsupervised task: it is about discovering a hidden organization of the objects.

As a consequence, there are only intrinsic measures of the quality of a clustering.

Intra- and inter-cluster similarities

Given one cluster, the intra-cluster similarity can be defined as:
the minimal similarity between two objects in the cluster;
the average similarity between two objects in the cluster;
the average similarity to the center of the cluster.

Given two clusters, the inter-cluster similarity can be defined as:
the maximal similarity between objects in the two clusters;
the average similarity between objects in the two clusters;
the similarity between the centers of the two clusters.

Internal evaluation

BetaCV the ratio of the average intra-cluster similarity to the average inter-cluster similarity;
Dunn the ratio of the minimal intra-cluster similarity to the maximal inter-cluster similarity;
Davies-Bouldin the similarity between each cluster and its most similar one (the minimal ratio of the sum of the two intra-cluster similarities to their inter-cluster similarity), averaged over all the clusters;
Silhouette for each object, the difference between the average similarity to the objects in the same cluster and the greatest average similarity to the objects in another cluster, divided by the greater of the two terms.

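Two of these measures are directly available in scikit-learn (a sketch under the same toy-data assumption; the silhouette lies in [-1, 1], the higher the better, while a lower Davies-Bouldin score is better):

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score, silhouette_score

X = np.array([[91, 70], [129, 91], [359, 243], [322, 254], [100, 104],
              [464, 113], [342, 297], [410, 65], [334, 329]])
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

print(silhouette_score(X, labels))
print(davies_bouldin_score(X, labels))
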
Comparing clusterings

A quality measure is not meaningful unless compared to that of another clustering:
of the same dataset, to select the best clustering;
of a randomized version of the dataset, to gauge the tendency of the objects to be clustered.

Randomization of a dataset

Uniform distribution between the extrema of each attribute, or normal distribution parametrized from the dataset:

(Figures: two randomized versions of the dataset.)

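A minimal NumPy sketch of both randomizations, applied attribute per attribute (an assumption of these notes):

import numpy as np

rng = np.random.default_rng(0)
X = np.array([[91.0, 70.0], [129.0, 91.0], [359.0, 243.0], [322.0, 254.0]])

# Uniform distribution between the extrema of each attribute.
uniform = rng.uniform(X.min(axis=0), X.max(axis=0), size=X.shape)
# Normal distribution parametrized from the dataset.
normal = rng.normal(X.mean(axis=0), X.std(axis=0), size=X.shape)
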
Stability of a clustering

If a clustering method involves randomness (e. g., k-means), the stability of its output over several runs is an indicator of its quality.

The clustering to keep is the one with the best quality.

In KNIME, the k first objects are the initial centers of k-means or fuzzy c-means. The “Shuffle” node can be used upstream for an actual random initialization.

Similarity between two partitions

A correlation between nominal attributes (the two partitions) measures their similarity. The entropy (“Entropy scorer” in KNIME) is such a measure.

The Fowlkes-Mallows index, the Rand index and the adjusted Rand index (all absent from KNIME) are alternatives. They are all based on the number of pairs of objects that are in the same/different cluster(s) in one clustering and in the same/different cluster(s) in the other clustering.

Those same measures can help to interpret a clustering, by correlating it with an external nominal attribute that was not used to cluster.

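A hedged sketch of those pair-counting indices with scikit-learn (absent from KNIME, as noted above):

from sklearn.metrics import adjusted_rand_score, fowlkes_mallows_score

partition_a = [0, 0, 1, 1, 2, 2]
partition_b = [1, 1, 0, 0, 2, 2]  # the same grouping under renamed labels

print(adjusted_rand_score(partition_a, partition_b))    # 1.0: identical partitions
print(fowlkes_mallows_score(partition_a, partition_b))  # 1.0 as well
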
Case study

Clustering in KNIME

Practice

1 After figuring out an appropriate k with a hierarchical clustering, cluster, with k-means, the bears in bears.csv according to their attributes Headlen, Headwth and Chest.
2 Test the stability of k-means' clustering.
3 Does the sex partly explain the clustering? The age?

License

© 2012–2018 Loïc Cerf
These slides are licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.
