
CLUSTERING: CONSTRAINT-BASED CLUSTER ANALYSIS
Constraint-based Clustering
Constraint-based clustering finds clusters that
satisfy user-specified preferences or constraints
It is desirable for the clustering process to take user
preferences and constraints into consideration
Expected number of clusters
Maximal / Minimal Cluster size
Weights for dimensions / Important dimensions
Mining becomes focused
Categories of Constraints
Constraints on Individual objects

Ex: Luxury mansions worth over a million dollars

Processed through selection


Constraints on the selection of Clustering parameters

Number of clusters, radius, MinPts

Not strictly constraint based clustering


Constraints on distance or similarity functions

Different measures for specific attributes / objects

Weighting process

Clustering with obstacle objects
User specified constraints on properties of individual
clusters

Clusters satisfy given properties


Semi-supervised clustering based on partial
supervision

Pair-wise constraints

Clustering with Obstacle Objects


City rivers, lakes, bridges, roads etc
Obstacles must be avoided
Distance function between objects must be re-defined
Straight-line distance is meaningless
When using a partitioning approach distance
calculation with obstacles becomes expensive
k-means not suitable as cluster centre may lie
on an obstacle
k-medoids can be used and distance between
objects can be determined using triangulation
Point p is visible from q in region R if straight line
between p and q does not intersect any obstacle
Visibility graph - VG
Each vertex of the obstacle has a corresponding
node
Edge between two vertices only if they are visible
to each other
Additional points can be added and paths can be
determined
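
The visibility-graph idea above can be sketched in a few lines. This is only an illustration under stated assumptions: obstacles are convex polygons given as lists of (x, y) tuples, points are in general position (no segment passes exactly through an obstacle vertex), and the function names `visible` and `obstructed_distance` are hypothetical, not taken from the source. The two query points are added to the graph as extra nodes, and the obstructed distance is the shortest path found by Dijkstra's algorithm.

```python
import heapq
import math
from itertools import combinations


def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a)."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def _properly_cross(p, q, a, b):
    """True if segments pq and ab cross at a point interior to both."""
    return (_orient(p, q, a) * _orient(p, q, b) < 0
            and _orient(a, b, p) * _orient(a, b, q) < 0)


def visible(p, q, obstacles):
    """p sees q if segment pq cuts through no obstacle (convex polygons assumed)."""
    for poly in obstacles:
        n = len(poly)
        for i in range(n):
            if _properly_cross(p, q, poly[i], poly[(i + 1) % n]):
                return False
        # A diagonal between non-adjacent vertices of a convex obstacle
        # runs through its interior, so it is not a valid line of sight.
        if p in poly and q in poly:
            gap = abs(poly.index(p) - poly.index(q))
            if gap not in (1, n - 1):
                return False
    return True


def obstructed_distance(p, q, obstacles):
    """Shortest path length from p to q over the visibility graph."""
    # Nodes: the two query points plus every obstacle vertex (deduplicated).
    nodes = list(dict.fromkeys([p, q] + [v for poly in obstacles for v in poly]))
    graph = {v: [] for v in nodes}
    for a, b in combinations(nodes, 2):
        if visible(a, b, obstacles):
            d = math.dist(a, b)
            graph[a].append((b, d))
            graph[b].append((a, d))
    # Dijkstra's algorithm from p.
    best = {p: 0.0}
    heap = [(0.0, p)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == q:
            return d
        if d > best.get(u, math.inf):
            continue
        for v, w in graph[u]:
            nd = d + w
            if nd < best.get(v, math.inf):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf  # q is unreachable


# Usage: a square obstacle blocks the straight line from (0, 0) to (3, 0).
square = [(1, -1), (2, -1), (2, 1), (1, 1)]
detour = obstructed_distance((0, 0), (3, 0), [square])
```

Building the graph from scratch for every pair of points is quadratic in the number of nodes, which is the cost that motivates the micro-clustering step described next.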
To reduce cost of distance computation points can be
grouped into micro-clusters
Triangulate a region
Group nearby points in same triangle into micro
clusters
Process micro-clusters instead of points
Shortest paths are precomputed and stored in two kinds of indices:

VV indices for pairs of obstacle vertices

MV indices for pairs of a micro-cluster and an obstacle vertex

User-Constrained Cluster Analysis


Example: Relocating package delivery centres
N customers : high-value and ordinary customers
Determine locations for k service stations
Constraints
Each station should serve
At least 100 high value customers
At least 5000 ordinary customers
Constrained Optimization problem
Direct Mathematical approach is expensive
Micro-Clustering
Initially find a partition into k groups satisfying the given
constraints
Iteratively refine solution
Move m customers from cluster Ci to Cj if Ci has
at least m surplus customers
Movement is done only if the total sum of distances (objects
to cluster centers) is reduced
Can be directed by selecting promising points
Deadlock has to be avoided (a situation in which a constraint
cannot be satisfied)
Can work on micro-clusters instead of points
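
The refinement step can be illustrated with a small sketch. It makes several simplifying assumptions that are not in the source: customers are moved one at a time (m = 1), the station locations stay fixed during refinement, and the initial assignment already satisfies the constraints. The function name `refine` and its parameters are hypothetical.

```python
import math


def refine(points, is_high, assign, centers, min_high, min_ord):
    """Greedy single-customer moves that keep every cluster feasible.

    points   : list of (x, y) customer locations
    is_high  : list of bools, True for high-value customers
    assign   : initial feasible assignment (index of the station per customer)
    centers  : list of (x, y) station locations, held fixed here
    min_high : minimum number of high-value customers per station
    min_ord  : minimum number of ordinary customers per station
    """
    assign = list(assign)  # do not modify the caller's list
    k = len(centers)
    high = [0] * k
    ordinary = [0] * k
    for i, c in enumerate(assign):
        if is_high[i]:
            high[c] += 1
        else:
            ordinary[c] += 1

    improved = True
    while improved:
        improved = False
        for i, p in enumerate(points):
            src = assign[i]
            counts, minimum = (high, min_high) if is_high[i] else (ordinary, min_ord)
            if counts[src] <= minimum:
                continue  # source cluster has no surplus of this customer type
            dst = min(range(k), key=lambda c: math.dist(p, centers[c]))
            # Move only if the total sum of distances strictly decreases.
            if math.dist(p, centers[dst]) < math.dist(p, centers[src]):
                assign[i] = dst
                counts[src] -= 1
                counts[dst] += 1
                improved = True
    return assign


# Usage with toy thresholds (the slide's example uses 100 and 5000).
centers = [(0, 0), (10, 0)]
points = [(1, 0), (9, 0), (8, 0), (7, 0), (9, 1)]
is_high = [True, True, True, False, False]
start = [0, 1, 0, 0, 1]
result = refine(points, is_high, start, centers, min_high=1, min_ord=1)
```

In this toy run the high-value customer at (8, 0) moves to the nearer station, while the ordinary customer at (7, 0) stays put because it is the only ordinary customer of its station. Each accepted move strictly lowers the total distance, so the loop terminates.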
Semi-Supervised Cluster Analysis
Constraint based Semi-supervised Clustering
Relies on user provided labels or constraints
Initialize based on labeled objects
Modify Objective function
Distance based Semi-supervised clustering
Adaptive distance measure trained to satisfy
labels or constraints
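
As a rough illustration of "initialize based on labeled objects", the sketch below computes one starting centroid per user-provided label and then assigns every object to its nearest centroid. The helper names `seeded_centroids` and `assign_nearest` are hypothetical, and using the mean of the labeled objects as the seed is an assumption about one common way to do this, not something the source specifies.

```python
import math
from collections import defaultdict


def seeded_centroids(points, labels):
    """Mean of the labeled objects per label; labels[i] is None if unlabeled."""
    groups = defaultdict(list)
    for p, lab in zip(points, labels):
        if lab is not None:
            groups[lab].append(p)
    return {
        lab: tuple(sum(coord) / len(pts) for coord in zip(*pts))
        for lab, pts in groups.items()
    }


def assign_nearest(points, centroids):
    """Assign every object, labeled or not, to the nearest seeded centroid."""
    return [
        min(centroids, key=lambda lab: math.dist(p, centroids[lab]))
        for p in points
    ]


# Usage: two labeled objects for "a", one for "b", one unlabeled.
points = [(0, 0), (2, 0), (10, 10), (6, 6)]
labels = ["a", "a", "b", None]
seeds = seeded_centroids(points, labels)
clusters = assign_nearest(points, seeds)
```

From here an ordinary iterative procedure such as k-means could continue from the seeded centroids instead of from random ones.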

CLTree (Clustering based on decision TREEs)

Integrates unsupervised clustering with supervised classification
Transforms clustering task into Classification

Points to be clustered are given class Y

Adds a set of non-existence points with class N

Non-existence points

Not added physically


For decision tree construction only the number of N
points is needed, not the actual points
At the root node, the number of inherited N
points is 0.
At any current node, E, if the number of N
points inherited from the parent node of E is less
than the number of Y points in E, then the
number of N points for E is increased to the
number of Y points in E.
Basic idea is to use an equal number of N
points to the number of Y points.
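
The rule for the number of N points at a node reduces to taking a maximum. The function name below is hypothetical; the logic follows the three statements above directly.

```python
def n_points_for_node(inherited_n, y_count):
    """Number of N points to use at a node.

    If fewer N points were inherited from the parent than there are
    Y points in the node, the count is raised to the number of Y points;
    otherwise the inherited count is kept.
    """
    return max(inherited_n, y_count)


# At the root nothing is inherited, so N is set equal to the number of Y points.
root_n = n_points_for_node(0, 50)
```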
Decision tree Splitting
Information gain
CLTree forms initial cuts and looks ahead to find
better partitions that cut less into cluster regions
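
To make the information-gain criterion concrete, the sketch below evaluates a single cut along one dimension. It assumes that the N points are spread uniformly over the interval, so only their count on each side of the cut is needed; that uniformity assumption and the names `entropy` and `cut_gain` are illustrative choices, and the look-ahead step is not implemented here.

```python
import math


def entropy(y, n):
    """Entropy in bits of a region holding y Y points and n N points."""
    total = y + n
    h = 0.0
    for count in (y, n):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h


def cut_gain(values, lo, hi, cut, n_total):
    """Information gain of splitting the interval [lo, hi] at `cut`.

    values  : coordinates of the Y points along this dimension
    n_total : number of N points in the node, assumed uniform on [lo, hi]
    """
    y_total = len(values)
    y_left = sum(v <= cut for v in values)
    y_right = y_total - y_left
    n_left = n_total * (cut - lo) / (hi - lo)  # share of N by interval length
    n_right = n_total - n_left
    total = y_total + n_total
    before = entropy(y_total, n_total)
    after = ((y_left + n_left) / total * entropy(y_left, n_left)
             + (y_right + n_right) / total * entropy(y_right, n_right))
    return before - after


# Usage: four Y points packed near 0 on the interval [0, 10], with four N points.
ys = [0.1, 0.2, 0.3, 0.4]
gain_tight = cut_gain(ys, 0, 10, 0.5, 4)
gain_loose = cut_gain(ys, 0, 10, 5.0, 4)
```

A cut placed just past the dense group separates it from the mostly empty remainder and scores a higher gain than a cut in the middle of the empty space, which is the behaviour the splitting criterion relies on.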
CLTree
Handles high dimensional space
Sub space clusters are determined
Empty regions can also be detected
