DnI Institute
Build Data and Decision Science Experience
10/12/2016
Business Scenario: We have height and weight information for a set of observations. Using
these two variables, we need to group the observations into clusters.
If you plot height against weight, you can expect to see two visible clusters/segments, and we
want these to be identified using the K Means algorithm.
Data Sample

Height  Weight
185     72
170     56
168     60
179     68
182     72
188     77
180     71
180     70
183     84
180     88
180     67
177     76
Step 1: Input
Dataset, Clustering Variables and Maximum Number of Clusters (K in K Means Clustering)
In this dataset, only two variables, height and weight, are considered for clustering.
Height  Weight
185     72
170     56
168     60
179     68
182     72
188     77
180     71
180     70
183     84
180     88
180     67
177     76

http://dni-institute.in/blogs/k-means-clustering-algorithm-explained/
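Step 2: Initialize Cluster Centroids
K is 2 in this example, and the first two observations serve as the initial centroids.

Cluster  Height  Weight
K1       185     72
K2       170     56

Step 3: Calculate Euclidean Distance
Start with the first two observations:

Height  Weight
185     72
170     56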
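The initialization used in this walkthrough, taking the first K observations as the starting centroids, can be sketched as follows. The helper name `initial_centroids` is hypothetical and only for illustration; other initialization schemes exist and are not covered here.

```python
# Sketch: use the first k observations as the starting centroids.
# `initial_centroids` is an illustrative name, not from the article.
data = [(185, 72), (170, 56), (168, 60), (179, 68)]  # first rows of the sample


def initial_centroids(points, k):
    """Return the first k points as starting centroids (as floats)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    return [tuple(float(v) for v in p) for p in points[:k]]


centroids = initial_centroids(data, k=2)
print(centroids)  # [(185.0, 72.0), (170.0, 56.0)]
```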
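Centroid

Cluster  Height  Weight
K1       185     72
K2       170     56

Observation (185, 72)
Distance from Cluster 1: $\sqrt{(185-185)^2 + (72-72)^2} = 0$
Distance from Cluster 2: $\sqrt{(185-170)^2 + (72-56)^2} = 21.93$
Assignment: Cluster 1

Observation (170, 56)
Distance from Cluster 1: $\sqrt{(170-185)^2 + (56-72)^2} = 21.93$
Distance from Cluster 2: $\sqrt{(170-170)^2 + (56-56)^2} = 0$
Assignment: Cluster 2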
Only these two observations are assigned at this stage because their assignment is already
known: each one is the initial centroid of its own cluster. For the same reason, the
centroids do not change.
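The distances for the first two observations can be checked with a few lines of Python. This is a minimal sketch; the function name `euclidean` is an assumption chosen for illustration.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


k1, k2 = (185, 72), (170, 56)  # initial centroids

for obs in [(185, 72), (170, 56)]:
    d1, d2 = euclidean(obs, k1), euclidean(obs, k2)
    print(obs, round(d1, 2), round(d2, 2))
# (185, 72) 0.0 21.93
# (170, 56) 21.93 0.0
```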
Step 4: Move on to the next observation and calculate Euclidean Distance

Height  Weight
168     60

Distance from Cluster 1: $\sqrt{(168-185)^2 + (60-72)^2} = 20.809$
Distance from Cluster 2: $\sqrt{(168-170)^2 + (60-56)^2} = 4.472$
Since the distance to cluster 2 is the smaller of the two, the observation is assigned to
cluster 2. Now recompute the cluster centroid as the mean Height and Weight of the cluster's
members. Only cluster 2 gained a member, so only the centroid of cluster 2 is updated.
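The assignment rule for the third observation can be sketched as below, assuming Python 3.8 or later for `math.dist`. The dictionary layout is an illustrative choice, not something the article prescribes.

```python
import math

# Centroids before the third observation is processed.
centroids = {1: (185, 72), 2: (170, 56)}
obs = (168, 60)

# Distance from the observation to every centroid.
distances = {c: math.dist(obs, centre) for c, centre in centroids.items()}

# Assign to the cluster whose centroid is closest.
nearest = min(distances, key=distances.get)

print({c: round(d, 3) for c, d in distances.items()})  # {1: 20.809, 2: 4.472}
print(nearest)  # 2
```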
Updated cluster centroids

Cluster  Height               Weight
K=1      185                  72
K=2      (170+168)/2 = 169    (56+60)/2 = 58
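The centroid update is just a per-variable mean over the cluster's members. The sketch below shows it in two equivalent forms: recomputing from all members, and adjusting the old centroid incrementally. Both function names are hypothetical.

```python
def centroid(members):
    """Mean of each variable over the cluster's members."""
    n = len(members)
    return tuple(sum(col) / n for col in zip(*members))


def add_to_centroid(old, n_old, new_point):
    """Running-mean update: old centroid of n_old points plus one new point."""
    n = n_old + 1
    return tuple(c + (x - c) / n for c, x in zip(old, new_point))


cluster_2 = [(170, 56), (168, 60)]
print(centroid(cluster_2))                      # (169.0, 58.0)
print(add_to_centroid((170, 56), 1, (168, 60)))  # (169.0, 58.0)
```

The incremental form avoids storing every member, which matters only for much larger datasets than this one.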
Step 5: Calculate the Euclidean Distance for the next observation, assign it to the cluster at
minimum Euclidean distance, and update the cluster centroids.
Next observation:
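Height  Weight
179     68

Euclidean Distance

Distance from Cluster 1  Distance from Cluster 2  Assignment
7.211103                 14.14214                 Cluster 1

Updated Centroid

Cluster  Height               Weight
K=1      (185+179)/2 = 182    (72+68)/2 = 70
K=2      169                  58

Once the following observation (182, 72) also joins cluster 1, the centroid of cluster 1
becomes (182, 70.6667).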
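Step 5 combines the distance calculation, the assignment and the centroid update for one observation. A minimal sketch, assuming Python 3.8 or later for `math.dist` and using illustrative names:

```python
import math


def centroid(members):
    """Mean of each variable over the cluster's members."""
    n = len(members)
    return tuple(sum(col) / n for col in zip(*members))


# Cluster membership after Step 4.
clusters = {1: [(185, 72)], 2: [(170, 56), (168, 60)]}
obs = (179, 68)

distances = {c: math.dist(obs, centroid(m)) for c, m in clusters.items()}
nearest = min(distances, key=distances.get)
clusters[nearest].append(obs)  # assign, then recompute that centroid

print(round(distances[1], 6), round(distances[2], 5))  # 7.211103 14.14214
print(nearest, centroid(clusters[nearest]))            # 1 (182.0, 70.0)
```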
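Cluster Centroids

Repeating Step 5 up to the seventh observation (180, 71) gives the centroids below. At that
point cluster 1 holds (185, 72), (179, 68), (182, 72), (188, 77) and (180, 71).

Cluster  Height  Weight
K=1      182.8   72
K=2      169     58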
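The whole walkthrough, run over all 12 observations in the order listed, can be sketched as a single pass. This follows the one-observation-at-a-time procedure described in the steps above; the commonly used batch form of K Means instead reassigns every observation and recomputes centroids repeatedly until assignments stop changing. The function names are hypothetical.

```python
import math

data = [(185, 72), (170, 56), (168, 60), (179, 68), (182, 72), (188, 77),
        (180, 71), (180, 70), (183, 84), (180, 88), (180, 67), (177, 76)]


def centroid(members):
    """Mean of each variable over the cluster's members."""
    n = len(members)
    return tuple(sum(col) / n for col in zip(*members))


def sequential_kmeans(points, k):
    """One pass: seed with the first k points, then assign the rest in order."""
    clusters = [[p] for p in points[:k]]
    for p in points[k:]:
        centroids = [centroid(c) for c in clusters]
        nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)  # centroid is recomputed on the next loop
    return clusters


clusters = sequential_kmeans(data, k=2)
for i, members in enumerate(clusters, start=1):
    print(i, len(members), centroid(members))
# 1 10 (181.4, 74.5)
# 2 2 (169.0, 58.0)
```

With all 12 observations processed, cluster 1 ends with 10 members and cluster 2 keeps the two shortest, lightest observations.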
Vishal Nigam
September 25, 2016 at 1:57 pm | Reply
DnI Institute
September 25, 2016 at 2:21 pm | Reply
Thanks Vishal
Nitesh
October 8, 2016 at 5:55 am | Reply
Very good..example..
but there is a text mistake in step 4.. euclidean distance from cluster 2
DnI Institute
October 8, 2016 at 6:13 am | Reply