Density – members of the cluster are grouped by
regions where observations are dense and similar.
III. HOW THE K-MEAN CLUSTERING ALGORITHM WORKS
In Phase-I, we find the initial clusters, while in
Phase-II, data elements are moved into the appropriate
clusters.
Phase-I: To find the initial clusters
Input: Array {a1, a2, a3, ..., an}, K // number of required clusters
Output: A set of initial clusters.
Steps:
1. Find the size of each cluster Si (1 <= i <= k) by Floor(n/k),
   where n = number of data points Dp (a1, a2, a3, ..., an)
   and K = number of clusters.
2. Create K arrays Ak.
3. Move data points (Dp) from the input array to Ak
   until Si = Floor(n/k).
4. Continue Step 3 until all Dp are removed from the input
   array.
5. Exit with k initial clusters.
If the distance D between Dp and M is less than or equal to
the other distances to Mi (1 <= i <= k), then Dp stays in
the same cluster. Else Dp is assigned to the corresponding
Ci with the smaller D.
For each cluster Cj (1 <= j <= k), recompute M
and move Dp until there is no change in the clusters.
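The two phases can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `initial_clusters` and `refine`, the `max_iter` cap, the use of absolute difference as the distance D on integer data, the placement of the n mod k leftover points in the last cluster, and the handling of an emptied cluster are all assumptions that the text does not specify.

```python
from typing import List


def initial_clusters(data: List[int], k: int) -> List[List[int]]:
    """Phase-I sketch: cut the input array into k clusters of floor(n/k) points."""
    n = len(data)
    if k < 1 or k > n:
        raise ValueError("k must satisfy 1 <= k <= len(data)")
    size = n // k  # Si = Floor(n/k)
    clusters = [list(data[i * size:(i + 1) * size]) for i in range(k)]
    # Assumption: leftover points (n mod k) go to the last cluster.
    clusters[-1].extend(data[k * size:])
    return clusters


def refine(clusters: List[List[int]], max_iter: int = 100) -> List[List[int]]:
    """Phase-II sketch: move points to the cluster with the nearest mean."""
    if any(len(c) == 0 for c in clusters):
        raise ValueError("every initial cluster must be non-empty")
    clusters = [list(c) for c in clusters]
    means = [sum(c) / len(c) for c in clusters]  # M for each cluster
    for _ in range(max_iter):
        moved = False
        new_clusters: List[List[int]] = [[] for _ in clusters]
        for j, cluster in enumerate(clusters):
            for p in cluster:
                nearest = min(range(len(means)), key=lambda i: abs(p - means[i]))
                # Stay put when the own-cluster distance is <= every other distance.
                if abs(p - means[j]) <= abs(p - means[nearest]):
                    nearest = j
                else:
                    moved = True
                new_clusters[nearest].append(p)
        clusters = new_clusters
        # Recompute M; assumption: an emptied cluster keeps its previous mean.
        means = [sum(c) / len(c) if c else m for c, m in zip(clusters, means)]
        if not moved:
            break
    return clusters
```

For example, `refine(initial_clusters([1, 10, 2, 11, 3, 12], 2))` starts from the Phase-I clusters `[[1, 10, 2], [11, 3, 12]]` and settles on `[[1, 2, 3], [10, 11, 12]]` after one round of moves.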
V. CONCLUSION
In our work we have combined various similarity
measures to generate an effective matching function.
Effectiveness of the matching function depends upon
all similarity measures based on weight given by
genetic algorithm. So to have an effective matching
function both semantic and syntactic aspects should
be taken into consideration while choosing similarity
measures. In basic K-mean clustering, initial clusters
are based on randomly selected centroids.
In this paper, an enhanced K-mean algorithm is
introduced and compared with the basic K-mean
algorithm. The enhanced K-mean clustering algorithm
can be used with any type of integer data. It improves
on the performance of the basic K-mean clustering
algorithm in terms of number of iterations and time
complexity.
ACKNOWLEDGMENT
I would like to express my gratitude to my guide, Mr.
Sachin Shrivastava, for his support, guidance, and
help throughout this research. The research on “An
Optimized K-Means Algorithm” was given to me as
part of the curriculum of the two-year master’s degree in
computer science & engineering. I have tried my best
to present this information as clearly as possible using
basic terms. I would be failing in my duty if I did not
acknowledge the esteemed scholarly guidance, assistance,
and knowledge.