
International Journal of Emerging Technology and Advanced Engineering

Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 3, Issue 3, March 2013)

Implementation of Image Segmentation for Natural Images using Clustering Methods
Saiful Islam¹, Dr. Majidul Ahmed²
¹ Assistant Professor, Dhamdhama Anchalik College, Department of Computer Science (HoD), Assam, INDIA
² Assistant Professor, Gauhati Commerce College, Department of Information Technology (HoD), Assam, INDIA

Abstract: Natural image segmentation is one of the fundamental problems in image processing and computer vision. Image segmentation is the process of partitioning an image into multiple meaningful regions or sets of pixels with respect to a particular application. It is a critical and essential component of an image analysis system. Among the many image segmentation techniques in the literature, clustering methods are among the most important, and they were among the first techniques used for the segmentation of natural images. Clustering in image segmentation is defined as the process of identifying groups of similar image primitives. The purpose of clustering is to obtain meaningful results, effective storage and fast retrieval in various fields. There are many clustering methods for natural image segmentation in the literature. In this paper, we implement and compare three of them for the segmentation of natural images: K-Means clustering, K-Medoids clustering and Hierarchical clustering.

Keywords: Clustering Methods, Hierarchical Clustering, K-Means Clustering, K-Medoids Clustering, MATLAB, Natural Image Segmentation.

I. INTRODUCTION
A. Image and Image Segmentation
An image is a two-dimensional array or matrix of square pixels arranged in columns and rows. Each pixel represents the color or gray level at a single point in the image. The word image comes from the Latin word imago. Natural images consist of an overwhelming number of visual patterns generated by very diverse stochastic processes in nature. Natural images are particularly noisy because of the environments in which they are produced. A digital image is a numeric (binary) representation of a two-dimensional image. Any image from a scanner, from a digital camera, or in a computer is a digital image.
The segmentation of natural images has become a very important task in today's scenario. Image segmentation is a fundamental process in many image, video, and computer vision applications. Image segmentation for natural images is one of the classical problems in computer vision. The task of partitioning a natural image into regions with homogeneous texture, commonly referred to as image segmentation, is widely accepted as a crucial function for high-level image understanding, significantly reducing the complexity of content analysis of images. Image segmentation is typically used to locate objects and boundaries in images. It is one of the most difficult tasks in image processing, and it determines the quality of the final result of the analysis. There are many applications of natural image segmentation, including medical imaging, diagnosis, treatment planning, face recognition, iris recognition, fingerprint recognition and traffic control systems.

B. Clustering
The process of grouping a set of physical or abstract objects into classes of similar objects is called clustering. A cluster is a collection of data objects that are similar to one another within the same cluster and dissimilar to the objects in other clusters; the objects in a cluster share some common characteristics. Clustering is a technique used in statistical data analysis, data mining, pattern recognition, image analysis, etc.
Clustering is a classification technique. Given a vector of N measurements describing each pixel or group of pixels in an image, similarity of the measurement vectors, and therefore their clustering in the N-dimensional measurement space, implies similarity of the corresponding pixels or pixel groups. Therefore, clustering in measurement space may be an indicator of similarity of image regions and may be used for segmentation purposes. The vector of measurements describes some useful image feature and is thus also known as a feature vector. Similarity between image regions or pixels implies clustering in the feature space.
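As a small illustration of the feature-space view of clustering described in the Clustering subsection, the sketch below treats each pixel as a feature vector and uses the Euclidean distance as the dissimilarity. It is written in Python for compactness, not in the MATLAB used for the experiments, and the pixel values and the helper name `euclidean` are assumptions made up for this example.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical RGB feature vectors (N = 3 measurements per pixel).
sky_a = (100, 150, 230)
sky_b = (104, 153, 234)
grass = (40, 160, 50)

# Pixels from the same region lie close together in feature space ...
d_same = euclidean(sky_a, sky_b)
# ... while pixels from different regions lie far apart.
d_diff = euclidean(sky_a, grass)
print(d_same < d_diff)
```

Any other feature (gray level, texture measures, pixel position) could be appended to the vectors in the same way; the clustering methods only see the resulting distances.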

Fig.1. Similar data points grouped together into clusters.

II. OBJECTIVES OF THIS PAPER
The objectives of this paper are as follows:
1. To identify the multiple objectives associated with image segmentation problems for natural images.
2. To implement and compare the three most important clustering methods for natural image segmentation (K-Means clustering, K-Medoids clustering and Hierarchical clustering) and to find their advantages and disadvantages.

III. METHODOLOGY (CLUSTERING METHODS)
Clustering methods are commonly applied in image segmentation and statistics. Clustering methods can be classified into supervised clustering and unsupervised clustering. Supervised clustering demands human interaction to decide the clustering criteria, and it includes hierarchical approaches such as relevance feedback methods. Unsupervised clustering, on the other hand, decides the clustering criteria by itself, and it includes density-based clustering methods. According to the characteristics of the clustering algorithm, the methods can be roughly subdivided into partitional algorithms and hierarchical algorithms. A partitional algorithm divides a data set into a single partition, whereas a hierarchical algorithm divides a data set into a sequence of nested partitions. There are many clustering methods in the literature for natural image segmentation. Among them, we have used the three most commonly used methods: K-Means clustering, K-Medoids clustering and Hierarchical clustering.

A. K-Means Clustering
The most popular method for image segmentation is K-Means clustering. K-Means clustering is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. It is used to determine the natural spectral groupings present in a data set. The algorithm partitions the image into K clusters (C1, C2, ..., CK), represented by their centers or means. The center of each cluster is calculated as the mean of all the instances belonging to that cluster.
The main idea is to define K centroids, one for each cluster. These centroids should be placed carefully, because different initial locations lead to different results. The better option is therefore to place them as far away from each other as possible. The next step is to take each point belonging to the data set and associate it with the nearest centroid. When no point is pending, the first step is completed and an early grouping is done. At this point, K new centroids need to be recalculated as the centers of the clusters resulting from the previous step. With these K new centroids, a new binding has to be done between the same data points and the nearest new centroid. This produces a loop in which the K centroids change their location step by step until no more changes occur; in other words, the centroids do not move any more. In this way the algorithm aims at minimizing an objective function, the sum of squared distances between each point and the centroid of its cluster. K-Means is thus an iterative technique for partitioning an image into K clusters.
The K-Means clustering algorithm is composed of the following steps:
1. Place K points into the space represented by the objects that are being clustered. These points represent the initial group centroids.
2. Assign each object to the group that has the closest centroid.
3. When all objects have been assigned, recalculate the positions of the K centroids.
4. Repeat Steps 2 and 3 until the centroids no longer move.
Experiment: The K-Means clustering algorithm is implemented using MATLAB and tested with the following image.

Fig.2
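The four K-Means steps listed in this paper can be sketched as follows. This is a minimal Python illustration, not the MATLAB implementation used for the experiments; the random choice of initial centroids, the function names and the gray-level values are assumptions made for the example.

```python
import random

def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (tuples of numbers) into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Step 1: pick k distinct points as the initial centroids (assumption: random choice).
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each point to the nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Step 4: stop when the assignments (and hence the centroids) no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recompute each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster became empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels

# Hypothetical gray-level pixels forming a dark and a bright region.
pixels = [(10,), (12,), (11,), (200,), (205,), (198,)]
centroids, labels = kmeans(pixels, k=2)
print(sorted(centroids), labels)
```

For a real image, each pixel's gray level or color vector would be passed as one point, and the returned labels reshaped to the image size would give the segmentation map.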

Advantages and disadvantages of K-Means clustering:
The advantages of K-Means clustering are as follows:
- K-Means clustering works well when clusters are not well separated from each other, which is frequently encountered in images.
- The K-Means algorithm is easy to implement.
- Its time complexity is O(n), where n is the number of patterns. It is faster than hierarchical clustering.
- It works well with natural images.

The disadvantages of K-Means clustering are as follows:
- The number of clusters must be specified in advance, which creates a problem for huge image databases.
- K-Means clustering has problems when clusters are of different sizes, densities, and non-globular shapes.
- K-Means clustering has problems when the data contains outliers.
- It cannot show the clustering details as Hierarchical clustering does.

B. K-Medoids Clustering
K-Medoids clustering is a partitional algorithm closely related to K-Means, which attempts to minimize the SSE (Sum of Squared Error). The K-Means algorithm is sensitive to outliers, since an object with an extremely large value may substantially distort the distribution of the data. Instead of taking the mean value of the objects in a cluster as a reference point, a medoid can be used, which is the most centrally located object in a cluster. The partitioning can then still be performed based on the principle of minimizing the sum of the dissimilarities between each object and its corresponding reference point. This forms the K-Medoids clustering method. The algorithm is very similar to the K-Means algorithm and differs from it mainly in its representation of the clusters: each cluster is represented by its most centrally located object, rather than by the implicit mean, which may not belong to the cluster.
K-Medoids clustering can be regarded as a more general version of the K-Means algorithm. The basic strategy of K-Medoids clustering algorithms is to find K clusters in n objects by first arbitrarily choosing a representative object (medoid) for each cluster. Each remaining object is clustered with the medoid to which it is most similar. The K-Medoids method thus uses representative objects as reference points instead of taking the mean value of the objects in each cluster.
The algorithm takes the input parameter K, the number of clusters to be partitioned among a set of n objects. A typical K-Medoids algorithm for partitioning based on medoids, or central objects, is as follows:
Input:
K: The number of clusters.
D: A data set containing n objects.
Output:
A set of K clusters that minimizes the sum of the dissimilarities of all the objects to their nearest medoid.
Arbitrarily choose K objects in D as the initial representative objects.
The most common K-Medoids clustering is the Partitioning Around Medoids (PAM) algorithm, which is as follows:
1. Randomly select K of the n data points as the medoids.
2. Associate each data point with the nearest medoid.
3. For each medoid m and each data point o associated with m, swap m and o and compute the total cost of the configuration (the average dissimilarity of o to all the data points associated with m). Select the medoid o with the lowest cost of the configuration.
4. Repeat Steps 2 and 3 alternately until there is no change in the assignments.
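The PAM steps given above can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the MATLAB code used for the experiments: it tries swapping each medoid with every non-medoid point (the classic PAM swap, slightly broader than the per-cluster swap in Step 3), keeps a swap only when the total cost drops, and uses a made-up data set and a Manhattan distance chosen for the example.

```python
import random

def total_cost(points, medoids, dist):
    """Sum of dissimilarities of every point to its nearest medoid."""
    return sum(min(dist(p, m) for m in medoids) for p in points)

def pam(points, k, dist, seed=0):
    """Simplified PAM: greedy medoid/non-medoid swaps until the cost stops improving."""
    rng = random.Random(seed)
    # Step 1: randomly select k of the n data points as the initial medoids.
    medoids = rng.sample(points, k)
    best = total_cost(points, medoids, dist)
    improved = True
    while improved:
        improved = False
        # Step 3: try swapping each medoid with each non-medoid point.
        for i in range(k):
            for candidate in points:
                if candidate in medoids:
                    continue
                trial = medoids[:i] + [candidate] + medoids[i + 1:]
                cost = total_cost(points, trial, dist)
                if cost < best:  # keep the swap only if it lowers the total cost
                    medoids, best, improved = trial, cost, True
    # Step 2: associate each data point with its nearest medoid.
    labels = [min(range(k), key=lambda j: dist(p, medoids[j])) for p in points]
    return medoids, labels

def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

# Hypothetical gray levels; 255 is an outlier in the bright group.
pixels = [(10,), (12,), (11,), (200,), (205,), (255,)]
medoids, labels = pam(pixels, k=2, dist=manhattan)
print(sorted(medoids), labels)
```

Because a medoid is always one of the data points, the outlier cannot drag the cluster representative to a value that does not occur in the data, and `dist` can be any dissimilarity function.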

Experiment: This method is implemented using MATLAB and tested with the following image.

Fig.3

Advantages and disadvantages of K-Medoids clustering:
The advantages of K-Medoids clustering are as follows:
- K-Medoids clustering is robust to noise and outliers, and hence more suitable than K-Means clustering for such data.
- K-Medoids can work with any distance measure.

The disadvantages of K-Medoids clustering are as follows:
- K-Medoids clustering is computationally much costlier than K-Means clustering.
- K-Medoids is applied when dealing with categorical data.
- PAM works effectively for small data sets, but does not scale well to large data sets.

C. Hierarchical Clustering
One of the well-known methods for image segmentation is Hierarchical clustering. It is the process of grouping image data into clusters organized in the form of a tree, which is developed step by step. The concept of hierarchical clustering is to construct a dendrogram representing the nested grouping of patterns (for an image, the pixels) and the similarity levels at which groupings change. Hierarchical clustering can be divided into two kinds of algorithm, as follows:

1. Agglomerative (bottom-up) hierarchical clustering:
Each object initially represents a cluster of its own. Then clusters are successively merged until the desired cluster structure is obtained.
The hierarchical agglomerative clustering algorithm is composed of the following steps:
1. Set each pattern in the database as a cluster Ci and compute the proximity matrix, which contains the distance between each pair of patterns.
2. Use the proximity matrix to find the most similar pair of clusters and merge these two clusters into one cluster. After that, update the proximity matrix.
3. Repeat Step 2 until all patterns are in one cluster or the required similarity is achieved.

2. Divisive (top-down) hierarchical clustering:
All objects initially belong to one cluster. Then the cluster is divided into sub-clusters, which are successively divided into their own sub-clusters. This process continues until the desired cluster structure is obtained.
The hierarchical divisive clustering algorithm is composed of the following steps:
1. Start with one cluster of the whole image.
2. Find the pattern xi in cluster Ci satisfying d(xi, Ci) = max(d(y, Ci)), for y ∈ Ci, where i = 1, 2, ..., N and N is the current number of clusters in the whole image.
3. Split xi out as a new cluster Ci+N, and then compute d(y, Ci) and d(y, Ci+N), for y ∈ Ci. If d(y, Ci) > d(y, Ci+N), then split y out of Ci and merge it into Ci+N.
4. Repeat from Step 2 until none of the clusters changes any more.

Experiment: The Hierarchical clustering method is implemented using MATLAB and tested with the following image.

Fig.4

Advantages and disadvantages of Hierarchical Clustering:
The advantages of Hierarchical Clustering are as follows:
- The process and relationships of hierarchical clustering can be understood simply by checking the dendrogram.
- The result of hierarchical clustering presents high correlation with the characteristics of the original database.
- We only need to compute the distances between each pair of patterns, instead of calculating the centroids of clusters.
- Hierarchical clustering methods are more versatile.
- It is easy to handle any form of similarity or distance.
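The agglomerative steps listed above need only the pairwise distances between patterns, which the following sketch makes concrete. It is a minimal Python illustration rather than the MATLAB code used for the experiments; the single-linkage rule for the distance between clusters, the stopping criterion (a requested number of clusters), the function names and the gray-level values are all assumptions made for the example.

```python
def single_linkage(points, num_clusters, dist):
    """Agglomerative clustering: merge the two closest clusters until num_clusters remain."""
    # Step 1: every pattern starts as its own cluster (stored as lists of point indices).
    clusters = [[i] for i in range(len(points))]
    # Proximity matrix of pairwise distances between patterns.
    d = [[dist(p, q) for q in points] for p in points]

    def cluster_distance(a, b):
        # Single linkage (an assumption): distance of the closest pair of members.
        return min(d[i][j] for i in a for j in b)

    merges = []  # (cluster, cluster, distance): the information a dendrogram displays
    # Step 3: repeat the merge step until the requested number of clusters is reached.
    while len(clusters) > num_clusters:
        # Step 2: find the most similar pair of clusters and merge them.
        a, b = min(
            ((x, y) for i, x in enumerate(clusters) for y in clusters[i + 1:]),
            key=lambda pair: cluster_distance(*pair),
        )
        merges.append((a, b, cluster_distance(a, b)))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return clusters, merges

def abs_diff(u, v):
    return abs(u - v)

# Hypothetical gray levels of six pixels.
gray = [10, 12, 11, 200, 205, 198]
clusters, merges = single_linkage(gray, num_clusters=2, dist=abs_diff)
print(sorted(sorted(c) for c in clusters))
```

The `merges` list records the order and distance of every merge, so cutting it at a different level yields a different number of clusters without rerunning the algorithm. The nested search over all cluster pairs also shows why the computation time grows quickly with the number of pixels.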

The disadvantages of Hierarchical Clustering are as follows:
- Hierarchical clustering methods can never undo what was done previously; that is, there is no back-tracking capability.
- No objective function is directly minimized.
- They have difficulty handling clusters of different sizes and convex shapes.
- When hierarchical clustering works at a detailed level, the critical problem is the computation time.
- Hierarchical clustering methods do not scale up well with the number of observations.

IV. COMPARISONS OF THE VARIOUS CLUSTERING METHODS
We used different natural images to experiment with K-Means clustering, K-Medoids clustering and Hierarchical clustering. Some differences between the various clustering methods for natural image segmentation are listed below.
A. Comparisons between K-Means and K-Medoids clustering
1. K-Medoids is a generalization of K-Means clustering.
2. K-Medoids clustering is computationally more intensive than K-Means clustering.
3. K-Means can work only with numerical, quantitative variable types, but K-Medoids can work with any distance measure.
4. The K-Medoids method is more robust than the K-Means algorithm in the presence of noise and outliers, because a medoid is less influenced by outliers or other extreme values than a mean.
5. K-Medoids is computationally much costlier than K-Means clustering.
6. Unlike the K-Means clustering algorithm, K-Medoids is not sensitive to dirty images.
7. The efficiency of K-Means clustering is comparatively higher than that of K-Medoids clustering.
8. K-Means clustering is easy to implement, but K-Medoids clustering is complicated to implement.
B. Comparisons between Hierarchical and Partitional (K-Means and K-Medoids) clustering
1. Partitional clustering is faster than Hierarchical clustering.
2. Hierarchical clustering requires only a similarity measure, while Partitional clustering requires stronger assumptions such as the number of clusters and the initial centers.
3. Hierarchical clustering does not require any input parameters, while Partitional clustering algorithms require the number of clusters to start running.
4. Hierarchical clustering returns a much more meaningful and subjective division of clusters, whereas Partitional clustering results in exactly K clusters.
5. Hierarchical clustering algorithms are more suitable than Partitional clustering for categorical data, as long as a similarity measure can be defined accordingly.
6. Hierarchical clustering is a sequential partitioning process, which results in a hierarchical nested cluster structure, while Partitional clustering is an iterative partitioning process.

V. MATLAB
The name MATLAB stands for matrix laboratory. MATLAB is a software program that allows us to do data manipulation and visualization, image analysis, calculations, math and programming. It can be used for very simple as well as very sophisticated tasks. MATLAB is a high-performance language for technical computing. It is an interactive system whose basic data element is an array that does not require dimensioning. MATLAB can import and export several image formats, such as BMP (Microsoft Windows Bitmap), GIF (Graphics Interchange Format), HDF (Hierarchical Data Format), JPEG (Joint Photographic Experts Group), PCX (Paintbrush), PNG (Portable Network Graphics), TIFF (Tagged Image File Format) and XWD (X Window Dump). MATLAB can also load raw data or other types of image data.

VI. CONCLUSION
Image segmentation is the key behind image understanding. It is one of the most important steps leading to the analysis of processed image data. Natural-image segmentation is one of the classical problems in computer vision. There are many applications of image segmentation, such as medical imaging, treatment planning, face recognition, iris recognition and fingerprint recognition. There are many segmentation techniques for natural images. In this paper, we used clustering methods. Clustering methods are very popular because they are intuitive and, some of them, easy to implement. Since clustering methods are either directly applicable or easily extendable to higher-dimensional data, their application to the segmentation of colour and multispectral images is a natural choice, and it is important to know the differences between clustering methods. Here, we studied the three most commonly used clustering methods: K-Means clustering, K-Medoids clustering and Hierarchical clustering. The performance of the various clustering methods was evaluated on different images using MATLAB software.

Different clustering methods work better under different conditions. K-Means clustering has better performance and is easier to implement than the other clustering methods.

AUTHORS PROFILE
Saiful Islam, Assistant Professor, HoD, Department of Computer Science, Dhamdhama Anchalik College, Dhamdhama, Nalbari (Assam), India. He completed his MCA (Computer Science & Engineering) in 2009 at Tezpur University, Assam, India. He is pursuing a PhD at CMJ University, Meghalaya, India, in the Department of Computer Science and Applications. His research interests are Image Processing, Data Mining, and Database Management Systems. He has three years of teaching experience. He is also Co-coordinator of the Study Centre at Dhamdhama Anchalik College under IDOL (Gauhati University).


Dr. Majidul Ahmed, Assistant Professor, HoD, Department of Information Technology, Gauhati Commerce College, Guwahati, Assam (India). He completed his MCA and PhD in the Department of Computer Science at Gauhati University, Guwahati (Assam), India. He has 9 years of teaching experience. He has published various papers in national and international journals. He is supervising 8 research scholars under CMJ and other universities.
