Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 3, Issue 3, March 2013)
Assistant Professor, Dhamdhama Anchalik College, Department of Computer Science (HoD), Assam, INDIA
Assistant Professor, Gauhati Commerce College, Department of Information Technology (HoD), Assam, INDIA
B. Clustering
Clustering is the process of grouping a set of physical or
abstract objects into classes of similar objects. A cluster is
a collection of data objects that are similar to one another
within the same cluster and dissimilar to the objects in
other clusters; in other words, the objects in a cluster share
some common characteristics. Clustering is used in
statistical data analysis, data mining, pattern recognition,
image analysis, etc.
Clustering is an unsupervised classification technique.
Given a vector of N measurements describing each pixel or
group of pixels in an image, similarity of the measurement
vectors, and therefore their clustering in the N-dimensional
measurement space, implies similarity of the corresponding
pixels or pixel groups. Clustering in measurement space may
therefore indicate similarity of image regions and may be
used for segmentation. The vector of measurements
describes some useful image feature and is thus also known
as a feature vector. Conversely, similarity between image
regions or pixels implies clustering in the feature space.
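As a minimal sketch of how clustering in feature space yields a segmentation, the Python fragment below labels each pixel with the index of its nearest cluster centre. The tiny RGB image, the two centres and the function names are hypothetical illustrations, not data or code from this paper; Euclidean distance is assumed as the dissimilarity measure.

```python
import math

def nearest_centre(feature, centres):
    """Return the index of the centre closest (Euclidean) to one feature vector."""
    distances = [math.dist(feature, c) for c in centres]
    return distances.index(min(distances))

def label_pixels(image, centres):
    """Map every pixel's feature vector to a cluster label, keeping the image layout."""
    return [[nearest_centre(pixel, centres) for pixel in row] for row in image]

# Hypothetical 2 x 3 RGB image: each pixel is an N = 3 feature vector (R, G, B).
image = [
    [(250, 10, 10), (240, 20, 15), (10, 10, 245)],
    [(245, 5, 20), (15, 20, 250), (5, 15, 240)],
]
# Hypothetical cluster centres in feature space (reddish, bluish).
centres = [(255, 0, 0), (0, 0, 255)]

labels = label_pixels(image, centres)
print(labels)  # [[0, 0, 1], [0, 1, 1]]
```

The label map has the same layout as the image, so pixels that are close in feature space end up in the same segment.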
I. INTRODUCTION
A. Image and Image Segmentation
An image is a two-dimensional array, or matrix, of square
pixels arranged in columns and rows. Each pixel represents
the color or gray level at a single point in the image. The
word image comes from the Latin word imago.
Natural images consist of an overwhelming number of
visual patterns generated by very diverse stochastic
processes in nature. Natural images are particularly noisy
because of the environments in which they are produced. A
digital image is a numeric (normally binary) representation
of a two-dimensional image. Any image from a scanner,
from a digital camera, or in a computer is a digital image.
The segmentation of natural images has become a very
important task today. Image segmentation is a fundamental
process in many image, video, and computer vision
applications.
Fig.2
Input:
K: The number of clusters.
D: A data set containing n objects.
Output:
A set of K clusters that minimizes the sum of the
dissimilarities of all the objects to their nearest medoid.
Arbitrarily choose K objects in D as the initial
representative objects.
The most common realization of K-Medoids clustering is
the Partitioning Around Medoids (PAM) algorithm, which
proceeds as follows:
1. Randomly select K of the n data points as the
medoids.
2. Associate each data point to the nearest medoid.
3. For each medoid m and each data point o
associated with m, swap m and o and compute the
total cost of the configuration (that is, the average
dissimilarity of o to all the data points associated
with m). Select as medoid the point o that gives
the lowest-cost configuration.
4. Repeat steps 2 and 3 alternately until there is no
change in the assignments.
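The steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implementation used in this paper: the cost of a configuration is taken to be the sum of the dissimilarities of all objects to their nearest medoid (as in the Output statement), every non-medoid point is tried as a swap candidate, and the function names, the seed parameter and the sample gray levels are hypothetical. Points are assumed to be distinct and hashable.

```python
import random

def total_cost(points, medoids, dist):
    """Sum of dissimilarities of every point to its nearest medoid."""
    return sum(min(dist(p, m) for m in medoids) for p in points)

def pam(points, k, dist, seed=0):
    """Simplified PAM: keep swapping while a swap lowers the total cost."""
    rng = random.Random(seed)
    medoids = rng.sample(points, k)            # step 1: random initial medoids
    best = total_cost(points, medoids, dist)
    improved = True
    while improved:                            # step 4: repeat until no change
        improved = False
        for i in range(k):
            for candidate in points:
                if candidate in medoids:
                    continue
                # step 3: swap medoid i with the candidate and re-cost
                trial = medoids[:i] + [candidate] + medoids[i + 1:]
                cost = total_cost(points, trial, dist)
                if cost < best:
                    medoids, best, improved = trial, cost, True
    # step 2: associate each point with its nearest medoid
    clusters = {m: [] for m in medoids}
    for p in points:
        clusters[min(medoids, key=lambda m: dist(p, m))].append(p)
    return medoids, clusters, best

# Hypothetical gray levels forming a dark group and a bright group.
gray_levels = [10, 11, 12, 13, 14, 200, 202, 205]
medoids, clusters, best = pam(gray_levels, 2, lambda a, b: abs(a - b))
print(sorted(medoids), best)
```

Because a swap is accepted only when it strictly lowers the cost, the loop always terminates; in general it stops at a local optimum that can depend on the initial medoids.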
B. K-Medoids Clustering
K-Medoids clustering is a partitional algorithm that, like
K-Means, attempts to minimize a total within-cluster error:
K-Means minimizes the SSE (Sum of Squared Error), whereas
K-Medoids minimizes the sum of the dissimilarities between
each object and its medoid. The K-Means algorithm is
sensitive to outliers, since an object with an extremely
large value may substantially distort the distribution of
data. Instead of taking the mean value of the objects in a
cluster as a reference point, a medoid can be used, which is
the most centrally located object in a cluster. The
partitioning can then still be performed on the principle of
minimizing the sum of the dissimilarities between each
object and its corresponding reference point. This forms the
K-Medoids clustering method. The algorithm is very similar
to K-Means and differs from it mainly in how the clusters
are represented: each cluster is represented by its most
centrally located object, rather than by an implicit mean
that may not belong to the cluster.
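The difference between the two reference points can be illustrated with a small Python sketch. The gray levels and the outlier value are hypothetical, and absolute difference is assumed as the dissimilarity measure.

```python
def mean(values):
    """Arithmetic mean; it need not be a member of the cluster."""
    return sum(values) / len(values)

def medoid(values):
    """The member with the smallest total distance to all other members.

    On ties, min() returns the first such member.
    """
    return min(values, key=lambda c: sum(abs(c - v) for v in values))

cluster = [10, 11, 12, 13, 14]          # hypothetical gray levels
with_outlier = cluster + [250]          # one extremely large value added

print(mean(cluster), medoid(cluster))
print(mean(with_outlier), medoid(with_outlier))
```

With the outlier added, the mean moves well outside the original range of gray levels, while the medoid remains one of the original objects.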
The K-Medoids clustering method can be regarded as a
more general version of the K-Means algorithm. The basic
strategy of K-Medoids clustering algorithms is to find K
clusters in n objects by first arbitrarily choosing a
representative object (the medoid) for each cluster. Each
remaining object is clustered with the medoid to which it is
most similar. The K-Medoids method uses representative
objects as reference points instead of taking the mean value
of the objects in each cluster.
Fig.3
C. Hierarchical Clustering
Hierarchical clustering is one of the well-known methods
for image segmentation. It organizes the patterns into
clusters in the form of a tree, which is developed step by
step. The concept of hierarchical clustering is to construct
a dendrogram representing the nested grouping of patterns
(for an image, the pixels) and the similarity levels at which
the groupings change. Hierarchical clustering algorithms can
be divided into two kinds, as given below.
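As a minimal sketch of how such a nested grouping can be built, the Python fragment below uses bottom-up (agglomerative) merging with single linkage, chosen here only for illustration. The linkage rule, the function name and the sample gray levels are assumptions and are not taken from this paper. Each recorded merge corresponds to one node of the dendrogram, and the merge distance is the level at which the grouping changes.

```python
def single_linkage(points):
    """Agglomerative clustering of 1-D values; returns the merge history."""
    clusters = [[p] for p in points]       # start: every pattern is its own cluster
    merges = []                            # (cluster_a, cluster_b, merge_distance)
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single linkage: distance between the two closest members
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return merges

gray_levels = [10, 12, 50, 53, 200]        # hypothetical pixel values
history = single_linkage(gray_levels)
for left, right, level in history:
    print(left, "+", right, "at distance", level)
```

Cutting the merge history at a chosen distance gives a flat segmentation; for these values, stopping before the merge at distance 38 leaves three groups.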
Fig.4
1. [...] characteristics of original [...] Hierarchical
clustering methods are more versatile.
2.
V. MATLAB
The name MATLAB stands for matrix laboratory.
MATLAB is a software program for data manipulation and
visualization, image analysis, calculations, math and
programming. It can be used for very simple as well as very
sophisticated tasks. MATLAB is a high-performance
language for technical computing and an interactive system
whose basic data element is an array that does not require
dimensioning. MATLAB can import and export several
image formats, such as BMP (Microsoft Windows Bitmap),
GIF (Graphics Interchange Format), HDF (Hierarchical Data
Format), JPEG (Joint Photographic Experts Group), PCX
(Paintbrush), PNG (Portable Network Graphics), TIFF
(Tagged Image File Format) and XWD (X Window Dump).
MATLAB can also load raw data or other types of image
data.
VI. CONCLUSION
Image segmentation is the key to image understanding
and one of the most important steps leading to the analysis
of processed image data. Natural-image segmentation is one
of the classical problems in computer vision. Image
segmentation has many applications, such as medical
imaging, treatment planning, face recognition, iris
recognition and fingerprint recognition. There are many
segmentation techniques for natural images; in this paper,
we used clustering methods. Clustering methods are very
popular because they are intuitive and some of them are
easy to implement. Since clustering methods are either
directly applicable or easily extendable to higher-dimensional
data, their application to the segmentation of colour and
multispectral images is a natural choice. It is therefore
important to know the differences between clustering
methods. Here, we studied the three most commonly used
clustering methods: K-Means clustering, K-Medoids
clustering and Hierarchical clustering. The performance of
the various clustering methods was evaluated on different
images using MATLAB software.
AUTHORS PROFILE
Saiful Islam is an Assistant Professor and HoD in the
Department of Computer Science, Dhamdhama Anchalik
College, Dhamdhama, Nalbari (Assam), India. He completed
his MCA (Computer Science & Engineering) in 2009 at
Tezpur University, Assam, India. He is pursuing a PhD at
CMJ University, Meghalaya, India, in the Department of
Computer Science and Applications. His research interests
are Image Processing, Data Mining, and Database
Management Systems. He has three years of teaching
experience. He is also Co-coordinator of the Study Centre at
Dhamdhama Anchalik College under IDOL (Gauhati
University).