Heart line is chosen for classification because there are many types found, and because it lies independently in the palm, oriented parallel to the horizontal axis. A careful investigation of the heart line shows that every palm has some particular length of heart line, some palms have gaps in the heart line, and in rare cases there is no heart line at all. In general, the heart line lies in the area between the fingers and the point where the life line and the head line come closer. We take low-resolution images.

Fig. 3: Binary. Fig. 4: Boundary.

3. Euclidean distance is calculated between BPV and P with …

5. The image is then rotated (fig. 5) at an angle θ to align the straight line joining FW2(x2, y2) and FW4(x4, y4) with the horizontal axis (fig. 6).

Fig. 5: Palm rotation. Fig. 6: Palm after rotation.

8. The mid points of the finger web points are taken as M1 and M2, and the mid point of M1 and M2 is taken as M. It is seen that if the distance D between LT1 and M is projected vertically from the point LT1, then the heart line always lies in between the points LT1 and LT2, as shown in fig. 9.

Fig. 9: Finding out the points of ROI.

9. Using the above points LT1, LT2, RT1 and RT2 (fig. 10-I), a rectangular ROI containing only the heart line is extracted (fig. 10-II).
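The rotation in step 5 reduces to elementary trigonometry. Below is a minimal sketch of that alignment; the coordinates for FW2 and FW4 are made-up values, and the image is abstracted to coordinate pairs (an illustration, not the authors' implementation):

```python
import math

def rotation_angle(fw2, fw4):
    """Angle theta between the line FW2-FW4 and the horizontal axis."""
    (x2, y2), (x4, y4) = fw2, fw4
    return math.atan2(y4 - y2, x4 - x2)

def rotate_point(p, theta, center=(0.0, 0.0)):
    """Rotate point p by -theta about center, making the FW2-FW4 line horizontal."""
    x, y = p[0] - center[0], p[1] - center[1]
    c, s = math.cos(-theta), math.sin(-theta)
    return (c * x - s * y + center[0], s * x + c * y + center[1])

# Hypothetical finger-web coordinates: after rotation both lie on one horizontal line.
theta = rotation_angle((10.0, 20.0), (40.0, 50.0))
fw2r = rotate_point((10.0, 20.0), theta)
fw4r = rotate_point((40.0, 50.0), theta)
assert abs(fw2r[1] - fw4r[1]) < 1e-9
```

In practice the whole palm image would be rotated by the same θ (e.g. with an image-processing library), and the ROI corners LT1, LT2, RT1, RT2 would then be read off the aligned image.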
If we take the gradient of this signal (which, in one dimension, is just the first derivative with respect to t), we get the following:
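In the discrete case this gradient is just a first difference between successive samples. A minimal sketch (the signal values here are made up for illustration):

```python
def gradient_1d(signal):
    """Forward first difference: an approximation of the first derivative w.r.t. t."""
    return [signal[t + 1] - signal[t] for t in range(len(signal) - 1)]

print(gradient_1d([1, 2, 4, 7, 11]))  # prints [1, 2, 3, 4]
```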
V. K-Means Clustering
One of the most important components of a clustering algorithm is the measure of similarity used to determine how close two patterns are to one another. K-means clustering groups data vectors into a predefined number of clusters, using Euclidean distance as the similarity measure. Data vectors within a cluster have small Euclidean distances from one another, and are associated with one centroid vector, which represents the "midpoint" of that cluster. The centroid vector is the mean of the data vectors that belong to the corresponding cluster.

Using the above notation, the standard K-means algorithm is summarized as:

1. Randomly initialize the N_c cluster centroid vectors.
2. Repeat
   a) For each data vector, assign the vector to the class with the closest centroid vector, where the distance to the centroid is determined using

      d(z_p, m_j) = \sqrt{ \sum_{k=1}^{N_d} (z_{pk} - m_{jk})^2 }    (5)

      where k subscripts the dimension.
   b) Recalculate the cluster centroid vectors, using

      m_j = \frac{1}{n_j} \sum_{\forall z_p \in C_j} z_p    (6)

   until a stopping criterion is satisfied.

The K-means clustering process can be stopped when any one of the following criteria is satisfied: when the maximum number of iterations has been exceeded, when there is little change in the centroid vectors over a number of iterations, or when there are no cluster membership changes. For the purpose of this study, the algorithm is stopped when a user-specified number of iterations has been exceeded.

VI. Particle Swarm Optimization
Particle swarm optimization (PSO) is a population-based stochastic search process, modelled after the social behaviour of a bird flock [4, 5]. The algorithm maintains a population of particles, where each particle represents a potential solution to the optimization problem. Using the above notation, a particle's position is adjusted according to

   v_{i,k}(t+1) = w v_{i,k}(t) + c_1 r_{1,k}(t) (y_{i,k}(t) - x_{i,k}(t)) + c_2 r_{2,k}(t) (\hat{y}_k(t) - x_{i,k}(t))    (7)

   x_i(t+1) = x_i(t) + v_i(t+1)    (8)

where w is the inertia weight, c_1 and c_2 are the acceleration constants, r_{1,k}(t), r_{2,k}(t) ~ U(0,1), and k = 1, ..., N_d. The velocity is thus calculated based on three contributions: (i) a fraction of the previous velocity, (ii) the cognitive component, which is a function of the distance of the particle from its personal best position, and (iii) the social component, which is a function of the distance of the particle from the best particle found thus far (i.e. the best of the personal bests).

The personal best position of particle i is calculated as

   y_i(t+1) = \begin{cases} y_i(t) & \text{if } f(x_i(t+1)) \geq f(y_i(t)) \\ x_i(t+1) & \text{if } f(x_i(t+1)) < f(y_i(t)) \end{cases}    (9)

Two basic approaches to PSO exist, based on the interpretation of the neighbourhood of particles. Equation (7) reflects the gbest version of PSO where, for each particle, the neighbourhood is simply the entire swarm. The social component then causes particles to be drawn toward the best particle in the swarm. In the lbest PSO model, the swarm is divided into overlapping neighbourhoods, and the best particle of each neighbourhood is determined. For the lbest PSO model, the social component of equation (7) changes to

   c_2 r_{2,k}(t) (\hat{y}_{j,k}(t) - x_{i,k}(t))    (10)

where \hat{y}_j is the best particle in the neighbourhood of the i-th particle. The PSO is usually executed with repeated application of equations (7) and (8) until a specified number of iterations has been exceeded. Alternatively, the algorithm can be terminated when the velocity updates are close to zero over a number of iterations.

A. PSO Clustering
In the context of clustering, a single particle represents the N_c cluster centroid vectors. That is, each particle x_i is constructed as follows:

   x_i = (m_{i1}, ..., m_{ij}, ..., m_{iN_c})    (11)

where m_{ij} refers to the j-th cluster centroid vector of the i-th particle in cluster C_{ij}. Therefore, a swarm represents a number
of candidate clusterings for the current data vectors. The fitness of particles is easily measured as the quantization error,

   J_e = \frac{ \sum_{j=1}^{N_c} \left[ \sum_{\forall z_p \in C_{ij}} d(z_p, m_j) / |C_{ij}| \right] }{ N_c }    (12)

where d is defined in equation (5), and |C_{ij}| is the number of data vectors belonging to cluster C_{ij}, i.e. the frequency of that cluster.

This section first presents a standard gbest PSO for clustering data into a given number of clusters in section A, and then shows how K-means and the PSO algorithm can be combined to further improve the performance of the PSO clustering algorithm in section B.

   i. calculate the Euclidean distance d(z_p, m_{ij}) to all cluster centroids C_{ij};
   ii. assign z_p to cluster C_{ij} such that

      d(z_p, m_{ij}) = \min_{c = 1, ..., N_c} \{ d(z_p, m_{ic}) \}    (13)

   iii. calculate the global best and local best positions.
c) Update the global best and local best positions.

B. Hybrid PSO and K-means Clustering Algorithm
The K-means algorithm tends to converge faster (after fewer function evaluations) than the PSO, but usually with a less accurate clustering [6]. This section shows that the performance of PSO clustering can be further improved by seeding the initial swarm with the result of the K-means algorithm. The hybrid algorithm first executes the K-means algorithm once. In this case the K-means clustering is terminated when (i) the maximum number of iterations is exceeded, or when (ii) the average change in the centroid vectors is less than 0.0001 (a user-specified parameter). The result of the K-means algorithm is then used as one of the particles, while the rest of the swarm is initialized randomly. The gbest PSO algorithm as presented above is then executed.

VII. Experimental Results
This section compares the results of the K-means, PSO and Hybrid clustering algorithms on six classification problems. The main purpose is to compare the quality of the respective clusterings, where quality is measured according to the following criteria:
• the quantization error as defined in equation (12);
• the intra-cluster distances, i.e. the distances between data vectors within a cluster, where the objective is to minimize the intra-cluster distance;
• the inter-cluster distances, i.e. the distances between the centroids of the clusters, where the objective is to maximize the distance between clusters.
The latter two objectives respectively correspond to crisp, compact clusters that are well separated. For all the results reported, averages over 30 simulations are given. All algorithms are run for 1000 function evaluations, and the PSO algorithms used 20 particles. For PSO, w = 0.72 and c_1 = c_2 = 1.49. These values were chosen to ensure good convergence [6].

The hybrid PSO algorithm consistently performs better than the other two approaches as the number of clusters increases. The K-means algorithm exhibited a faster, but premature, convergence to a large quantization error, while the PSO algorithms converged more slowly to lower quantization errors.

Congregation, on the other hand, is a swarming by social forces, which is the source of attraction of particles to one another, and is classified into two types: social and passive. Social congregation usually happens when the swarm's fidelity is high, such as genetic relation. Finally, passive congregation is an attraction of a particle to other swarm members where there is no display of social behaviour: particles need to monitor both the environment and their immediate surroundings, such as the position and the speed of neighbours. Such information transfer can be employed in passive congregation. The global variant-based passive congregation PSO (GPAC) is enhanced with the constriction factor approach.

The swarm of the enhanced GPAC is manipulated by the velocity update

   v_i(t+1) = k [ w(t) v_i(t) + c_1 rand_1 (P_i - S_i(t)) + c_2 rand_2 (P_k - S_i(t)) + c_3 rand_3 (P_r - S_i(t)) ]    (14)

where i = 1, 2, ..., N; c_1, c_2, and c_3 are the cognitive, social, and passive congregation parameters respectively; rand_1, rand_2, and rand_3 are random numbers uniformly distributed within [0, 1]; P_i is the best previous position of the i-th particle; P_k is either the global best position ever attained among all particles, in the case of enhanced GPAC, or the local best position of particle-i, namely the position of its nearest particle-k with a better evaluation, in the case of LPAC; and P_r is the position of the passive congregator (the position of a randomly chosen particle-r). The positions are updated using step-3. The positions of the i-th particle in the n-dimensional decision space are limited by the minimum and maximum positions expressed by the vectors

   S_i^{min}, S_i^{max}    (15)

Here, the minimum and maximum position vectors express the inequality constraints. The velocities of the i-th particle in the n-dimensional decision space are limited by

   [-V_i^{max}, V_i^{max}]    (16)

where the maximum velocity in the l-th dimension of the search space is proposed as

   V_{i,l}^{max} = \frac{ S_{i,l}^{max} - S_{i,l}^{min} }{ N_r }, (l = 1, 2, ..., n)    (17)

where S_{i,l}^{min} and S_{i,l}^{max} are the limits in the l-th dimension of the search space. The maximum velocities are constrained to small intervals in the search space for a better balance between exploration and exploitation. N_r is a chosen number of search intervals for the particles. It is an important parameter in the enhanced GPAC algorithms. A small N_r facilitates global exploration (searching new areas), while a large one tends to facilitate local exploration (fine tuning of the current search area). A suitable value for N_r usually provides a balance between global and local exploration abilities and consequently results in a reduction of the number of iterations required to locate the optimum solution.

The basic steps of the enhanced GPAC are listed below:
Step-1. Generate a swarm of N particles with uniform probability distribution, initial positions S_i(0) and velocities V_i(0), (i = 1, 2, ..., N), and initialize the random parameters. Evaluate each particle-i using the objective function f (e.g., f to be minimized).
Step-2. For each particle-i, calculate the distance d_{ij} between its position and the positions of all other particles, d_{ij} = ||S_i - S_j|| (i ≠ j = 1, 2, ..., N), where S_i and S_j are the position vectors of particle-i and particle-j respectively.
Step-3. For each particle-i, determine the nearest particle, particle-k, with a better evaluation than its own, i.e., d_{ik} = \min_j (d_{ij}), f_k ≤ f_i, and set it as the leader of particle-i. In the case of enhanced GPAC, the best particle found so far is set as the global best.
Step-4. For each particle-i, randomly select a particle-r and set it as the passive congregator.
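The steps above can be sketched as a single loop. The objective function (a simple sphere), swarm size, bounds, and the coefficient values are illustrative assumptions, and only the GPAC (global) variant of the leader selection is shown:

```python
import random

def sphere(x):  # illustrative objective f, to be minimized
    return sum(v * v for v in x)

def enhanced_gpac(f, n=2, N=10, smin=-5.0, smax=5.0, Nr=10, iters=50,
                  k=0.729, w=0.72, c1=1.49, c2=1.49, c3=0.6, seed=1):
    rng = random.Random(seed)
    vmax = (smax - smin) / Nr  # maximum velocity per dimension, eq. (17)
    # Step-1: uniform random positions and velocities
    S = [[rng.uniform(smin, smax) for _ in range(n)] for _ in range(N)]
    V = [[rng.uniform(-vmax, vmax) for _ in range(n)] for _ in range(N)]
    P = [s[:] for s in S]  # best previous positions P_i
    for _ in range(iters):
        g = min(range(N), key=lambda i: f(P[i]))  # global best P_k (GPAC leader)
        for i in range(N):
            r = rng.randrange(N)  # Step-4: random passive congregator P_r
            for l in range(n):
                # eq. (14): constriction factor k times inertia, cognitive,
                # social, and passive congregation terms
                V[i][l] = k * (w * V[i][l]
                               + c1 * rng.random() * (P[i][l] - S[i][l])
                               + c2 * rng.random() * (P[g][l] - S[i][l])
                               + c3 * rng.random() * (S[r][l] - S[i][l]))
                V[i][l] = max(-vmax, min(vmax, V[i][l]))           # eq. (16)
                S[i][l] = max(smin, min(smax, S[i][l] + V[i][l]))  # eq. (15)
            if f(S[i]) < f(P[i]):
                P[i] = S[i][:]
    g = min(range(N), key=lambda i: f(P[i]))
    return P[g]

best = enhanced_gpac(sphere)
assert sphere(best) < 50.0  # better than the worst corner of the search box
```

The nearest-better-leader search of Step-2/Step-3 applies to the LPAC variant; in this GPAC sketch the leader is simply the global best, as the text describes.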
www.ijcst.com    International Journal of Computer Science and Technology 375
IJCST Vol. 2, Issue 2, June 2011    ISSN : 2229-4333 (Print) | ISSN : 0976-8491 (Online)
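Pulling sections V-VII together, the hybrid scheme (a K-means result seeding one gbest PSO particle, with fitness the quantization error of equation (12)) can be sketched on made-up 2-D data; the data, parameter values, and helper names are illustrative, not the paper's code:

```python
import math
import random

def dist(a, b):  # eq. (5): Euclidean distance
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(data, cents):
    """Index of the nearest centroid for every data vector."""
    return [min(range(len(cents)), key=lambda j: dist(z, cents[j])) for z in data]

def kmeans(data, Nc, iters=20, seed=0):
    rng = random.Random(seed)
    cents = [list(z) for z in rng.sample(data, Nc)]
    for _ in range(iters):
        labels = assign(data, cents)
        for j in range(Nc):  # eq. (6): centroid = mean of its members
            members = [data[p] for p in range(len(data)) if labels[p] == j]
            if members:
                cents[j] = [sum(col) / len(members) for col in zip(*members)]
    return cents

def quantization_error(data, cents):  # eq. (12)
    labels = assign(data, cents)
    total = 0.0
    for j in range(len(cents)):
        members = [data[p] for p in range(len(data)) if labels[p] == j]
        if members:
            total += sum(dist(z, cents[j]) for z in members) / len(members)
    return total / len(cents)

def hybrid_pso_kmeans(data, Nc=2, n_particles=10, iters=60, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])
    def rand_particle():  # random centroids drawn from the data
        return [list(rng.choice(data)) for _ in range(Nc)]
    # One particle is seeded with the K-means result; the rest are random.
    X = [kmeans(data, Nc, seed=seed)] + [rand_particle() for _ in range(n_particles - 1)]
    V = [[[0.0] * dim for _ in range(Nc)] for _ in range(n_particles)]
    P = [[c[:] for c in x] for x in X]  # personal bests
    w, c1, c2 = 0.72, 1.49, 1.49  # parameter values quoted in section VII
    for _ in range(iters):
        g = min(range(n_particles), key=lambda i: quantization_error(data, P[i]))
        for i in range(n_particles):
            for j in range(Nc):
                for l in range(dim):
                    # gbest velocity and position updates, eqs. (7) and (8)
                    V[i][j][l] = (w * V[i][j][l]
                                  + c1 * rng.random() * (P[i][j][l] - X[i][j][l])
                                  + c2 * rng.random() * (P[g][j][l] - X[i][j][l]))
                    X[i][j][l] += V[i][j][l]
            if quantization_error(data, X[i]) < quantization_error(data, P[i]):
                P[i] = [c[:] for c in X[i]]
    g = min(range(n_particles), key=lambda i: quantization_error(data, P[i]))
    return P[g]

# Two well-separated blobs; the hybrid should place one centroid near each.
data = [(0.0, 0.0), (0.1, 0.2), (-0.2, 0.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
cents = hybrid_pso_kmeans(data)
assert quantization_error(data, cents) < 1.0
```

Seeding only one particle preserves the PSO's exploration while guaranteeing the swarm starts no worse than K-means, which is the point made in section B.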