
ISSN : 2229-4333(Print) | ISSN : 0976-8491(Online) IJCST Vol. 2, Issue 2, June 2011

A Palmprint Feature Extraction and Pattern Classification Based on Hybrid PSO-K-Means Clustering

¹Bhawani Sankar Panigrahi, ²Hari Narayan Pratihari, ³Dr. Gayatri Devi, ⁴Satyabrata Dash
¹Utkal University, Orissa, India
²,⁴Orissa Engineering College, Bhubaneswar, India
³Vikash Engineering College, Orissa, India
Abstract
This paper presents a hybrid particle swarm optimization (PSO) technique for optimally clustering N palmprint data points into K clusters. The cluster centers are detected automatically by the PSO technique from a set of features obtained by applying geometrical methods to the palmprint data. From an image captured by a peg-free scanner, a rectangular region of interest (ROI) containing only the heart line is extracted. The intensity of the ROI image is standardized and the image is smoothed. A Sobel gradient with a threshold is then applied to extract the heart line from the ROI. The cluster centers are useful in identifying the class of the palmprint, a problem that occurs in a variety of practical systems in engineering and science.

Keywords
Palmprint classification, Sobel, PSO, K-Means Clustering, GPAC PSO, GPAC APSO.

I. Introduction
Many biometric techniques have been introduced, including fingerprint, iris, face, and speech recognition. Recently, palmprints have come forward as a very reliable biometric. The palm is the inner surface of the hand between the wrist and the fingers [13]. The palm has several features that can be extracted, such as principal lines, wrinkles, ridges, singular points, texture and minutiae. Low-resolution images are sufficient for principal line extraction since these lines are thick. There are usually three principal lines, made by flexing the hand and wrist, which are named the heart line, head line, and life line. A palmprint image with principal lines and wrinkles is shown in Fig. 1.

Fig. 1: Palmprint with principal lines and wrinkles

The heart line is chosen for classification because it occurs in many types and because it lies independently in the palm, oriented roughly parallel to the horizontal axis. Careful investigation of the heart line shows that every palm has a particular heart-line length, that some have gaps in the heart line, and that in rare cases there is no heart line. In general the heart line lies in the area between the fingers and the point where the life line and the head line come closest. We use low-resolution images. Low-resolution images are more suitable because of their smaller file sizes, which result in shorter computation times during preprocessing and feature extraction [14]. The palmprint image is captured, and the processed image is then used to extract the features. Authentication methods require the input palmprint to be matched against the large set of palmprints stored in the database. The search time should be very small, on the order of less than 1 s. To reduce the search time and complexity, the palmprint patterns are clustered. After the features are extracted from the palmprint database using different geometrical methods, a clustering analysis is used to group the data into clusters, thereby identifying the class of the palmprint data. The well-known K-means and fuzzy C-means algorithms commonly used for data clustering suffer from the trial-and-error choice of the initial clusters. In this paper a hybrid K-means and PSO technique is used to cluster the features into distinct groups so as to classify the nature of the time series data.

Particle swarm optimization (PSO) is a population-based algorithm that simulates the flocking behavior of birds or fish to reach an optimal solution. It uses a fitness function in the search space and needs fewer parameters than other evolutionary algorithms to cluster the features. Traditional K-means and fuzzy C-means get stuck in local minima and therefore do not offer robustness in clustering the features. The hybrid K-means PSO algorithm is applied to the PolyU palmprint database and to a database collected from OEC students. Feature clusters are found to classify the palmprint images.

II. Feature Extraction

A. Preprocessing and segmentation
1. The image is converted to binary (Fig. 2).
2. A boundary-tracing algorithm for 8-connected pixels is applied to the binary image to find the boundary of the palmprint image (Fig. 3). The starting point is the bottom-left point "P" shown in Fig. 4, and the tracing direction is counterclockwise. The end point is also "P". The boundary pixels are collected in the boundary pixel vector (BPV).

Fig. 3: Binary    Fig. 4: Boundary

3. The Euclidean distance between each pixel of the BPV and P is calculated with the

w w w. i j c s t. c o m   International Journal of Computer Science and Technology  371



formula

$$D_E(i) = \sqrt{(X_p - X_b(i))^2 + (Y_p - Y_b(i))^2} \qquad (1)$$

where $(X_p, Y_p)$ are the X and Y coordinates of P, $(X_b(i), Y_b(i))$ are the coordinates of the i-th border pixel, and $D_E(i)$ is the Euclidean distance between P and the i-th border pixel. A distance distribution diagram (Fig. 4) is constructed from the vector $D_E$. The pattern of the diagram resembles the geometric shape of the palm: it shows five local maxima and four local minima, which correspond to the five fingertips (local maxima) and the four finger webs (local minima), i.e. the valleys between the fingers.

Fig. 4: Distance distribution diagram

4. The second and fourth finger web points are taken (Fig. 5), and the slope of the line joining these two points is calculated using

$$\theta = \tan^{-1}\left(\frac{Y}{X}\right) \qquad (2)$$

where $Y = y_2 - y_4$ and $X = x_2 - x_4$; $(x_2, y_2)$, $(x_4, y_4)$ and $(x_3, y_3)$ are the coordinates of the finger web points FW2, FW4 and FW3 respectively; and θ is the slope angle of the line.

Fig. 5: Palm rotation

5. The image is then rotated (Fig. 5) by the angle θ to align the straight line joining FW2 $(x_2, y_2)$ and FW4 $(x_4, y_4)$ with the horizontal axis (Fig. 6).

Fig. 6: Palm after rotation

6. After rotation, steps 1 to 5 are applied again to obtain the finger web points of the rotated image (Fig. 7), since the coordinates of the finger web points change under rotation. The finger webs after rotation are named FR1, FR2, FR3, and FR4 (Fig. 7).

Fig. 7: Finger web locations after rotation

7. The vertical distance D1 between FR4 and FR3 is calculated (Fig. 8). D1 is projected vertically from FR4, and from there half of the distance between FR3 and FR4 (ds/2) is projected horizontally to the left edge of the palm to obtain the point LT1.

Fig. 8: Finding the left point of the ROI

8. The midpoints M1 and M2 of the finger web points are found, and the midpoint of M1 and M2 is taken as M. It is observed that if the distance D between LT1 and M is projected vertically from the point LT1, the heart line always lies between the points LT1 and LT2, as shown in Fig. 9.

Fig. 9: Finding the points of the ROI


9. 13. Using the points LT1, LT2, RT1 and RT2 found above (Fig. 10-I), a rectangular ROI containing only the heart line is extracted (Fig. 10-II).
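As an illustration of equations (1) and (2), the following Python sketch computes the distance profile from P to each boundary pixel, picks out strict local extrema, and derives the rotation angle from two finger web points. The function names (`distance_profile`, `local_extrema`, `rotation_angle`) and the list-of-tuples input are assumptions made for this example, not the paper's implementation. `atan2` is used in place of tan⁻¹(Y/X) so that a vertical line does not cause a division by zero; the two agree whenever X > 0. A real boundary profile is noisy, so some smoothing would likely be needed before the extrema correspond to fingertips and finger webs.

```python
import math


def distance_profile(boundary, p):
    """Equation (1): Euclidean distance from P to every boundary pixel.

    boundary: list of (x, y) pixel coordinates (the BPV).
    p: (x, y) coordinates of the start point P.
    """
    px, py = p
    return [math.hypot(px - bx, py - by) for bx, by in boundary]


def local_extrema(profile):
    """Return (minima, maxima) index lists of strict local extrema.

    In the paper's setting the maxima would correspond to fingertips
    and the minima to finger webs (assuming a smooth profile).
    """
    minima, maxima = [], []
    for i in range(1, len(profile) - 1):
        prev, cur, nxt = profile[i - 1], profile[i], profile[i + 1]
        if cur < prev and cur < nxt:
            minima.append(i)
        elif cur > prev and cur > nxt:
            maxima.append(i)
    return minima, maxima


def rotation_angle(fw2, fw4):
    """Equation (2): angle (radians) of the line FW2-FW4 to the horizontal."""
    y = fw2[1] - fw4[1]
    x = fw2[0] - fw4[0]
    # Assumption: atan2 replaces atan(Y / X) to handle X == 0 safely.
    return math.atan2(y, x)
```

For example, `rotation_angle((4, 4), (0, 0))` gives π/4, meaning the image would be rotated by 45 degrees to bring the line FW2-FW4 onto the horizontal axis.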

Fig. 10: ROI points

Fig. 11: Extracted ROI

III. Sobel Edge Gradient
Edges characterize boundaries, and edge detection is therefore a problem of fundamental importance in image processing. Edges in images are areas with strong intensity contrasts, i.e. a jump in intensity from one pixel to the next. Edge detection significantly reduces the amount of data and filters out useless information while preserving the important structural properties of an image. There are many ways to perform edge detection. An edge has the one-dimensional shape of a ramp, and calculating the derivative of the image can highlight its location. Suppose we have the following signal, with an edge shown by the jump in intensity (Fig. 12).

Fig. 12: Original signal

If we take the gradient of this signal (which, in one dimension, is just the first derivative with respect to t), we get the signal shown in Fig. 13.

Fig. 13: Signal after taking the gradient

Clearly, the derivative shows a maximum located at the center of the edge in the original signal. This method of locating an edge is characteristic of the "gradient filter" family of edge detection filters, which includes the Sobel method. A pixel location is declared an edge location if the value of the gradient exceeds some threshold. As mentioned before, edges have higher gradient values than the pixels surrounding them. So once a threshold is set, the gradient value is compared with the threshold, and an edge is detected whenever the threshold is exceeded. Based on this one-dimensional analysis, the theory can be carried over to two dimensions as long as there is an accurate approximation to the derivative of a two-dimensional image. The Sobel operator performs a 2-D spatial gradient measurement on an image. Typically it is used to find the approximate absolute gradient magnitude at each point in an input grayscale image. The Sobel edge detector uses a pair of 3×3 convolution masks, one estimating the gradient in the x-direction (columns) and the other estimating the gradient in the y-direction (rows). A convolution mask is usually much smaller than the actual image. As a result, the mask is slid over the image, manipulating a square of pixels at a time. The Sobel masks are

$$G_x = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix}, \qquad G_y = \begin{bmatrix} +1 & +2 & +1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}$$

The magnitude of the gradient is then calculated using the formula:

$$|G| = \sqrt{G_x^2 + G_y^2} \qquad (3)$$

An approximate magnitude can be calculated using

$$|G| = |G_x| + |G_y| \qquad (4)$$

IV. Algorithm for feature extraction
1. Blur the ROI image with an adaptive median filter (until the ridges are smoothed).
2. Standardize the gray-level intensity.
3. Apply the Sobel edge detection operator with a threshold to obtain only the prominent heart line.
4. Record the coordinates in a vector.
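The Sobel thresholding and coordinate-recording steps of the feature extraction algorithm can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the image is taken to be a list of rows of gray levels, the masks are the standard Sobel masks, the magnitude uses the approximation of equation (4), and border pixels are skipped. The function name `sobel_edges` and the threshold value in the usage note are hypothetical, and the median filtering and intensity standardization steps are assumed to have been applied beforehand.

```python
# Standard 3x3 Sobel masks (x: column direction, y: row direction).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]


def sobel_edges(image, threshold):
    """Return (row, col) coordinates where |Gx| + |Gy| exceeds threshold.

    image: list of equal-length rows of gray-level values.
    Border pixels are skipped because the 3x3 mask does not fit there.
    """
    rows, cols = len(image), len(image[0])
    edges = []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            # Slide both masks over the 3x3 neighbourhood of (r, c).
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    pixel = image[r + dr][c + dc]
                    gx += SOBEL_X[dr + 1][dc + 1] * pixel
                    gy += SOBEL_Y[dr + 1][dc + 1] * pixel
            # Equation (4): approximate gradient magnitude.
            if abs(gx) + abs(gy) > threshold:
                edges.append((r, c))  # record the coordinate
    return edges
```

On a 5×5 image whose top two rows are 0 and bottom three rows are 255, `sobel_edges(image, 500)` returns the interior pixels of rows 1 and 2, which straddle the horizontal intensity jump. The choice of threshold controls how much of the fainter wrinkle structure survives alongside the heart line.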


Fig. 14: Smoothed ROI

Fig. 15: Sobel edged ROI

These features are used to classify the palmprint data into several classes using a hybrid clustering technique based on PSO and the K-means algorithm.

V. K-Means Clustering
One of the most important components of a clustering algorithm is the measure of similarity used to determine how close two patterns are to one another. K-means clustering groups data vectors into a predefined number of clusters, based on Euclidean distance as the similarity measure. Data vectors within a cluster have small Euclidean distances from one another and are associated with one centroid vector, which represents the "midpoint" of that cluster. The centroid vector is the mean of the data vectors that belong to the corresponding cluster.

The standard K-means algorithm is summarized below, where $z_p$ denotes the p-th data vector, $m_j$ the centroid vector of cluster j, $N_c$ the number of clusters and $N_d$ the dimension of the data vectors:

1. Randomly initialize the $N_c$ cluster centroid vectors.
2. Repeat
a) For each data vector, assign the vector to the class with the closest centroid vector, where the distance to the centroid is determined using

$$d(z_p, m_j) = \sqrt{\sum_{k=1}^{N_d} (z_{pk} - m_{jk})^2} \qquad (5)$$

where k subscripts the dimension.
b) Recalculate the cluster centroid vectors, using

$$m_j = \frac{1}{n_j} \sum_{\forall z_p \in C_j} z_p \qquad (6)$$

where $n_j$ is the number of data vectors in cluster $C_j$,
until a stopping criterion is satisfied.

The K-means clustering process can be stopped when any one of the following criteria is satisfied: when the maximum number of iterations has been exceeded, when there is little change in the centroid vectors over a number of iterations, or when there are no cluster membership changes. For the purpose of this study, the algorithm is stopped when a user-specified number of iterations has been exceeded.

VI. Particle Swarm Optimization
Particle swarm optimization (PSO) is a population-based stochastic search process, modelled after the social behaviour of a bird flock [4,5]. The algorithm maintains a population of particles, where each particle represents a potential solution to an optimization problem.
In the context of PSO, a swarm refers to a number of potential solutions to the optimization problem, where each potential solution is referred to as a particle. The aim of PSO is to find the particle position that results in the best evaluation of a given fitness (objective) function. Each particle represents a position in $N_d$-dimensional space and is "flown" through this multidimensional search space, adjusting its position towards both
• the particle's best position found thus far, and
• the best position in the neighbourhood of that particle.

Each particle i maintains the following information:
• $x_i$: the current position of the particle;
• $v_i$: the current velocity of the particle;
• $y_i$: the personal best position of the particle.

Using the above notation, a particle's position is adjusted according to

$$v_{i,k}(t+1) = w\,v_{i,k}(t) + c_1 r_{1,k}(t)\bigl(y_{i,k}(t) - x_{i,k}(t)\bigr) + c_2 r_{2,k}(t)\bigl(\hat{y}_k(t) - x_{i,k}(t)\bigr) \qquad (7)$$

$$x_i(t+1) = x_i(t) + v_i(t+1) \qquad (8)$$

where w is the inertia weight, $c_1$ and $c_2$ are the acceleration constants, $r_{1,k}(t), r_{2,k}(t) \sim U(0,1)$, and $k = 1, \ldots, N_d$. The velocity is thus calculated based on three contributions: (i) a fraction of the previous velocity, (ii) the cognitive component, which is a function of the distance of the particle from its personal best position, and (iii) the social component, which is a function of the distance of the particle from the best particle found thus far (i.e. the best of the personal bests).
The personal best position of particle i is calculated as

$$y_i(t+1) = \begin{cases} y_i(t) & \text{if } f(x_i(t+1)) \ge f(y_i(t)) \\ x_i(t+1) & \text{if } f(x_i(t+1)) < f(y_i(t)) \end{cases} \qquad (9)$$

Two basic approaches to PSO exist, based on the interpretation of the neighbourhood of particles. Equation (7) reflects the gbest version of PSO where, for each particle, the neighbourhood is simply the entire swarm. The social component then causes particles to be drawn toward the best particle in the swarm. In the lbest PSO model, the swarm is divided into overlapping neighbourhoods, and the best particle of each neighbourhood is determined. For the lbest PSO model, the social component of equation (7) changes to

$$c_2 r_{2,k}(t)\bigl(\hat{y}_{j,k}(t) - x_{i,k}(t)\bigr) \qquad (10)$$

where $\hat{y}_j$ is the best particle in the neighbourhood of the i-th particle. PSO is usually executed with repeated application of equations (7) and (8) until a specified number of iterations has been exceeded. Alternatively, the algorithm can be terminated when the velocity updates are close to zero over a number of iterations.

A. PSO Clustering
In the context of clustering, a single particle represents the $N_c$ cluster centroid vectors. That is, each particle $x_i$ is constructed as follows:

$$x_i = (m_{i1}, \ldots, m_{ij}, \ldots, m_{iN_c}) \qquad (11)$$

where $m_{ij}$ refers to the j-th cluster centroid vector of the i-th particle in cluster $C_{ij}$. Therefore, a swarm represents a number


of candidate clusterings for the current data vectors. The fitness of particles is easily measured as the quantization error,

$$J_e = \frac{\sum_{j=1}^{N_c} \left[ \sum_{\forall z_p \in C_{ij}} d(z_p, m_j) \,/\, |C_{ij}| \right]}{N_c} \qquad (12)$$

where d is defined in equation (5), and $|C_{ij}|$ is the number of data vectors belonging to cluster $C_{ij}$, i.e. the frequency of that cluster.
This section first presents a standard gbest PSO for clustering data into a given number of clusters, and then, in Section B, shows how K-means and the PSO algorithm can be combined to further improve the performance of the PSO clustering algorithm.

i. Calculate the Euclidean distance $d(z_p, m_{ij})$ to all cluster centroids $C_{ij}$.
ii. Assign $z_p$ to cluster $C_{ij}$ such that

$$d(z_p, m_{ij}) = \min_{\forall c = 1, \ldots, N_c} \{ d(z_p, m_{ic}) \} \qquad (13)$$

iii. Calculate the global best and local best position.
c) Update the global best and local best position.

B. Hybrid PSO and K-means Clustering Algorithm
The K-means algorithm tends to converge faster (after fewer function evaluations) than PSO, but usually with a less accurate clustering [6]. This section shows that the performance of PSO clustering can be further improved by seeding the initial swarm with the result of the K-means algorithm. The hybrid algorithm first executes the K-means algorithm once. In this case the K-means clustering is terminated when (i) the maximum number of iterations is exceeded, or (ii) the average change in the centroid vectors is less than 0.0001 (a user-specified parameter). The result of the K-means algorithm is then used as one of the particles, while the rest of the swarm is initialized randomly. The gbest PSO algorithm as presented above is then executed.

VII. Experimental Results
This section compares the results of the K-means, PSO and hybrid clustering algorithms on six classification problems. The main purpose is to compare the quality of the respective clusterings, where quality is measured according to the following criteria:
• the quantization error as defined in equation (12);
• the intra-cluster distances, i.e. the distances between data vectors within a cluster, where the objective is to minimize the intra-cluster distance;
• the inter-cluster distances, i.e. the distances between the centroids of the clusters, where the objective is to maximize the distance between clusters.
The latter two objectives respectively correspond to crisp, compact clusters that are well separated. For all the results reported, averages over 30 simulations are given. All algorithms are run for 1000 function evaluations, and the PSO algorithms used 20 particles. For PSO, w = 0.72 and $c_1 = c_2 = 1.49$. These values were chosen to ensure good convergence [6].
The hybrid PSO algorithm consistently performs better than the other two approaches as the number of clusters increases. The K-means algorithm exhibited faster but premature convergence to a large quantization error, while the PSO algorithms converged more slowly but to lower quantization errors.

Congregation, on the other hand, is a swarming by social forces, which is the source of attraction of a particle to others and is classified into two types: social and passive. Social congregation usually happens when the swarm's fidelity is high, such as genetic relation. Finally, passive congregation is an attraction of a particle to other swarm members, where there is no display of social behaviour, since particles need to monitor both the environment and their immediate surroundings, such as the position and the speed of neighbours. Such information transfer can be employed in passive congregation. The global variant-based passive congregation PSO (GPAC) is enhanced with the constriction factor approach.
The swarm of the enhanced GPAC is manipulated by the velocity update

$$v_i(t+1) = k \left[ w(t)\, v_i(t) + c_1\, \mathrm{rand}_1 \bigl(P_i - S_i(t)\bigr) + c_2\, \mathrm{rand}_2 \bigl(P_k - S_i(t)\bigr) + c_3\, \mathrm{rand}_3 \bigl(P_r - S_i(t)\bigr) \right] \qquad (14)$$

where i = 1, 2, ..., N; $c_1$, $c_2$, and $c_3$ are the cognitive, social, and passive congregation parameters respectively; $\mathrm{rand}_1$, $\mathrm{rand}_2$, and $\mathrm{rand}_3$ are random numbers uniformly distributed within [0,1]; $P_i$ is the best previous position of the i-th particle; $P_k$ is either the global best position ever attained among all particles, in the case of enhanced GPAC, or the local best position of particle i, namely the position of its nearest particle k with better evaluation, in the case of LPAC; and $P_r$ is the position of the passive congregator (the position of a randomly chosen particle r). The positions are updated using (step 3). The positions of the i-th particle in the n-dimensional decision space are limited by the minimum and maximum positions expressed by the vectors

$$\left[ S_i^{\min},\ S_i^{\max} \right] \qquad (15)$$

Here, the minimum and maximum position vectors express the inequality constraints. The velocities of the i-th particle in the n-dimensional decision space are limited by

$$\left[ -V_i^{\max},\ V_i^{\max} \right] \qquad (16)$$

where the maximum velocity in the l-th dimension of the search space is proposed as

$$v_{i,l}^{\max} = \frac{s_{i,l}^{\max} - s_{i,l}^{\min}}{N_r}, \qquad (l = 1, 2, \ldots, n) \qquad (17)$$

where $s_{i,l}^{\min}$ and $s_{i,l}^{\max}$ are the limits in the l-th dimension of the search space. The maximum velocities are constructed in small intervals in the search space for better balance between exploration and exploitation. $N_r$ is a chosen number of search intervals for the particles. It is an important parameter in the enhanced GPAC algorithms. A small $N_r$ facilitates global exploration (searching new areas), while a large one tends to facilitate local exploration (fine tuning of the current search area). A suitable value of $N_r$ usually provides a balance between global and local exploration abilities and consequently results in a reduction of the number of iterations required to locate the optimum solution.

The basic steps of the enhanced GPAC are listed below:
Step-1. Generate a swarm of N particles with uniform probability distribution, initial positions $S_i(0)$ and velocities $V_i(0)$ (i = 1, 2, ..., N), and initialize the random parameters. Evaluate each particle i using the objective function f (e.g., f to be minimized).
Step-2. For each particle i, calculate the distance $d_{ij}$ between its position and the position of all other particles, $d_{ij} = \lVert S_i - S_j \rVert$ (i ≠ j = 1, 2, ..., N), where $S_i$ and $S_j$ are the position vectors of particle i and particle j respectively.
Step-3. For each particle i, determine the nearest particle, particle k, with better evaluation than its own, i.e. $d_{ik} = \min_j (d_{ij})$, $f_k \le f_i$, and set it as the leader of particle i. In the case of enhanced GPAC, particle k is set as the global best.
Step-4. For each particle i, randomly select a particle r and set it

as the passive congregator of particle i.
Step-5. Update the velocities of the particles using (14), and then update their positions.
Step-6. Check whether the limits of positions (15) and velocities (16), (17) are respected. If the limits are violated, the values are replaced by the respective limits.
Step-7. Evaluate each particle using the objective function f. The objective function f is calculated by running a load flow. In the case where no load flow solution exists for a particle, an error is returned and the particle retains its previous achievement.
Step-8. If the stopping criteria are not satisfied, go to Step-2.
The enhanced GPAC algorithm is terminated if one of the following criteria is satisfied: (i) no improvement of the global best is observed in the last 30 generations, or (ii) the maximum number of allowed iterations is reached ( in this paper).

Fig. 15: F1-F2 GPAC APSO

References
[1] Selim, S. Z., Ismail, M. A., "K-means Type Algorithms: A Generalized Convergence Theorem and Characterization of Local Optimality," IEEE Trans. Pattern Anal. Mach. Intell., Vol. 6, pp. 81-87, 1984.
[2] Wu, K.-L., Yang, M.-S., "Alternative C-means Clustering Algorithms," Pattern Recognition, Vol. 35, pp. 2267-2278, 2002.
[3] Maulik, U., Bandyopadhyay, S., "Genetic Algorithm-based Clustering Technique," Pattern Recognition, Vol. 33, pp. 1455-1465, no. 1, pp. 60-68, Feb. 2003.
[4] Kennedy, J., Eberhart, R., "Particle Swarm Optimization," Proc. of IEEE International Conference on Neural Networks (ICNN), Perth, Australia, Vol. 4, pp. 1942-1948.
[5] Eberhart, R., Kennedy, J., "A New Optimizer Using Particle Swarm Theory," Proc. 6th Int. Symposium on Micro Machine and Human Science, pp. 39-43, 1995.
[6] Eberhart, R. C., Shi, Y., "Particle Swarm Optimization: Developments, Applications and Resources," Proceedings of the IEEE Congress on Evolutionary Computation (CEC 2001), Seoul, Korea, 2001.
[7] Filho, J. L. R., Treleaven, P. C., Alippi, C., "Genetic Algorithm Programming Environments," IEEE Comput., Vol. 27, pp. 28-43, 1994.
[8] S. Nanavati, Michael Thieme, Raj Nanavati, eds., "Biometrics: Identity Verification in a Networked World", John Wiley & Sons, 2002.
[9] H. L. Lee, R. Gaensslen, eds., "Advances in Fingerprint Technology", second ed., Boca Raton, Fla.: CRC Press, 2001.
[10] Lei Zhang, David Zhang, "Characterization of Palmprints by Wavelet Signatures via Directional Context Modeling", IEEE Trans. on Systems, Man, and Cybernetics Part B: Cybernetics, Vol. 34, No. 3, June 2004.
[11] X. Wu, David Zhang, K. Wang, Bo Huang, "Palmprint classification using principal lines", Pattern Recognition 37, 2004, pp. 1987-1998.
[12] D. Zhang, Wai-Kin Kong, Jane You, M. Wong, "Online palmprint identification", IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 25, No. 9, Sep. 2003.
[13] N. Duta, A. K. Jain, K. V. Mardia, "Matching of palmprint", Pattern Recognition Letters 23, 2001, pp. 477-485.
[14] C. C. Han, H. L. Chen, C. L. Lin, K. C. Fan, "Personal authentication using palmprint features", Pattern Recognition 36, 2003, pp. 371-381.

Er. Bhawani Sankar Panigrahi is a Ph.D. research scholar at Utkal University. He has published 6 research papers in international and national journals and conferences.

Prof. Dr. Gayatri Devi is working as Principal of Vikash Engineering College. She has 21 years of teaching experience and over 15 years of research experience. She has produced 2 Ph.D. scholars, and 8 are pursuing. She has published more than 15 research papers in national and international journals.

Dr. H. N. Pratihari is a Professor in the Department of Electronics Telecommunication Engineering, Orissa Engineering College, Bhubaneswar–752050, India. He received his AMIE in Electronics & Communication Engineering from the Institution of Engineers (India) and served Neelachal Ispat Nigam Ltd. for 5 years. He later completed an M.Tech. degree in Electronics & Communication Engineering at the National Institute of Technology, Rourkela, in 2005. He was awarded a Ph.D. in Electronics Engineering by Utkal University, Vani Vihar, Bhubaneswar, in 2010. His main professional interests are antennas, design of power-saving electronic equipment, microprocessors, network security, intelligent techniques, image processing and speech processing.

Er. Satyabrata Dash is a Ph.D. research scholar at Ravenshaw University. He is working as an Asst. Prof. in the Dept. of IT, Orissa Engg. College, Bhubaneswar, Orissa, India. He has published 4 research papers in international and national journals and conferences.