BIOMETRICS
INNOVATION-2k7
By
Himanshu Madan
Siddharth Shashidharan
TY-ENTC COEP
ABSTRACT
No two human beings are alike. Throughout history, man has developed various
mechanisms to identify the unique characteristics that go to build each personality. These
range from instinctive abilities, as demonstrated by a mother in distinguishing between
her twins, to sophisticated tools used in classical forensic sciences. The hallmark of the
twenty-first century will be the application of cutting-edge technology to delineate, with
microscopic precision, what man has hitherto deduced through observation and instinct.
Today, biometrics is fast becoming a frontier science, having great relevance in the
channels of commerce, banking and trade, security, safety and authorization.
The applications of Biometrics are extensive but they can essentially be divided into the
following main groups:
(i) Commercial applications, such as computer network login, electronic data security,
e-commerce, Internet access, ATM, cellular phones, PDA, medical records etc.
(ii) Government applications, such as driver’s licenses, PAN and social security cards,
border and passport control etc.
(iii) Forensic applications, such as corpse identification, criminal investigation, terrorist
identification, parenthood determination, missing children etc.
The schemes often employed for such diverse applications include facial pattern
recognition, fingerprinting and hand geometry, voice signature identification, retinal
and iris scanning, DNA sequencing and signature identification among others. As
technology advances, it opens up a plethora of avenues to exploit. Identification systems
based on a person's vein patterns, ear shape, body odour and body salinity are already being explored.
The result is a future that promises a confluence of different biometric
technologies, integrated efficiently to deliver a reliable and secure system of ascertaining
an individual’s identity.
Index
• Physical Biometry
• Facial Recognition
• Fingerprint Recognition
• Iris Scan Biometry
• Retinal Scan Biometry
• Hand Geometry
• DNA Fingerprinting
• Behavioral Biometry
• Dynamic Signature Recognition
• Dynamic Keystroke Identification
• Speaker Recognition
• Future Biometry
• Vascular Pattern Authentication
• Body Odour Identification
• Body Salinity
• Ear Shape Identification
• Project Undertaken
Facial Recognition
(i) Original image (ii) Absolute difference from empty-room image summed over the three RGB
channels (iii) Resulting foreground segments
Using the difference from a background image identifies the foreground. The
background image can be either static or the result of adaptive estimation. For the
picture shown above, the algorithm subtracts the empty-room image, sums the RGB
channels and binarizes the result. Skin colour segmentation is applied on the
colour-normalised foreground segments by using a human skin colour model to produce the skin
likelihood map of the foreground. The likelihood map is then segmented into skin
segments; some heuristics discard those segments that are very unlikely to be human
faces.
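The subtraction-and-binarization step above can be sketched in a few lines; the toy 2×2 images and the threshold of 40 are illustrative assumptions, not values from the system described.

```python
import numpy as np

def foreground_mask(frame, background, threshold=40):
    """Sum absolute RGB differences against an empty-room image and binarize."""
    diff = np.abs(frame.astype(int) - background.astype(int)).sum(axis=2)
    return diff > threshold

# Tiny synthetic example: a 2x2 "empty room" with one changed pixel.
bg = np.zeros((2, 2, 3), dtype=np.uint8)
frame = bg.copy()
frame[0, 0] = [60, 50, 40]          # a foreground object appears at (0, 0)
mask = foreground_mask(frame, bg)   # only (0, 0) exceeds the threshold
```

In a real system the resulting mask would feed the skin-colour segmentation stage.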
(a) Skin colour likelihood map (b) 8-way connectivity skin segments
Eye detection in the eye zone: top-left, mean brightness across rows; bottom-right, mean
brightness across columns in the horizontal stripe of the eye zone.
coefficients, with modern systems even allowing for a tolerable amount of tilt.
Eye detection is a prerequisite for estimating the features of the face.
Subsequent procedures use probability-model-based algorithms that cut out and
characterize the features of the individual.
These features are used to compute distances from a frontal face prototype. The LDA step
dramatically reduces the number of false detections and thus enhances the performance.
Facial recognition: An LDA classifier, as shown above, finally processes the normalized
segments, and an identity tag is attached to each. The system can be trained to read and
track movements of the face, hence giving real-time operation. After collecting the
requisite samples, it searches for a template stored in the database. If a
match is found, the user's face is subsequently verified or authenticated.
Block diagram depicting a facial recognition system.
Another technology that is being applied in facial feature identification systems is
thermal imaging. Thermal imaging systems employ an infrared camera to capture
the pattern of blood vessels under the skin of one's face. The inherent
advantage of this system is that it can operate even in low light or complete darkness.
Fingerprint Recognition
A fingerprint is a reproduction of the fingertip epidermis, produced when the finger
is pressed against a smooth surface. Scientific studies on fingerprints started during the
19th century, establishing two fingerprint properties that are still accepted as true: high
uniqueness and high permanence. These studies led to the use of fingerprints for criminal
identification, first in Argentina in 1896, then at Scotland Yard in 1901, and in other
countries in the early 1900s.
Figure 1. (a) Fingerprint ridge and valley patterns; (b) ridge bifurcation (black box) and
ridge terminations (black circumferences)
Fingerprint recognition is a complex pattern recognition problem. Designing
algorithms capable of extracting salient features and matching them in a robust way is
quite hard, especially in poor quality images. The most evident structural characteristic of
a fingerprint is a pattern of interleaved ridges and valleys. Ridges vary in width from 100
μm for thin ridges to 300 μm for thick ridges. Generally, the period of a ridge/valley
cycle is about 500 μm. Ridges and valleys often run in parallel, and sometimes they can
suddenly come to an end (termination), or can divide into two ridges (bifurcation). Ridge
terminations and bifurcations are considered minutiae (small details). There are other
types of minutiae in a fingerprint, but the most frequently used are terminations and
bifurcations. Fig 1(a) shows a fingerprint, where it is possible to observe the ridges and
valleys, and Fig 1(b) shows the ridge bifurcations and terminations found in the
fingerprint area enclosed by the white rectangle.
Automatic fingerprint matching approaches are usually categorized as:
correlation-based matching, where two fingerprint images are superimposed and the
correlation between corresponding pixels is computed for different alignments; minutiae-
based matching, which consists of finding the alignment between the template and the
input minutiae sets that results in the maximum number of minutiae pairings; and ridge
feature-based matching, which compares fingerprints in terms of features extracted from
the ridge pattern, such as shape, orientation, and frequency.
Minutiae-Based Matching: Most fingerprint matching systems are based on matching
minutiae points between the query and the template fingerprint images. The matching of
two minutiae sets is usually posed as a point pattern matching problem and the similarity
between them is proportional to the number of matching minutiae pairs. The first stage of
the minutiae-based technique is minutiae extraction. Figure 2 shows a diagram of a
minutiae extraction algorithm, composed of five components: orientation field estimation,
fingerprint area location, ridge extraction, thinning, and minutiae extraction.
Figure 2. Minutiae extraction algorithm components.
A minutia in the input fingerprint I and a minutia mi in the template T are considered
"matching" if the spatial displacement between them is smaller than a given tolerance r0
and the direction difference between them is smaller than an angle tolerance θ0. Aligning
the two fingerprints is a mandatory step of the fingerprint matching in order to maximize
the number of matching minutiae. Correctly aligning two fingerprints requires geometrical
transformations such as rotation, displacement, scale, and other distortion-tolerant
transformations. After the alignment, a final matching score is computed by using the
maximum number of mated pairs.
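The tolerance test described above can be sketched as a greedy point-pattern matcher. This sketch assumes the fingerprints are already aligned; the minutia tuples, the r0 of 15 pixels and the θ0 of 20 degrees are illustrative values, not taken from the text.

```python
import math

def match_minutiae(template, query, r0=15.0, theta0=math.radians(20)):
    """Greedily pair minutiae (x, y, direction) whose displacement is under r0
    and whose direction difference is under theta0; each minutia pairs once."""
    used = set()
    pairs = 0
    for (x1, y1, a1) in query:
        for j, (x2, y2, a2) in enumerate(template):
            if j in used:
                continue
            dist = math.hypot(x1 - x2, y1 - y2)
            dangle = abs(a1 - a2) % (2 * math.pi)
            dangle = min(dangle, 2 * math.pi - dangle)   # wrap-around difference
            if dist < r0 and dangle < theta0:
                used.add(j)
                pairs += 1
                break
    return pairs

template = [(10, 10, 0.0), (40, 40, 1.0), (70, 20, 2.0)]
query    = [(12, 11, 0.1), (41, 38, 1.1), (5, 90, 0.0)]
score = match_minutiae(template, query)   # 2 of 3 minutiae pair up
```

The matching score would then be normalised by the number of minutiae to obtain a similarity value.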
Iris Scan Biometry
The capture process begins automatically when a subject approaches within 18" of the unit. The iris
contains many collagenous fibers, contraction furrows, coronas, crypts, color, serpentine
vasculature, striations, freckles, rifts, and pits. Measuring the patterns of these features
and their spatial relationships to each other provides quantifiable parameters useful
to the identification process.
Iris Definition: The image that best meets the focus
and detail clarity requirements of the system is then
analyzed to locate the limbus (the outer boundary of
the iris that meets the white sclera of the eye), the
nominal pupillary boundary, and the center of the
pupil. The precise location of the circular iris has
now been defined and processing can take place.
The pupil and the iris.
Field Optimization: A dynamic feature of the system automatically adjusts the width of the
pupillary boundary-to-limbus zone in real time to maximize the amount of iris analyzed,
using algorithms that exclude areas covered by eyelids, deep shadow, specular reflection,
etc. Elimination of marginal areas has little negative impact on the analysis process. In
the previous figure, points Cp and Ci are the detected centers of the pupil and iris
respectively. The intersection points of these wedges with the pupil and iris circles form a
skewed wedge polygon p1 p2 p3 p4. The skewed wedge is subdivided radially into N
blocks and the image pixel values in each block are averaged to form a pixel (j,k) in the
unwrapped iris image, where j is the current angle and k is the current radius number.
(a) Detected iris and pupil circles. (b) Iris extracted into 180 angle divisions, 73 radius divisions. (c) Iris
extracted into 128 angle divisions, 8 radius divisions.
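The wedge-based unwrapping can be approximated by sampling the pupil-to-limbus annulus on a polar grid. This sketch assumes concentric pupil and iris circles (the text's Cp and Ci are in general distinct) and samples one pixel per block rather than averaging; the synthetic image and radii are illustrative assumptions.

```python
import math
import numpy as np

def unwrap_iris(image, cp, pupil_r, iris_r, n_angles=128, n_radii=8):
    """Map the annulus between the pupil and limbus circles (both centred at cp)
    to an n_angles x n_radii rectangle: j is the angle index, k the radius index."""
    out = np.zeros((n_angles, n_radii))
    for j in range(n_angles):
        theta = 2 * math.pi * j / n_angles
        for k in range(n_radii):
            r = pupil_r + (iris_r - pupil_r) * (k + 0.5) / n_radii
            x = int(round(cp[0] + r * math.cos(theta)))
            y = int(round(cp[1] + r * math.sin(theta)))
            out[j, k] = image[y, x]
    return out

# Synthetic "iris": brightness grows with distance from the centre (32, 32).
yy, xx = np.mgrid[0:64, 0:64]
img = np.hypot(xx - 32, yy - 32)
strip = unwrap_iris(img, (32, 32), pupil_r=5, iris_r=25)   # shape (128, 8)
```

Because the synthetic brightness grows radially, each row of the unwrapped strip increases from the pupil side to the limbus side, as expected.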
Retinal Scan Biometry
A retinal scan requires close proximity to the scanner and no movement of the eye. The
examiner is required to keep the subject's eye within half an inch of the instrument. The
subject must focus on a pinpoint of green
light (to properly align the eye) and avoid blinking. Retinal scans involve a low-intensity
infrared light that is projected through to the back of the eye and onto the retina. Infrared
light is used because the blood vessels on the retina absorb it faster than the
surrounding eye tissue. The infrared light carrying the retinal pattern is
reflected back to a video camera, which captures the pattern and
converts it into data only 35 bytes in size.
Although retinal patterns are generally thought to be constant during a person's
life, they can change in case of diabetes, glaucoma, retinal degenerative disorders or
cataracts. Therefore, although retinal scans are nearly 100% accurate they cannot be used
as a universal security measure without making allowances for normal changes.
Hand Geometry
Hand geometry systems use an optical camera to capture two orthogonal
two-dimensional images of the palm and sides of the hand, offering a balance of reliability
and relative ease of use. They typically collect more than 90 dimensional measurements,
including finger width, height, and length; distances
between joints; and knuckle shapes.
Hand geometry technology possesses one of the
smallest reference templates in the biometric field,
generally under ten bytes.
The process involves matching a given hand to a person previously enrolled in the
system. From the snapshots of the hand, the average feature vector is computed. The
given feature vector is then compared with the feature vector stored in the database
associated with the claimed identity.
Hand Geometry Authentication
F = (f1, f2, ..., fd) represents the d-dimensional feature vector in the database and
Y = (y1, y2, ..., yd) is the feature vector of the hand whose identity has to be verified.
The verification is positive if the distance between F and Y is less than a threshold value.
Four distance metrics (absolute, weighted absolute, Euclidean, and weighted Euclidean),
corresponding to the following four equations, are explored:
Σj=1..d |Yj − Fj| < εa          (absolute)
Σj=1..d |Yj − Fj| / σj < εwa    (weighted absolute)
Σj=1..d (Yj − Fj)² < εe         (Euclidean)
Σj=1..d (Yj − Fj)² / σj² < εwe  (weighted Euclidean)
where σj² is the variance of the jth feature and εa, εwa, εe and εwe are threshold values.
Pegs as shown aid in the calculation of Euclidean distances
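A minimal sketch of the four distance metrics; the feature values, sigmas and acceptance threshold below are made up for illustration.

```python
# Feature vectors for a claimed identity (F, from the database) and the
# presented hand (Y); values and sigmas are illustrative, not real measurements.
F = [2.5, 5.5, 6.0]
Y = [3.0, 5.0, 7.0]
sigma = [0.5, 1.0, 2.0]          # per-feature standard deviations

d_a  = sum(abs(y - f) for y, f in zip(Y, F))                       # absolute: 2.0
d_wa = sum(abs(y - f) / s for y, f, s in zip(Y, F, sigma))         # weighted absolute: 2.0
d_e  = sum((y - f) ** 2 for y, f in zip(Y, F))                     # Euclidean: 1.5
d_we = sum((y - f) ** 2 / s ** 2 for y, f, s in zip(Y, F, sigma))  # weighted Euclidean: 1.5

accept = d_we < 2.0              # verification passes if under the threshold
```

Weighting by the per-feature deviation stops a naturally variable feature (say, finger width after exercise) from dominating the decision.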
DNA Fingerprinting
Isolation: DNA is extracted from the cells or tissues of the body. Only a small amount of tissue - like blood, hair, or skin - is needed.
Cutting, sizing, and sorting: Special enzymes called restriction enzymes are used to cut
the DNA at specific places. For example, an enzyme called EcoR1, found in bacteria, will
cut DNA only when the sequence GAATTC occurs. The DNA pieces are sorted
according to size by a sieving technique called electrophoresis. The DNA pieces are
passed through a gel made from agarose (a jelly-like product made from
seaweed), which is the biotechnological equivalent of screening
sand through progressively finer mesh screens.
DNA fingerprint: The final DNA fingerprint is built by using several probes (5-10 or
more) simultaneously. Thus, these fingerprints can be compared and matched with those
residing in the database.
This technology has been around for some time, especially in the field of criminal
forensic sciences and parenthood determination. As extraction methods continue to
advance, we shall soon see its application in commercial avenues as well.
Dynamic Signature Verification
Any process or transaction that requires an individual's signature is a prime
contender for signature identification. The major technological hurdle for signature
identification involves the method of trying to differentiate between the parts of the
signature that are habitual (consistent) and those that alter with each signing (behavioral).
Therefore signature identification systems analyze two different areas of an individual's
signature: the specific features of the signature and specific features of the process of
signing one's signature. Features that are taken into account and measured include
speed, pen pressure, directions, stroke length, and the points in time when the pen is
lifted from the paper. Human signatures, despite overall consistencies, do contain certain
variations. It is thus imperative to train the system to
account for this in order to build a prototype for the
database.
Training:
The system needs to extract a representation of the training set that will yield
minimum generalization error. DTW (Dynamic Time Warping) provides the optimal
alignment of two signatures. The prototype that represents the training set
is computed as the mean of the aligned signatures. The individual residual distances
between each of the signatures in the training set and this reference signature are
collected in order to estimate the statistics of the alignment process. These statistics are
subsequently used for classification.
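The DTW alignment underlying this training step can be sketched with the standard dynamic-programming recurrence; the one-dimensional pen-pressure sequences below are illustrative assumptions.

```python
def dtw(a, b):
    """Dynamic Time Warping distance between two 1-D sequences,
    e.g. pen-pressure samples from two signatures."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

ref    = [0, 1, 2, 3, 2, 1]
sample = [0, 1, 1, 2, 3, 2, 1]       # same shape, slightly stretched in time
d = dtw(ref, sample)                  # 0.0: the stretch is absorbed by warping
```

Because DTW absorbs timing variation, two genuine signatures signed at different speeds still yield a small residual distance.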
Verification:
Once this mean or prototype signature is ready, the signature is segmented into a
series of strokes, which are encoded to find a close match between the segments. Test
signatures can then be compared against template signatures, and if the distance is below a
signer-specific threshold, the signature is accepted.
Dynamic Keystroke Identification
Keystroke dynamics looks at the way a person types on a keyboard. Specifically,
keyboard dynamics systems measure two distinct variables: keystroke duration, which
is the amount of time you hold down a particular key, and keystroke latency, which is
the amount of time between keys. These systems scan for inputs a thousand times per
second. The user normally types out a string of data and his movements and traits are
monitored by the system. The aforementioned features are
extracted from the user's keystroke for the formation of a
template and later for verification.
Shown below is an example of the keystroke duration and
latency observed for the word 'IVAN'. The features extracted for
formation of the pattern form the Features Vector.
Extracted features of the Keystroke Dynamics for the word IVAN
Thus, a prototype is generated with the mean (μ), minimum or maximum and
standard deviation (σ), calculated for each feature (xi) of the pattern of size n, in accord
with the following equations:
Mean (μ) = (1/n) Σi=1..n xi
Standard deviation (σ) = (1/(n − 1)) Σi=1..n |xi − μ|
The Classifier is responsible for the process of authentication. It correlates the
pattern to be verified with the template of the prototypes using the distance
between the two vectors, calculated as:
D(pattern, prototype) = (1/n) Σi=1..n |patterni − prototypei| / σi
A favorable decision is made if this value is less than a predefined threshold.
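The prototype statistics and the distance classifier can be sketched directly from the equations above; the timing values and the threshold of 1.5 are illustrative assumptions.

```python
def prototype(samples):
    """Per-feature mean and mean-absolute deviation over n training patterns."""
    n = len(samples)
    d = len(samples[0])
    mu = [sum(s[i] for s in samples) / n for i in range(d)]
    sigma = [sum(abs(s[i] - mu[i]) for s in samples) / (n - 1) for i in range(d)]
    return mu, sigma

def distance(pattern, mu, sigma):
    """D(pattern, prototype): mean sigma-normalised absolute deviation."""
    n = len(pattern)
    return sum(abs(p - m) / s for p, m, s in zip(pattern, mu, sigma)) / n

# Three enrolment samples of (duration, latency, latency) features in ms.
train = [[100, 120, 90], [110, 118, 95], [105, 122, 85]]
mu, sigma = prototype(train)                 # mu = [105.0, 120.0, 90.0]
score = distance([104, 119, 92], mu, sigma)  # small: typing rhythm matches
accept = score < 1.5                         # assumed decision threshold
```

Normalising by σ makes the classifier forgiving on keys the user times inconsistently and strict on keys they time very consistently.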
Speaker Recognition
Speaker recognition, which can be classified into identification and verification,
is the process of automatically recognizing the speaker on the basis of individual
information encoded in speech waves. In speaker identification, the correct speaker is
determined from a given population; the test utterance is compared with the
reference models of the registered population. Speaker verification determines whether
the speaker is who he or she claims to be, so the test utterance is compared only with the
reference model of the claimed identity. Speaker identification can be text independent
or text dependent. Among various types of speech features, LPCC (linear prediction-
based cepstral coefficients) and MFCC (mel-frequency cepstral coefficients) have been
found to be superior for speaker recognition.
In LPCC, linear prediction coefficients are obtained for each frame using the Durbin
recursive method. These coefficients are then converted to cepstral coefficients. In
MFCC, a Fast Fourier Transform is computed for each frame and then weighted by a
Mel-scaled filter bank.
Speech Spectrograph
The filter bank outputs are then converted to cepstral parameters by applying the discrete
cosine transformation.
In text-independent speaker identification, given a set of registered speakers and a
sample utterance, open-set speaker identification is defined as a twofold problem. Firstly,
it is required to identify the speaker model in the set, which best matches the test
utterance. Secondly, it must be determined whether the test utterance has actually been
produced by the speaker associated with the best-matched model, or by some unknown
speaker outside the registered set.
Mathematically speaking, let N speakers be enrolled in the system with
statistical model descriptions λ1, λ2, ..., λN. If O denotes the feature vector sequence
extracted from the test utterance, then the open-set identification can be stated as:
Assign O to speaker m, where m = arg max(1≤k≤N) p(O | λk), if p(O | λm) > θ; otherwise declare O as originating from an unknown speaker. ……(1)
where θ is a pre-determined threshold. In other words, O is assigned to the
speaker model that yields the maximum likelihood over all other speaker models in the
system, if this maximum likelihood score itself is greater than the threshold θ. Otherwise,
it is declared as originated from an unknown speaker. It is evident from the above
description that, for a given θ, three types of error are possible:
• O, which belongs to λm, not yielding the maximum likelihood for λm.
• Assigning O to one of the speaker models in the system when it does not belong
to any of them.
• Declaring O which belongs to λm, and yields the maximum likelihood for it, as
originating from an unknown speaker.
These types of error are referred to as OSIE, OSI-FA and OSI-FR respectively (where
OSI, E, FA and FR stand for open-set identification, error, false acceptance and false
rejection respectively). Based on equation (1), it is evident that open-set identification is a
two-stage process. For a given O, the first stage determines the speaker model that yields
the maximum likelihood, and the second stage makes the decision to assign O to the
speaker model determined in the first stage or to declare it as originating from an
unknown speaker. Of course, the first stage is responsible for generating OSIE, whereas
both OSI-FA and OSI-FR are the consequences of the decision made in the second stage.
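The two-stage decision of equation (1) can be sketched as follows; the log-likelihood scores and the threshold are illustrative assumptions.

```python
def open_set_identify(scores, theta):
    """scores[m] = log-likelihood of utterance O under speaker model λm.
    Stage 1: pick the best-matching model; stage 2: accept it only if its
    score clears the threshold θ, else declare an unknown speaker (None)."""
    best = max(range(len(scores)), key=lambda m: scores[m])
    return best if scores[best] > theta else None

known   = open_set_identify([-12.0, -3.5, -9.1], theta=-5.0)   # speaker 1
unknown = open_set_identify([-12.0, -8.5, -9.1], theta=-5.0)   # None
```

An error in stage 1 is an OSIE; a wrong accept or reject in stage 2 is an OSI-FA or OSI-FR respectively.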
Vascular Pattern Authentication
The vein pattern of the hand is captured, digitized through image
processing and registered. The vein pattern of the person being authenticated is then
verified against the pre-recorded pattern.
Body Odour Identification
In the case of human recognition, the goal of an electronic nose is to identify an
odorant sample and to estimate its concentration. It is essentially a signal
processing and pattern recognition system; those two steps may be
subdivided into preprocessing, feature extraction, classification, and decision-making.
But first, a database of expected odorants must be compiled and the
sample must be presented to the nose's sensor array. Currently, a U.K.-based
company, Mastiff Electronic Systems, is said to be in the development stages of a
product that digitally sniffs the back of a computer user's hand to verify identity.
ENose device
Body Salinity
An existing system, developed jointly by MIT and IBM, works by exploiting the
natural level of salinity in the human body. An electric field passes a tiny electrical
current (of the order of 1 nA) through the body (salt is an effective conductor).
Applications of this kind of biometric technology could include data transfer between
communication devices carried on the body including watches, mobiles and
pagers. Applications could also include "waking up" household appliances as one enters a
room.
Ear Shape Identification
Ear images can be acquired in a similar manner to face images, and a number of
researchers have suggested that the human ear is unique enough to each individual to be
of practical use as a biometric. There are two major parts to the system: automatic ear
region segmentation and 3D ear shape matching. Starting with the multi-modal 3D+2D
image acquired in a profile view, the system automatically finds the ear pit by using skin
detection, curvature estimation and surface segmentation and classification. After the ear
pit is detected, an active contour algorithm using both color and depth information is
applied, and the contour expands to find the outline of the visible ear region.
Project Undertaken
Optical signature identification system: an optical sensor feeds a processing unit; the
processed signature is compared against templates held in the storage unit to produce an
identification.