
FACE RECOGNITION UNDER POSE AND EXPRESSIVITY

VARIATION USING THERMAL AND VISIBLE IMAGES

Florin Marius Pop, Mihaela Gordan, Camelia Florea, Aurel Vlaicu


Centre for Multimedia Technologies and Distance Education
Technical University of Cluj-Napoca, Romania
florinmarius.pop@gmail.com, {Mihaela.Gordan, Camelia.Florea, Aurel.Vlaicu}@com.utcluj.ro

ABSTRACT
Many existing works in face recognition are based solely on visible images. The
use of bimodal systems based on visible and thermal images is seldom reported in
face recognition, despite the advantage of combining the discriminative power of
both modalities under expression or pose variations. In this paper, we investigate
the combined advantages of thermal and visible face recognition in a Principal
Component Analysis (PCA) induced feature space, with PCA applied on each
spectrum, on a relatively new thermal/visible face database – OTCBVS – for large
pose and expression variations. The recognition is done through two fusion
schemes, based on k-Nearest Neighbors classification and on Support Vector
Machines. Our findings confirm that the recognition results improve over the
classical approaches on visible images alone with the aid of thermal images, when
a suitably chosen classifier fusion is employed.

Keywords: face recognition, fusion scheme, thermal images, PCA, k-NN, SVMs.

1 INTRODUCTION

Face recognition research is experiencing a major development due to its potential for integration into multiple applications, such as commercial or military systems. Face recognition applications are used for access control to high-level security areas or for video surveillance in important security or commercial areas like airports and casinos. The main advantage of biometric systems based on the recognition of the human face is the non-intrusive information acquisition. Other biometric systems based on physiological features (e.g. fingerprint or iris) require the cooperation of the tested subjects, a scenario that is not always possible.

Face recognition techniques can be classified into two main categories [1]: analytic or feature-based techniques, and holistic or appearance-based techniques. The analytic techniques extract certain geometrical face features (e.g. the eyes, the nose or the mouth) and compare these features. Those approaches have the disadvantage of not being robust: their performance is highly affected by facial expressions or other natural changes, and they generally have a high computational cost due to the local feature extraction involved.

Thus, the holistic or appearance-based methods were proposed. These methods have the advantage of a lower computational complexity. The holistic methods are based on techniques that transform the face image into a low-dimensional feature space with enhanced discriminatory power.

1.1 Visible spectrum face recognition based on Principal Component Analysis

One of the first holistic methods for face recognition is based on Principal Component Analysis (PCA) for feature extraction [2]; its flowchart is illustrated in Fig. 1. The good results provided by the classical PCA approach proposed by Turk and Pentland [2] have led to other successful approaches based on PCA. Recent improvements over the classical PCA-based method in the visible spectrum use the discriminative power of the wavelet transform sub-bands [3, 4], more complex classifiers such as Support Vector Machines (SVMs) [5], neural networks [6] and k-Nearest Neighbors (k-NN) [7], or improved PCA techniques: (2D PCA)2 [8] and Weighted Modular PCA [9].

Figure 1: Flowchart of the classical PCA-based face recognition methods

The performance of PCA-based methods is mostly affected by intra-personal variations under illumination, pose and/or expressivity, which degrade the recognition performance, sometimes to a higher degree than the inter-personal variations [10]. Most of the improvements over the classical PCA-based method in the visible spectrum optimize the computational cost or improve the recognition rate for small intra-personal variations. However, the performance in the visible spectrum remains highly affected by illumination, expressivity and pose variations, which explains the current interest in the use of thermal images for face recognition [11, 12, 13].

1.2 Thermal spectrum face recognition based on Principal Component Analysis

Infrared (IR) thermal images represent the patterns of the heat emitted by the human body and are considered a unique feature of each individual [14]; it is known that even identical twins have different thermal patterns. Also, IR thermal face images are nearly invariant to illumination and less variant to expressivity than visible images. Earlier approaches were based on determining the thermal shape of the face and used it directly for identification [15]. PCA-based techniques can also be successfully applied to face recognition on IR thermal images [16, 17, 18]. While the visible spectrum eigenfaces contain mostly low-frequency information, the corresponding IR thermal eigenfaces have fewer low-frequency components and many more high-frequency components [15]. Therefore, the majority of the information in IR thermal images is distributed in a lower-dimensional subspace. Even if the results using IR thermal images are promising, their performance is negatively affected by the ambient temperature, by some emotional states of the tested subject, or by wearing glasses; for example, glasses block most of the thermal energy emitted by the human face. Thus, the IR thermal based recognition rates are in most cases lower than in the visible spectrum [17].

However, since IR thermal imaging and visible imaging bring complementary information to face recognition, the joint use of the two modalities is appealing for face recognition applications.

1.3 Fusion schemes for face recognition based on visible and thermal spectrum images

As presented above for visible spectrum and thermal spectrum images, every biometric feature presents advantages, but also disadvantages, in the recognition process. Thus, researchers have recently shown a growing interest in multimodal biometric systems, which have become the second most used biometric systems after those based on the fingerprint [19]. Most of the multimodal systems combine the advantages of visible face images and the fingerprint [20, 21, 22, 23], which offers performance superior to the face recognition approaches due to the accuracy of fingerprints. However, these approaches have the disadvantage of being intrusive, due to the fingerprint acquisition.

The necessity of non-intrusive data acquisition for recognition methods has led to recent approaches [24, 25, 26, 27, 28] that consider fusion schemes based on visible spectrum images and IR thermal images, carrying the advantage that both biometric features are non-intrusive. These bimodal fusion schemes exploit the advantages of visible and IR thermal images and try to compensate the individual drawbacks of each single modality. A multimodal system based on 2D images extracted from the visible spectrum, the 3D model of the face and IR thermal images of the same subjects is proposed in [24]. The metric fusion scheme is based on the product rule, obtaining the following recognition rates: 98.7% for the 2D-3D fusion scheme, 96.6% for the 2D-IR fusion scheme and 98% for the 3D-IR fusion scheme. Also, considering a metric fusion scheme based on the product rule over all three biometric features (2D, 3D, IR thermal images), the recognition rate achieved was 100%. The approach was tested on a particular image set, since there is no standard database of 2D, 3D and IR thermal images of the same subjects. In [25], another approach based on a special type of Convolutional Neural Network is presented, built on the diabolo network model [29], for automatic feature extraction from both visible and IR thermal images. The recognition rate in the IR spectrum was the lowest, but the rates obtained from the fusion scheme improved over those obtained from the visible or IR thermal images alone. All the experiments in [25] use the “Notre Infrared Face Database (X1 Collection)” [30]. Two fusion schemes were proposed by Singh [27] in order to improve recognition rates in occlusion scenarios caused by eyeglasses: an image-based fusion performed in the wavelet domain, and a feature-based fusion in the PCA induced feature space. The results show improvements in the recognition rate on the “Equinox Infrared Face Database” [31]. A decision fusion scheme based on a voting scheme for IR and visible face recognition was proposed by Shahbe and Hati [26]. In their approach, the eigenface and fisherface classification techniques are applied for extracting the face features, and the experimental results are obtained using the “Equinox Infrared Face Database” and the “OTCBVS Thermal/Visible Face Database” [32]. Also, in [28] an integrated image fusion and match score fusion approach is proposed. The Discrete Wavelet Transform and a 2ν-Granular Support Vector Machine are used for the fusion of the visible and thermal spectrum face images. A 2D log-polar Gabor transform is applied for extracting the global and local facial features, and a match score fusion based on Dezert-Smarandache theory [33] is proposed for improving the results over the classical unimodal approaches and over a few classical fusion-based face recognition systems. The results were validated using the “Notre Infrared Face Database (X1 Collection)” and the “Equinox Infrared Face Database”.
In this work, we propose to evaluate the performance of a rather classical approach, based on the application of the eigenfaces method on visible and thermal infrared spectrum face images, and we propose two fusion methods. The first method is based on the fusion of the classical PCA method's results in the visible and thermal spectrum using a k-Nearest Neighbor classifier, and the second method is based on the fusion of the feature vectors in the PCA induced space using SVMs for classification. All the scenarios tested in our approaches follow the variation of pose and expressivity. Both approaches are tested on the “OTCBVS Thermal/Visible Face Database”. Unlike the method of Shahbe and Hati mentioned above, we propose two new fusion schemes: a new score fusion generation method using nearest neighbor classification to increase the confidence in the recognition results, and a features fusion scheme applied directly in the PCA induced space. A complete validation with respect to pose and expression variation in the face images has been performed in this paper for the first method, unlike the previous works on the OTCBVS database. Our approach provides an automatic procedure to select the optimal value of the score fusion weight α between the two modalities so as to maximize an estimate of the recognition accuracy. For every approach the optimal value of the fusion weight is different, but nearly equally improved results are obtained for a single value of the weight. The proposed approaches to IR and visible face image fusion and their performance in face recognition are presented in the following sections of this paper: in Section II we briefly review the basic PCA approach and a few theoretical issues about identity classification using the k-NN and SVM classifiers. Section III presents our fusion based approaches. Section IV summarizes the experiments and the results obtained. Finally, the last section contains the conclusions.

2 THEORETICAL BACKGROUND

2.1 The basic PCA approach to face recognition

PCA is often used in many forms of data analysis, from neuroscience to computer graphics, being a simple non-parametric method which extracts the relevant information from large data sets. PCA offers a solution for reducing a complex data set to a lower-dimensional one with a good representation of the information for discriminative feature selection. The advantages of PCA were first explored by Turk and Pentland in their face recognition method [2]. For that, a training face image database is needed. The selection of the training face image set is crucial for the face recognition performance, as the images must be representative enough for the given classification problem.

When we apply PCA, the most significant eigenvectors are extracted from the training database, and they define a lower-dimensional subspace. They are also known as “the eigenfaces” [2] due to their graphical representation. After computing the eigenfaces, any face image is represented by a feature vector in the subspace determined by the eigenfaces. A short summary of the algorithm [2] is presented below.

For a grey-level N × N image, we consider it as an N² one-dimensional vector. Let X be a matrix of S columns and N² rows, where S represents the number of training images, which can be used to represent the entire training set of face images:

X_{N^2 \times S} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1S} \\ x_{21} & x_{22} & \cdots & x_{2S} \\ \vdots & \vdots & \ddots & \vdots \\ x_{N^2 1} & x_{N^2 2} & \cdots & x_{N^2 S} \end{bmatrix}   (1)

The goal of PCA is to derive another matrix P which describes a linear transformation of every column of X (every training face) into the eigenfaces subspace, in the form W = PX, where W contains the projections of the training facial images on the subspace described by the eigenfaces. The rows of the matrix P represent the principal components and they are orthonormal.

The computation of the matrix P requires the following:

• Find the mean face vector M, as in (2), where Xi (i = 1, 2, …, S) represents the i-th column of X:

M = \frac{1}{S} \sum_{i=1}^{S} X_i   (2)

• Subtract the mean face M from each training face Xi:

H_i = X_i - M   (3)

• Compute the covariance matrix C_A, where A = [H_1, H_2, …, H_S]:

C_A = \frac{1}{S-1} A A^T   (4)

The largest Q eigenvectors of C_A (where Q is determined based on some threshold on the eigenvalues) are the vectors of the best basis for the training set X. Every eigenvector represents a column of the matrix P[N²×Q].
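A minimal NumPy sketch of steps (2)–(4) and the subsequent projection; the function names are ours, and the S×S eigendecomposition shortcut (standard for eigenfaces, since S is much smaller than N²) replaces the direct diagonalization of the huge covariance matrix:

```python
import numpy as np

def compute_eigenfaces(X, Q):
    """Eigenfaces per Turk & Pentland: X is (N^2, S), columns are
    vectorized training faces. Returns the mean face M and the
    projection matrix P (Q, N^2) whose rows are the top-Q eigenfaces."""
    S = X.shape[1]
    M = X.mean(axis=1, keepdims=True)          # mean face, Eq. (2)
    A = X - M                                  # centred faces H_i, Eq. (3)
    # Instead of the (N^2 x N^2) covariance of Eq. (4), diagonalize the
    # small (S x S) matrix A^T A: its eigenvectors v map to eigenvectors
    # A v of A A^T with the same nonzero eigenvalues.
    L = A.T @ A / (S - 1)
    w, v = np.linalg.eigh(L)                   # ascending eigenvalues
    idx = np.argsort(w)[::-1][:Q]              # keep the Q largest
    U = A @ v[:, idx]                          # eigenfaces in image space
    U /= np.linalg.norm(U, axis=0)             # orthonormal columns
    return M, U.T                              # rows of P = principal components

def project(P, M, I):
    """Projection of image vector I onto the eigenface subspace, Eq. (5)."""
    return P @ (I - M.ravel())
```

The returned P can then be applied to any test face vector via `project`.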
The representation of any image I (represented as a vector of length N²) in the subspace described by the Q eigenvectors is given by the vector W_I (of length Q):

W_I = P(I - M)   (5)

For every image, a projection onto the subspace is extracted. In all the classical approaches on the visible spectrum, the projection of the test image is compared with each training projection. The decision process is based on a classifier in the projection space and it returns the identity of a person. In the basic approach of Turk and Pentland [2], a simple minimum Euclidean distance classifier is used.

2.2 k-NN classifier

k-NN is a supervised classification method in a given feature space – in our case, the eigenfaces space extracted with PCA. To build the k-NN classifier, one simply needs to define the number of classes C, to form a labeled training set of Ntrn samples in the feature space, with class labels yi = 1..C, and to consider the labeled samples in the training set as known prototypes of the C classes. Furthermore, a distance norm d in the feature space must be chosen for the data classification (e.g. the Euclidean distance). Suppose that a vector W from the feature space must be classified into one of the C classes. By k-NN, W is classified by a majority vote of its k nearest neighbors in the sense of the chosen distance norm, being assigned to the class most common amongst them.

The algorithm follows these steps:

• Compute the distances d(W, Wt,j) for each prototype (denoted by Wt,j, j = 1, 2, …, Ntrn) from the training set.

• Sort the distances d(W, Wt,j), j = 1..Ntrn, increasingly, and keep the labels of the first k prototypes (found at the first k smallest distances from W): {y1’, y2’, …, yk’}.

• Assign to W the label yl’ that is most frequent in the sorted class array {y1’, y2’, …, yk’}.

In this paper, for the score fusion approach, we use a k-NN classifier with the Euclidean distance, and examine four values of k: 1, 3, 5 and 7.
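The three steps above can be sketched in a few lines of Python; the function name is ours and the implementation is a plain illustrative version, not the paper's code:

```python
import numpy as np
from collections import Counter

def knn_classify(W, prototypes, labels, k=3):
    """Classify feature vector W by majority vote among its k nearest
    prototypes under the Euclidean distance, following the steps above.
    prototypes: (Ntrn, Q) array of training projections;
    labels: length-Ntrn sequence of class labels."""
    d = np.linalg.norm(prototypes - W, axis=1)    # step 1: distances
    nearest = np.argsort(d)[:k]                   # step 2: k smallest
    votes = Counter(labels[i] for i in nearest)   # step 3: majority vote
    return votes.most_common(1)[0][0]
```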
2.3 SVM classifier

Support vector machines (SVMs) are powerful classifiers from the machine learning class of techniques, with very good recall and generalization performance, able to learn their decision function from sparse and relatively small training sets [34, 35, 36]. In an SVM, a binary classification problem is solved through the optimal separating hyperplane principle. This means that, in the training phase, the SVM “learns” the parameters of a separating hyperplane H, such that H separates with minimal error and maximal margin the data from the two classes. In the SVM formulation, the two classes are called “the positive samples class” and “the negative samples class”. In a pattern classification problem, the positive samples class contains the objects of interest, which we aim to identify, and the label +1 is assigned to these objects. The negative samples class contains anything but the objects of interest, and all these patterns are assigned the label −1.

Let {xi, yi}, i = 1, 2, …, Ntrn, denote Ntrn training examples, where xi comprises an N-dimensional pattern and yi is its class label. We confine ourselves to the two-class pattern recognition problem, that is, yi ∈ {−1, +1}; yi = +1 is assigned to positive examples, whereas yi = −1 is assigned to counter-examples.

The data to be classified by the SVM might or might not be linearly separable in their original domain. If they are separable, then a simple linear SVM can be used for their classification; when the data cannot be separated by a hyperplane in the original domain, we may use non-linear SVMs, by projecting the data into a higher-dimensional Hilbert space using kernel functions and linearly separating the samples in that higher-dimensional space. In our experiments, we have used for now a linear SVM, whose decision function f: ℝ^N → ℝ is of the form:

f(x) = \operatorname{sign}\left( \sum_{i=1}^{N_{trn}} \alpha_i y_i x^T x_i + b \right)   (6)

where αi are the nonnegative Lagrange multipliers associated with the quadratic optimization problem that aims to maximize the distance between the two classes measured in ℝ^N.

Our particular task, i.e. face recognition, is implicitly a multi-class classification task, therefore it requires a multi-class classifier. The two main strategies in the literature for building multi-class SVM classifiers starting from binary SVM classifiers are one-against-one classification and one-against-all classification [37]. In the one-against-all approach, one binary SVM classifier is constructed for each class, having as positive examples the ones from the class, and as negative examples all the others from the data set. However, this approach may exhibit the disadvantage of an unbalanced training set with respect to the ratio of positive and negative examples. The other approach, namely one-against-one, avoids this drawback; in this case, an SVM classifier is built for each pair of classes, and in the end the class label is decided by the majority vote over all the SVM classifiers. This is the approach we adopted in our face recognition experiment.
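The one-against-one strategy described above can be exercised with scikit-learn's `SVC`, which trains one binary classifier per pair of classes and decides by voting; the library choice and the synthetic feature vectors below are our assumptions, not the paper's setup:

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic stand-ins for PCA projections of C = 3 subjects, 6 samples
# each, drawn around well-separated cluster centres.
rng = np.random.default_rng(0)
features = np.vstack([rng.normal(c, 0.1, size=(6, 5)) for c in (0.0, 1.0, 2.0)])
identities = np.repeat([0, 1, 2], 6)

# Linear SVM with one-against-one multi-class decomposition.
clf = SVC(kernel="linear", decision_function_shape="ovo")
clf.fit(features, identities)
print(clf.predict(rng.normal(1.0, 0.1, size=(1, 5))))  # expected: subject 1
```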
Figure 2: Flowchart of the score fusion scheme proposed. (A) PCA-based feature extraction in visible spectrum;
(B) PCA-based feature extraction in thermal spectrum; (C1) Score fusion scheme.

3 BIMODAL SCORE FUSION SCHEME PROPOSED

The main drawback of the PCA based face recognition techniques in the visible spectrum is their sensitivity to pose and expression variance, whereas in the IR spectrum it is the sensitivity of the thermal images to the ambient temperature and the emotional state of the subject. Therefore, we propose two fusion schemes between the visible and the IR thermal face recognition to improve the global performance by minimizing these negative effects. Both schemes are based on reducing the complex data to a lower-dimensional representation in the PCA induced feature space with a good representation of the information for discriminative feature selection. After that, two different fusions are proposed: in the first method we propose a score fusion scheme, and in the second method a features fusion scheme.

3.1 Computing the eigenfaces from the visible images

As illustrated in Fig. 2 and in Fig. 3, inside the first main block of the figures (denoted by A), the eigenfaces and the projections of the training and test images in the PCA induced feature space are computed first, both in the score fusion scheme and in the features fusion scheme. Thus, every facial image acquired in the visible spectrum is represented in the subspace of the eigenfaces.

3.2 Computing the eigenfaces from IR thermal images

Using the same principle as in the visible spectrum, the eigenfaces in the IR spectrum are computed, as illustrated in blocks B of Fig. 2 and Fig. 3, and the projections of the IR thermal images on the IR thermal eigenfaces subspace are also computed.

3.3 The fusion schemes

Every multimodal biometric system requires an integrating rule for the fusion of the different types of feature data extracted. The fusion schemes can be classified into: fusion at image level [23, 27], fusion at feature level [20, 27], fusion at matching score level [17, 21, 24, 38] and fusion at decision level [22].

In our paper, we propose two types of fusion in order to maximize the face recognition efficiency in the PCA induced feature space. The first fusion scheme, illustrated in block C1 of Fig. 2, is a score fusion scheme based on the Euclidean distance and the k-NN classifier. The second fusion scheme, illustrated in block C2 of Fig. 3, is a features fusion scheme based on an SVM classifier. A brief review of the proposed fusion schemes is presented below.

3.3.1 The score fusion scheme

As illustrated in Fig. 2, the first step in our score fusion scheme is to compute the Euclidean distance between the projection of the test image and the projections of every training image in the visible eigenfaces space (block A in Fig. 2), and to repeat the same operations for the projections of the facial images in the thermal spectrum (block B in Fig. 2).

The score fusion scheme, illustrated in block C1 of Fig. 2, is applied on the previously computed Euclidean distances, with the purpose of maximizing the face recognition efficiency for all subjects by using the two modalities. In our approach we propose a fusion scheme at the matching score level, by introducing the weighted distance between the visible spectrum images and the thermal images. The weighted distance d_w(x, x_t) between a bimodal pair of test images x = {I_V, I_IR} and a pair of training face images x_t = {I_t,V, I_t,IR} is computed as:

d_w(x, x_t) = \alpha \cdot d(I_V, I_{t,V}) + (1 - \alpha) \cdot d(I_{IR}, I_{t,IR})   (7)

where d denotes the Euclidean distance in the PCA induced subspace.
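A minimal sketch of Eq. (7) followed by the k-NN decision on the fused distances; the variable and function names are ours, assuming the PCA projections have already been computed for both modalities:

```python
import numpy as np

def weighted_score_fusion(wv_test, wir_test, wv_train, wir_train,
                          labels, alpha, k=1):
    """Score fusion of Eq. (7): d_w = alpha*d_visible + (1-alpha)*d_thermal,
    followed by k-NN majority vote on the fused distances.
    wv_*/wir_* hold PCA projections (rows = training samples)."""
    d_v = np.linalg.norm(wv_train - wv_test, axis=1)     # visible distances
    d_ir = np.linalg.norm(wir_train - wir_test, axis=1)  # thermal distances
    d_w = alpha * d_v + (1 - alpha) * d_ir               # Eq. (7)
    nearest = np.argsort(d_w)[:k]
    vals, counts = np.unique([labels[i] for i in nearest], return_counts=True)
    return vals[np.argmax(counts)]
```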
Figure 3: Flowchart of the features fusion scheme proposed. (A) PCA-based feature extraction in visible
spectrum; (B) PCA-based feature extraction in thermal spectrum; (C2) Features fusion scheme.

The weight α (between 0 and 1) in the computation of d_w(x, x_t) is determined based on a validation set, so as to maximize the face recognition rate. On the weighted distances, a k-NN classification is applied to obtain the recognition result.

3.3.2 The features fusion scheme

The features fusion scheme is illustrated in Fig. 3, where block A shows the feature extraction in the visible spectrum, block B the feature extraction in the thermal spectrum, and block C2 the features fusion scheme. The purpose of the scheme is to fuse the projections of the visible facial image I_V and the projections of the corresponding thermal spectrum facial image I_IR in a β-weighted combination, where the weight β should be chosen to maximize the recognition rate of the multi-class linear SVM classifier over some validation set of examples. Let us denote the recognition rate of this classifier, depending on β, by rate(β):

rate(β) = (Number of correctly classified instances in the validation set) / (Total number of instances in the validation set).

Then the feature vector is described by Eq. (8) and the optimal value of the weight β is given by Eq. (9):

x_{FeatureFusion}(\beta) = [\beta \cdot I_V ; (1 - \beta) \cdot I_{IR}]   (8)

\beta^* = \arg\max_{\beta} rate(\beta)   (9)

Figure 4: Samples of different subjects from OTCBVS; (A) Visible spectrum images; (B) IR thermal images
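Equations (8) and (9) amount to a weighted concatenation plus a grid search over β; the following sketch uses our own names, with the validation-set recognition rate supplied as a callable:

```python
import numpy as np

def feature_fusion(w_visible, w_thermal, beta):
    """Fused feature vector of Eq. (8): concatenation of the beta-weighted
    visible projection and the (1-beta)-weighted thermal projection."""
    return np.concatenate([beta * w_visible, (1 - beta) * w_thermal])

def select_beta(rate, betas=np.linspace(0.0, 1.0, 101)):
    """Eq. (9): pick the beta maximizing the validation recognition rate.
    `rate` is a callable implementing rate(beta) on the validation set."""
    return max(betas, key=rate)
```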

4 EXPERIMENTS AND RESULTS

In order to test our fusion approach, we use the OTCBVS benchmark [32]. For both of the proposed fusion schemes, we established a small training set and a validation set to derive the optimal weight for the fusion (α or β), and several test sets for expression and pose variations.

Figure 5: Example of a subject acquisition under pose variation; (A) Visible spectrum images; (B) IR thermal images

4.1 The face database and the evaluation design

In our experiments, we use the OTCBVS benchmark [32], which contains 4228 pairs of visible and IR thermal images. A few samples are illustrated in Fig. 4. The images are acquired under illumination, expressivity (“surprised”, “laughing”, “angry”) and pose (11 positions for each type of acquisition, as illustrated in Fig. 5) variation. We selected the OTCBVS benchmark since its images simulate the main variations of real scenarios.

In our tests, no preprocessing is performed on the images, in order to test the discriminative power of the IR and visible spectra in the individual PCA based face recognition methods and in our fusion based approach. Preprocessing techniques would indirectly increase the recognition rate. The data used for the experiments is divided into three main sets: training, validation and test sets.
Efforts were made, in testing the proposed fusion schemes, to reflect the performance of real-world scenarios with high variations of pose and expressivity. Most of the results reported in the literature are obtained on experimental images with nearly frontal poses, preprocessed (manually or automatically) for localization, scale or rotation of the human faces, which is not practical in most face recognition applications.

4.2 The training set

We include in the training set a total of 12 nearly frontal images for each subject: 6 images in the IR spectrum and 6 images in the visible spectrum. For each spectrum there are 3 images under the “surprised” expressivity (poses 4, 6 and 8) and 3 images under the “laughing” expressivity (also poses 4, 6 and 8). C classes are defined from the training set, and each class contains 6 projections in the eigenfaces subspace for each spectrum.

4.3 The validation set

After computing the projections of the training images in the eigenfaces subspace, we must determine the optimal values of the fusion parameters α (weight of the score fusion scheme) and β (weight of the features fusion scheme). This is done by tuning the weights to maximize the recognition performance on a validation set. The validation set includes images with the “angry” expressivity under poses 5 and 7.

The recognition rates on the validation set, for values of α ranging from 0 (IR modality only) to 1 (visible modality only), are illustrated in Fig. 6 for the score fusion scheme. In Fig. 6, the classical PCA based recognition rate in the visible spectrum is drawn with the dotted blue line, and that in the IR spectrum with the dotted green line. According to the validation set, the weight of the visible image score in the score fusion scheme is chosen as the minimum weight, 82% (i.e. α = 0.82), that maximizes the recognition rate. It is important to remark from Fig. 6 that for a large set of α values, from 0.2 to 0.98, the performance of the bimodal approach is higher than the performance of the classical PCA approach on visible images, and that for every α between 0.01 and 1 the bimodal performance is higher than that of the classical PCA approach on IR thermal images.

For the features fusion scheme, many values of β offer superior results to a classical PCA based approach on the visible-only or thermal-only spectrum with a linear SVM classifier. Values of the features fusion weight which maximize the recognition results are, for example, 0.7, 0.72 or 0.8. The value β = 0.8 is used in the performance evaluation tests to compare all the results from both of the proposed fusion schemes.

Figure 6: Results on the validation set for various α values in the score fusion scheme.

4.4 Performance evaluation test

The next experiments aim to evaluate the performance of our bimodal fusion based approaches. To exhibit the superiority of the bimodal systems, the single-modality performances on the visible images and on the IR thermal images are included for comparison. The performances are evaluated by means of k-NN classification (with k = 1, 3, 5 and 7) for the score fusion scheme and SVM classification for the features fusion scheme. A particular experiment is the k-NN based only on the first neighbor, i.e. k = 1, and only on the visible spectrum images, i.e. α = 1, which is the classical PCA-based method [2], but applied on the OTCBVS database.

The first evaluation experiment is considered an expressivity test. The test set consists of images with the “angry” expressivity, a different expressivity than those from the training set, and the same nearly-frontal poses: 4, 6 and 8.
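The automatic selection of α described above (the smallest weight maximizing the validation recognition rate) can be sketched as a grid search over precomputed distance matrices; the names and the 1-NN simplification are ours:

```python
import numpy as np

def recognition_rate(alpha, d_v, d_ir, train_labels, true_labels):
    """Fraction of validation samples whose nearest training sample under
    the fused distance of Eq. (7) has the correct identity (1-NN).
    d_v / d_ir: (n_val, n_train) visible / thermal distance matrices."""
    fused = alpha * d_v + (1 - alpha) * d_ir
    predicted = train_labels[np.argmin(fused, axis=1)]
    return float(np.mean(predicted == true_labels))

def select_alpha(d_v, d_ir, train_labels, true_labels, step=0.01):
    """Choose the smallest alpha on the grid that maximizes the
    validation recognition rate (argmax returns the first maximizer)."""
    grid = np.arange(0.0, 1.0 + step, step)
    rates = [recognition_rate(a, d_v, d_ir, train_labels, true_labels)
             for a in grid]
    return grid[int(np.argmax(rates))]
```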

Table 1: Description of the image sets for the performance evaluation tests.

         Expr.                        Pose
Test 1   angry                        4, 6, 8
Test 2   surprised, laughing          3, 5, 7, 9
Test 3   angry                        3, 9
Test 4   surprised, laughing, angry   5, 7
Test 5   surprised, laughing, angry   1, 2, 3, 5, 7, 9, 10, 11
Test 6   surprised, laughing, angry   2, 3, 9, 10
Table 2: Recognition rates for the score fusion scheme on OTCBVS (V – visible spectrum; IR – thermal IR spectrum; Bi – bimodal score fusion scheme).

           1-NN                 3-NN                 5-NN                 7-NN
           V     IR    Bi       V     IR    Bi       V     IR    Bi       V     IR    Bi
Test 1(%)  91.66 73.80 96.42    85.71 73.80 94.04    82.14 70.23 85.71    79.76 65.47 86.90
Test 2(%)  95.08 87.05 98.21    90.62 78.57 94.64    83.48 75.00 91.07    76.33 64.73 83.03
Test 3(%)  89.28 62.50 92.85    82.14 51.78 87.50    73.21 44.64 78.57    64.28 37.50 71.42
Test 4(%)  99.10 92.85 100      99.10 86.60 99.10    97.32 84.82 99.10    89.28 77.67 96.42
Test 5(%)  69.10 53.72 74.85    64.13 48.21 70.53    61.01 43.75 66.51    55.65 36.16 59.07
Test 6(%)  75.89 56.25 82.73    66.36 48.80 75.29    61.30 42.55 69.34    55.65 33.03 58.33

The second evaluation experiment is a pose test, where the test set consists of images with the same expressivity as those in the training set (“surprised” and “laughing”) but with different selected poses (3, 5, 7 and 9). The third test set includes images with a different expressivity (“angry”) and also with different poses (3 and 9) from those in the training or validation sets. Finally, we evaluate the performance of our approaches on three further test sets that include the largest expressivity and pose variations in the data set, with extreme pose variations (such as pose 1 or 11) and all the expression variations. The test image sets used in our experiments are described in Table 1.

4.5 Results

The performances of each individual modality and of our fusion-based approaches are given in Table 2 for the score fusion scheme and in Table 3 for the features fusion scheme. As can be seen, in every test and for both fusion schemes, the recognition rate for the IR spectrum is significantly lower than for the visible spectrum, partly because the acquired IR thermal images vary more with respect to pose and even rotation than the images from the visible spectrum. It is expected that preprocessing both the visible and the IR images would partially remove this significant difference in the results.

For the score fusion scheme, it can be seen that for 1-NN classification our approach obtains superior results in all the tests, and is the only one that achieves a 100% recognition rate in some tests (i.e. Test 4). The recognition rate of our approach is lower under expressivity variation than under pose variation, mainly due to the training set selected. For extremely high pose variation (i.e. poses 1 and 11), all the rates are expected to be lower, as in Tests 5 and 6. Another issue easy to remark is the lower recognition performance of the higher-order nearest neighbor classification (i.e. k=5, 7), due to the small number of samples in the training set compared to the number of classes.

For the features fusion scheme, it can be seen that the bimodal approach likewise obtains better results than the classical PCA approach on a single spectrum with linear SVM classification. Owing to the higher complexity of the SVM, the results for the visible-spectrum and thermal-spectrum facial images are superior to those obtained with k-NN in almost all the tests. Also, the improvements obtained with a bimodal system in the features fusion scheme are usually larger than those of the score fusion scheme. In Test 4, the rates for the visible spectrum and for our features fusion scheme reach their maximum for the weight β=0.8. All the results for the features fusion scheme are given in Table 3. As can be observed, the SVM classifier exploits the features from the thermal spectrum more effectively than the k-NN classifier does.

Table 3: Recognition rates for the features fusion scheme on OTCBVS (V - visible spectrum; IR - thermal IR spectrum; Bi - bimodal features fusion scheme).

           V      IR     Bi
Test 1(%)  94.04  76.19  98.80
Test 2(%)  96.42  92.85  98.21
Test 3(%)  83.92  66.07  92.85
Test 4(%)  100    99.10  100
Test 5(%)  69.94  55.80  74.55
Test 6(%)  75.29  59.82  81.84

For our experiments, we proposed a set with a small number of training images in order to simulate a real practical application with difficult scenarios such as few samples per person and high appearance variation in the test images. As can be seen in Table 2 and Table 3, for a difficult test which considers all the expressions and a large set of positions of the subject (even with half of the face hidden), the recognition rate of the classical PCA-based approach is as poor as 69.10%, the PCA-based approach with an SVM classifier is close to it at 69.94%, and our fusion-based approaches improve the recognition rate by almost 6% in both cases.

Figure 7: Performance of the fusion-based approaches.
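The features fusion scheme evaluated above can be sketched in a few lines: the per-modality PCA feature vectors are concatenated with the weight β (the paper selects β=0.8) and a linear SVM is trained on the fused vectors. This is a sketch under stated assumptions, not the authors' implementation: the SVM below is a tiny Pegasos-style subgradient trainer standing in for whatever solver the authors used, and the names and hyperparameters (`train_linear_svm`, `lam`, `epochs`) are illustrative.

```python
import numpy as np

def fused_features(f_vis, f_ir, beta=0.8):
    """Concatenate per-modality PCA feature vectors, weighted by beta."""
    return np.concatenate([beta * f_vis, (1.0 - beta) * f_ir])

def train_linear_svm(X, y, lam=0.01, epochs=200):
    """Tiny Pegasos-style subgradient trainer for a binary linear SVM
    (no bias term; labels y must be in {-1, +1})."""
    rng = np.random.default_rng(0)
    w = np.zeros(X.shape[1])
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            t += 1
            eta = 1.0 / (lam * t)                 # decreasing step size
            violated = y[i] * (X[i] @ w) < 1.0    # hinge-loss margin check
            w *= (1.0 - eta * lam)                # shrinkage from the L2 term
            if violated:
                w += eta * y[i] * X[i]
    return w

def predict(w, x):
    """Sign of the linear decision function."""
    return 1 if x @ w >= 0 else -1
```

A multi-class identification system would wrap this binary trainer in a “one against one” or “one against all” decomposition, the trade-off discussed in reference [37].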

From the same tests it can be seen that, even though the SVM classifier offers superior results in most of the tests, there are situations in which it scores slightly below the score fusion approach based on the k-NN classifier. In all the experiments, we found that the performance of our score fusion approach with α=0.82 and of our features fusion approach with β=0.8 exceeds the individual performances of the single-spectrum systems, sometimes by almost 10% (Fig. 7 gives a direct comparison of the results).

5 CONCLUSIONS

Two fusion-based approaches that substantially improve the performance of the classical PCA-based techniques are proposed in this paper. The first approach is a score fusion system in the PCA-induced feature space with k-NN classification, and the second is a direct features fusion system in the same PCA-induced space with linear SVM classification. The PCA-based techniques in the visible spectrum achieve high recognition rates for frontal images with low intra-personal variation. However, in practical face recognition applications the acquisition conditions cannot always be controlled (e.g. expressivity and pose variation), and the performance of the classical PCA-based approaches is strongly affected. In order to minimize the intra-personal variations of human faces, we combine the discriminative power of the IR and visible spectra, and provide a principled formulation of a procedure to select optimal values of the fusion weight α or β between the two modalities. To further improve the recognition rates, classification with a more complex non-linear SVM can be performed, or preprocessing can be applied so that scale, rotation and illumination variations are reduced.

6 REFERENCES

[1] R. Brunelli and T. Poggio: Face recognition: features versus templates, IEEE Trans. Patt. Anal. Mach. Intell., 15(10), pp. 1042-1052 (1993).
[2] M.A. Turk and A.P. Pentland: Eigenfaces for recognition, Journal of Cognitive Neuroscience, 3(1), pp. 71-86 (1991).
[3] M.R. Gupta and N.P. Jacobson: Wavelet Principal Component Analysis and its Application to Hyperspectral Images, IEEE Int'l Conf. on Image Processing, pp. 1585-1588 (2006).
[4] W. Hu, O. Farooq and S. Datta: Wavelet Based Sub-space Features for Face Recognition, CISP '08 Congress on Image and Signal Processing, Vol. 3, pp. 426-430 (2008).
[5] H. Wang, S. Yang and W. Liao: An Improved PCA Face Recognition Algorithm Based on the Discrete Wavelet Transform and the Support Vector Machines, Int'l Conf. on Computational Intelligence and Security Workshops, pp. 308-311 (2007).
[6] M. Mazloom and S. Ayat: Combinational Method for Face Recognition: Wavelet, PCA and ANN, Digital Image Computing: Techniques and Applications, pp. 90-95 (2008).
[7] P. Parveen and B. Thuraisingham: Face recognition using multiple classifiers, Proceedings of the 18th International Conference on Tools with Artificial Intelligence, pp. 179-186 (2006).
[8] D. He, L. Zhang and Y. Cui: Face Recognition Using (2D)^2PCA and Wavelet Packet Decomposition, Congress on Image and Signal Processing, Vol. 1, pp. 548-553 (2008).
[9] M. Zhao, P. Li and Z. Liu: Face Recognition Based on Wavelet Transform Weighted Modular PCA, CISP '08, Vol. 4, pp. 589-593 (2008).
[10] G.F. Xu, S.Q. Ding, L. Huang and C.P. Liu: Recognition based on wavelet reconstruction face, International Conference on Machine Learning and Cybernetics, pp. 3005-3020 (2008).
[11] Y. Yoshitomi, T. Miyaura, S. Tomita and S. Kimura: Face Identification Using Thermal Image Processing, 6th IEEE International Workshop on Robot and Human Communication, pp. 374-379 (1997).
[12] D. Socolinsky, A. Selinger and J. Neuheisel: Face Recognition With Visible And Thermal Infrared Imagery, Computer Vision and Image Understanding, Vol. 91, Issue 1-2, pp. 72-114 (2003).
[13] A. Selinger and D. Socolinsky: Appearance-Based Facial Recognition Using Visible And Thermal Imagery: A Comparative Study, Technical Report 02-01, Equinox Corporation (2002).
[14] F.J. Prokoski, R.B. Riedel and J.S. Coffin: Identification of individuals by means of facial thermography, Proceedings of the IEEE International Conference on Security Technology, Crime Countermeasures, pp. 120-125 (1992).
[15] S.G. Kong, J. Heo, B.R. Abidi, J. Paik and M.A. Abidi: Recent advances in visual and infrared face recognition - a review, Computer Vision and Image Understanding, Vol. 97, Issue 1, pp. 103-135 (2005).
[16] X. Chen, P.J. Flynn and K.W. Bowyer: PCA-Based Face Recognition in Infrared Imagery: Baseline and Comparative Studies, International Workshop on Analysis and Modeling of Faces and Gestures, IEEE, Nice, France (2003).
[17] D.A. Socolinsky and A. Selinger: Thermal Face Recognition In An Operational Scenario, CVPR 2004, Vol. 2, pp. II-1012 - II-1019 (2004).
[18] S.W. Jung, Y. Kim, A.B.J. Teoh and K.A. Toh: Robust Identity Verification Based on Infrared Face Images, ICCIT '07, pp. 2066-2071 (2007).
[19] A.F. Abate, M. Nappi, D. Riccio and G. Sabatino: 2D And 3D Face Recognition: A Survey, Pattern Recognition Letters, Vol. 28, Issue 14, pp. 1885-1906 (2007).
[20] Y. Yao, X. Jing and H. Wong: Face And Palmprint Feature Level Fusion For Single Sample Biometrics Recognition, Neurocomputing, Vol. 70, Issues 7-9, pp. 1582-1586 (2007).
[21] S. Ribaric and I. Fratric: A Biometric Identification System Based On Eigenpalm And Eigenfinger Features, IEEE Trans. on Patt. Anal. and Mach. Intell., Vol. 27, Issue 11, pp. 1698-1709 (2005).
[22] L. Hong and A. Jain: Integrating faces and fingerprints for personal identification, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 20, Issue 12, pp. 1295-1307 (1998).
[23] X. Jing, Y. Yao, D. Zhang and M. Li: Face and Palmprint Pixel Level Fusion And Kernel DCV-RBF Classifier For Small Sample Biometrics Recognition, Pattern Recognition, Vol. 40, Issue 11, pp. 3209-3224 (2007).
[24] K.I. Chang, K.W. Bowyer, P.J. Flynn and X. Chen: Multi-biometrics Using Facial Appearance, Shape and Temperature, Proceedings of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition, pp. 43-48 (2004).
[25] P. Buyssens, M. Revenu and O. Lepetit: Fusion of IR and visible light modalities for face recognition, IEEE 3rd International Conference on Biometrics: Theory, Applications, and Systems, pp. 1-6 (2009).
[26] M.D. Shahbe and S. Hati: Decision fusion based on voting scheme for IR and visible face recognition, Computer Graphics, Imaging and Visualisation, pp. 358-364 (2007).
[27] S. Singh, A. Gyaourova, G. Bebis and I. Pavlidis: Infrared and visible image fusion for face recognition, Proceedings of SPIE Defense and Security Symposium, Vol. 5404, pp. 585-596 (2004).
[28] R. Singh, M. Vatsa and A. Noore: Integrated Multilevel Image Fusion and Match Score Fusion of Visible and Infrared Face Images for Robust Face Recognition, Pattern Recognition, Vol. 41, Issue 3, pp. 880-893 (2008).
[29] H. Schwenk: The diabolo classifier, Neural Computation, Vol. 10, Issue 8, pp. 2175-2200 (1998).
[30] Notre Dame Infrared Face Database: http://www.nd.edu/~cvrl/undbiometricsdatabase.html
[31] Equinox Infrared Face Database: http://www.equinoxsensors.com/products/HID.html
[32] OTCBVS Thermal/Visible Face Database: http://www.cse.ohio-state.edu/OTCBVS-BENCH/bench.html
[33] F. Smarandache and J. Dezert: Advances and applications of DSmT for information fusion, American Research Press (2004).
[34] V.N. Vapnik: Statistical Learning Theory, J. Wiley, N.Y. (1998).
[35] M. Gordan, C. Kotropoulos and I. Pitas: A Support Vector Machine-Based Dynamic Network for Visual Speech Recognition Applications, EURASIP JASP, Special Issue on Joint Audio-Visual Speech Processing, Vol. 2002, No. 11, pp. 1248-1259 (2002).
[36] M. Gordan, A. Georgakis, O. Tsatos, G. Oltean and L. Miclea: Computational Complexity Reduction of the Support Vector Machine Classifiers for Image Analysis Tasks Through the Use of the Discrete Cosine Transform, Proc. of IEEE-TTTC International Conference on Automation, Quality and Testing, Robotics A&QT-R 2006 (THETA 15), Vol. 2, pp. 350-355 (2006).
[37] J. Milgram, M. Cheriet and R. Sabourin: “One against one” or “one against all”: which one is better for handwriting recognition with SVMs?, Tenth International Workshop on Frontiers in Handwriting Recognition (2006).
[38] C. Lu, J. Wang and M. Qi: Multimodal Biometric Identification Approach Based on Face and Palmprint, Second International Symposium on Electronic Commerce and Security, Vol. 2, pp. 44-47 (2009).
