Journal:
Manuscript ID: IPR-2013-0876
Manuscript Type: Research Paper
Date: 17-Feb-2014
Authors: Borgi, Mohamed Anouar (University of Sfax, Computer Science); El Arbi, Maher; Labate, Demetrio (University of Houston, Department of Mathematics); Ben Amar, Chokri
Keywords: COMPUTER VISION, IMAGE PROCESSING, PATTERN RECOGNITION
Page 1 of 14
Introduction
Biometrics is an attractive area of pattern recognition and computer vision. The current
trend emphasizes biometrics that can collect static characteristics on the move, such as the face, so
that there is significant interest in more sophisticated and robust methods that go beyond current
state-of-the-art face recognition (FR) methods. One of the most successful approaches to template-based face
representation and recognition is based on Principal Component Analysis (PCA). However,
PCA approximates texture only, while the geometrical information of the face is not properly
captured. In addition to PCA, many other linear projection methods have been considered in
face recognition applications. LDA (Linear Discriminant Analysis) has been proposed [1]
as an alternative to PCA. This method provides discrimination among the classes, while
PCA deals with the input data in their entirety without paying much attention to the
underlying structure. Moreover, to deal with the challenges of practical FR systems, the active
shape model and active appearance model [2] were developed for face alignment; LBP [3]
IET Review Copy Only
and its variants were used to deal with illumination changes; and eigenimages [4][5] and
probabilistic local approaches [6] were proposed for FR with occlusion.
On the other hand, a new generation of multiscale methods has emerged in recent years which
go far beyond traditional wavelets and have been shown to have the potential to provide better
performing algorithms in a variety of biometric-based approaches. The shearlet system is one
notable example of these new classes of multiscale systems, which has the ability to capture
anisotropic information very efficiently, outperforming traditional wavelets. One particularly
appealing feature of shearlets is that they combine a multiscale framework which is
particularly effective to capture the geometry of a face, together with a simple mathematical
construction which can be associated to a multiresolution analysis and enables fast numerical
processing.
We recall that some work using shearlets for FR has recently appeared [7][40]. In this
work, we present a new method for FR, called shearlet network (SN), which is a refinement of
the classical wavelet network (WN). In this approach, faces are approximated by a weighted
sum of shearlets, and the weights are used in the on-line recognition stage to calculate the
similarity score between a Gallery face and a Probe face. We use a PCA-based approach in a
fusion step with SN to provide more depth to the facial texture appearance of the face; this
fusion is achieved via a model of belief functions which will be explained below.
The rest of this paper is organized as follows. Sec. 2 describes related work on sparsity-based
face recognition and information-fusion-based methods. In Sec. 3, we briefly
describe some background on shearlets. Sec. 4 presents the proposed face recognition method.
In Sec. 5, the experimental results of the proposed algorithm are demonstrated and compared
with other algorithms. Finally, Sec. 6 concludes this paper.
Related Work
Recently, FR via sparse representation based classification (SRC) [8] has received much
attention: sparse representation (or sparse coding) is a powerful tool for statistical modelling
and has been successfully applied to face processing applications. SRC casts the recognition
problem as one of classifying among multiple linear regressions and uses sparse
representations computed via l1 minimization for efficient feature extraction. The testing face
image is represented as a sparse weighted combination of the training samples, and the
classification is performed by determining which class yields the least representation error.
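To make the SRC pipeline concrete, the sketch below implements the two steps just described on a toy gallery: an l1-regularized coding step (here via plain iterative soft thresholding, a generic solver chosen for self-containedness, not necessarily the one used in [8]) followed by classification by the smallest class-wise residual. All dimensions, data and the regularization value are illustrative.

```python
import numpy as np

def ista_l1(A, y, lam=0.05, n_iter=500):
    """Solve min_x 0.5*||Ax - y||^2 + lam*||x||_1 by iterative soft thresholding."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = A.T @ (A @ x - y)              # gradient of the quadratic term
        x = x - g / L
        x = np.sign(x) * np.maximum(np.abs(x) - lam / L, 0.0)  # soft threshold
    return x

def src_classify(A, labels, y):
    """Assign y to the class whose training atoms yield the smallest residual."""
    x = ista_l1(A, y)
    residuals = {}
    for c in set(labels):
        mask = np.array([l == c for l in labels])
        xc = np.where(mask, x, 0.0)        # keep only class-c coefficients
        residuals[c] = np.linalg.norm(y - A @ xc)
    return min(residuals, key=residuals.get)

# Toy gallery: 2 classes, 3 "face vectors" each, 8-dimensional
rng = np.random.default_rng(0)
A = rng.normal(size=(8, 6))
A /= np.linalg.norm(A, axis=0)             # unit-norm columns, as in SRC
labels = [0, 0, 0, 1, 1, 1]
probe = A[:, 4] + 0.01 * rng.normal(size=8)  # noisy copy of a class-1 sample
print(src_classify(A, labels, probe))
```

Because the probe is a lightly perturbed copy of a class-1 atom, the sparse code concentrates on that atom and the class-1 residual is the smallest.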
Besides SRC, another powerful method recently proposed is the Regularized Robust Coding
(RRC) approach [9][39], which robustly regresses a given signal with regularized
regression coefficients. By assuming that the coding residual and the coding coefficient are
respectively independent and identically distributed, RRC seeks a maximum a
posteriori solution of the coding problem. An iteratively reweighted regularized robust coding
algorithm was proposed to solve the RRC model efficiently.
Although SRC performs well when the set of training images is sufficiently large, it is still
inadequate for many real world applications where only a single sample per subject is
available.
Since this work includes a belief-fusion step, we next present a brief review of
information-fusion-based methods. Information fusion has been employed
as an efficient tool for combining data acquired from different sources. In particular, face
recognition is an area where information fusion has been applied broadly. Fusion techniques are
frequently used when both shape and texture modalities are available. The standard approach
is to design separate classifiers for each individual modality and to combine them at the score,
rank, or decision level. A typical example of this approach is given in the work of K. I. Chang
et al. [10], where PCA-based matchers were used for the shape (depth image) and texture modalities
and their match scores were fused by a weighted sum rule.
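A minimal sketch of this standard score-level recipe follows; the weights and score values are illustrative placeholders, not the values used in [10], and both scores are assumed to have been normalized to a common range beforehand.

```python
def weighted_sum_fusion(shape_score, texture_score, w_shape=0.4, w_texture=0.6):
    """Fuse two match scores (assumed already normalized to [0, 1]) by a
    weighted sum. The weights here are illustrative, not those of [10]."""
    return w_shape * shape_score + w_texture * texture_score

# A probe whose shape matcher says 0.8 and texture matcher says 0.5
print(weighted_sum_fusion(0.8, 0.5))
```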
G. Pan et al. [11] present a face recognition system that combines profile and surface
matchers. The three profile experts use one vertical and two horizontal profile measurements.
The surface expert makes use of a weighted ICP-based surface matcher. The similarity scores
from these four matchers are combined by the sum rule.
Gökberk et al. [12] have briefly discussed shape-only features such as point-cloud-, surface-normal-, depth-image-, and profile-based shape information. In this first round of fusion
experiments, they have observed that some fusion schemes outperform the best individual
classifier. They experimented with various combination methods such as fixed rules at score
level (sum/product), rank-based combination rule (Borda count), and abstract-level voting
method (plurality voting). In addition, they have proposed a two-level serial-fusion scheme
where the first level functions as a prescreener, whereas the second level uses linear
discriminant analysis (LDA) to better separate the gallery images.
Another type of shape-based expert fusion is proposed in [13]. This approach is essentially a
multiregion approach where different facial-region pairs from the gallery and probe images
are matched, and the matching scores are combined with the product rule. The local experts
compute the surface similarities of three overlapping regions around the nose by using the
ICP algorithm, and the registration errors from these three surface matchers are then combined.
A feature-level fusion scheme is presented in [14] where global shape features are
concatenated with the local features. The dimensionality of the concatenated vector is reduced
by the PCA method.
Another approach is proposed by C. Ben
Abdelkader et al [16] who use a local-feature-analysis (LFA) technique instead of the
classical PCA to extract features from both shape and texture modalities. This classifier
combines texture and shape information with the sum rule. Another interesting variant is the
data-level fusion. The depth-image pixels are concatenated to the texture-image pixels to form
a single vector. LDA is then applied to the concatenated feature vectors to extract features.
Mian et al. [42] propose the use of local textural and shape features together in order to cope
with the variations caused by expressions, illumination, pose, occlusions, and makeup. The
textural features are based on scale-invariant feature transform. Tensors constructed in locally
defined coordinate bases are used as 3-D descriptors. The two schemes are fused at score
level with confidence-weighted sum rule.
Background on Shearlets
The shearlet representation has emerged in recent years as one of the most effective
frameworks for the analysis and processing of multidimensional data [21]. The shearlet
approach is derived from the theory of wavelets with composite dilations, a method
providing a general framework for the construction of waveforms defined not only at various
scales and locations, as traditional wavelets, but also at various orientations and with different
scaling factors in each coordinate. As shown in several publications [22][23], shearlets are
particularly effective in a number of image processing applications, such as denoising and
feature extraction, where it is important to capture the geometric information efficiently. As a
generalization of the traditional wavelet approach, the continuous shearlet transform [24] is
defined as the mapping:
$$\mathcal{SH}_\psi f(a, s, t) = \langle f, \psi_{a,s,t} \rangle, \qquad a > 0,\ s \in \mathbb{R},\ t \in \mathbb{R}^2 \qquad (1)$$
The shearlet transform is a function of three variables: the scale a, the shear s and the
translation t. By choosing the generator function appropriately, one can construct a collection
of functions $\psi_{a,s,t}(x)$, called shearlets, which are well-localized waveforms at various scales,
orientations and locations. One of the main properties of the Continuous Shearlet Transform
is its ability to describe very precisely the geometry of the singularities of a 2-dimensional
function f [41].
By sampling the scale, shear and translation parameters on a suitable lattice, one can
obtain a discrete transform. Specifically, $M_{as}$ is discretized as $M_{jl} = B^l A^j$, where
$B = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $A = \begin{pmatrix} 4 & 0 \\ 0 & 2 \end{pmatrix}$
are the shear matrix and the anisotropic dilation matrix, respectively. Hence, the
discrete shearlets are the functions of the form:

$$\psi_{j,l,k}(x) = 2^{3j/2}\, \psi(B^l A^j x - k) \qquad (2)$$
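As a quick numerical check of the indexing matrices in Eq. (2), the sketch below builds $M_{jl} = B^l A^j$ with NumPy; increasing $j$ dilates anisotropically (more strongly along the first axis) while $l$ controls the shear. This is an illustration of the matrix algebra only, not part of the authors' implementation.

```python
import numpy as np

# Shear matrix B and anisotropic dilation matrix A from Eq. (2)
B = np.array([[1, 1],
              [0, 1]])
A = np.array([[4, 0],
              [0, 2]])

def M(j, l):
    """Discretized matrix M_jl = B^l A^j used to index the discrete shearlets."""
    return np.linalg.matrix_power(B, l) @ np.linalg.matrix_power(A, j)

print(M(1, 1))  # one anisotropic dilation followed by one shear
```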
Figure 1 illustrates the two-level shearlet decomposition of a face image from the Yale database,
where the first-level and the second-level decompositions generate 8 subbands
corresponding to the different directional bands.
Fig. 1. The top image is the original face image. The image below it contains the
approximation shearlet coefficients. Images of the detail shearlet coefficients are shown
below.
The Proposed Approach
The proposed approach is organized as follows. The Gallery faces are approximated by a
shearlet network (SN) to produce a compact biometric signature. It is this signature,
constituted by shearlets and their weights, which will be used to match a Probe with all faces
in the Gallery. In the recognition stage, the Test (Probe) face is projected on the shearlet
network (SN) of the Gallery face, and new weights specific to this face are produced. The
family of shearlets then remains unchanged (that of the Gallery face). Finally, a similarity
score is produced by comparing the weights of the Gallery face with the weights of the test
face. In another branch of the pipeline, PCA is used for FR [17] in order to generate the
eigenfaces from the Gallery dataset, which are then used for the projection step of the Probe
dataset. These two matchers, SN and PCA, generate two confusion matrices which are
used to fill two belief mass matrices with belief functions. Finally, we combine these
matrices using a conjunctive fusion rule. The pipeline of all these stages is illustrated in
Figure 2.
An SN is a combination of the RBF neural network and the shearlet decomposition. The SN
algorithm approximates a 2D face image f using a linear combination of shearlet functions in
the network that are multiplied by corresponding weights according to:
$$\tilde{f} = \sum_{j,l,k} w_{j,l,k}\, \psi_{j,l,k} \qquad (3)$$
where $f$ is the face image, $\tilde{f}$ is the face image approximation, $w_{j,l,k}$ are the weights and
$\psi_{j,l,k}$ are the shearlet functions. An important part of the design of this method is the
computation of the weights:

$$w_{j,l,k} = \langle f, \tilde{\psi}_{j,l,k} \rangle \qquad (4)$$

where $\tilde{\psi}_{j,l,k}$ denotes the dual shearlet family discussed below.
The mother shearlet used in our work to construct the family $\{\psi_{j,l,k}\}$ is the second derivative of
the beta function. Details regarding the beta function can be found in the work of C. Ben Amar et
al. [25].
The algorithm used for FR by SN is based on frame theory for the weight calculation
[26]. By sampling the continuous shearlet transform $\mathcal{SH}_\psi f(a, s, t)$ on an appropriate
discretization lattice for the scaling, shear and translation parameters $(a, s, t)$, one obtains a
discrete transform which is associated with a Parseval (tight) frame for $L^2(\mathbb{R}^2)$. Indeed, we
obtain a discrete system of shearlets $\{\psi_{j,l,k}\}$, for $j, l \in \mathbb{Z}$, $k \in \mathbb{Z}^2$, such that

$$f = \sum_{j,l,k} \langle f, \psi_{j,l,k} \rangle\, \psi_{j,l,k} \qquad (5)$$
In the optimization stage, a shearlet coefficient from the library is processed through the
hidden layer of the network and used to update the weights. In order to know if a shearlet (n)
will be an activation function of a new "neuron", we must verify if it is a linear combination
of the other (n-1) shearlets of the network. The calculation of the weights connection in every
stage is obtained by projecting the image to be analysed on a family of shearlets. However,
since the shearlets do not form an orthonormal basis, for a given family of shearlets it is not
possible to calculate the weights by direct projection of the function $f$ on the shearlets
themselves. At every stage of the optimization process we need to know the dual family of the
shearlets forming our shearlet network. The dual shearlet family is calculated by the formula:

$$\tilde{\psi}^i_{j,l,k} = \frac{\psi_{j,l,k}}{\left\| \psi_{j,l,k} \right\|^2} \qquad (6)$$
3. Calculate the weights by direct projection of the image on the dual shearlets: $w^i_{j,l,k} = \langle f, \tilde{\psi}^i_{j,l,k} \rangle$.
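The dual-projection step above can be sketched for a finite dictionary: for finitely many atoms, a dual family can be computed from the inverse Gram matrix, after which the weights follow by direct inner products and the approximation is the weighted sum of the atoms. This is a generic frame computation offered as an illustration, not the authors' exact training routine; the atom matrix and test vector are toy data.

```python
import numpy as np

def dual_family(Psi):
    """Columns of Psi are the (finitely many) atoms; the dual family is
    Psi @ G^{-1} with G the Gram matrix, so that <psi_i, dual_j> = delta_ij."""
    G = Psi.T @ Psi
    return Psi @ np.linalg.inv(G)

def sn_weights(Psi, f):
    """Weights by direct projection of the image f on the dual atoms."""
    return dual_family(Psi).T @ f

def sn_reconstruct(Psi, f):
    """Best approximation of f in the span of the atoms: f~ = sum_k w_k psi_k."""
    return Psi @ sn_weights(Psi, f)

rng = np.random.default_rng(1)
Psi = rng.normal(size=(16, 4))             # 4 toy atoms in R^16
f = Psi @ np.array([2.0, -1.0, 0.5, 3.0])  # f lies in the span of the atoms
w = sn_weights(Psi, f)
print(np.round(w, 6))
```

Because this toy $f$ lies exactly in the span of the atoms, the recovered weights equal the generating coefficients and the reconstruction is exact.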
4.3

In the first step, the matching distances produced by the two matchers are normalized according to:
$$d' = \frac{d - d_{MIN}}{d_{MAX} - d_{MIN}} \qquad (7)$$
where $d_{MAX}$, $d_{MIN}$, $d$ and $d'$ denote, respectively, the maximum, the minimum, the original and
the normalized distances. In the second step, we use a Bayesian assignment [29] to compute
the belief mass function values from the confusion matrix. Finally, a conjunctive
combination rule [30], defined by the following equation, is used to fuse the confusion matrices.
$$m_{SN \oplus PCA}(A) = (m_{SN} \otimes m_{PCA})(A) = \sum_{B \cap C = A} m_{SN}(B)\, m_{PCA}(C), \quad \forall A \qquad (8)$$

with $m_{SN}$ and $m_{PCA}$ the two belief mass functions associated with SN and PCA.
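The normalization of Eq. (7) and the conjunctive rule of Eq. (8) can be sketched directly. Masses are represented as dictionaries over frozenset-valued hypotheses, and the unnormalized conjunctive rule leaves any conflict as mass on the empty set. The frame of discernment and the mass values below are toy examples, not values from our experiments.

```python
import numpy as np
from itertools import product

def minmax_normalize(d, d_min, d_max):
    """Eq. (7): map a raw distance d into [0, 1]."""
    return (d - d_min) / (d_max - d_min)

def conjunctive_combine(m1, m2):
    """Eq. (8): unnormalized conjunctive rule. Masses are dicts mapping
    frozenset hypotheses to belief mass; mass on the empty set is conflict."""
    out = {}
    for (B, mB), (C, mC) in product(m1.items(), m2.items()):
        A = B & C
        out[A] = out.get(A, 0.0) + mB * mC
    return out

# Toy frame of discernment {1, 2}: two matchers (standing in for SN and PCA)
# assign mass to singleton identities and to the full frame (ignorance).
m_sn  = {frozenset({1}): 0.6, frozenset({1, 2}): 0.4}
m_pca = {frozenset({1}): 0.5, frozenset({2}): 0.2, frozenset({1, 2}): 0.3}
fused = conjunctive_combine(m_sn, m_pca)
print(fused[frozenset({1})])
```

The fused mass on identity 1 collects every pair of focal elements whose intersection is {1}, while the disagreement between the two matchers ends up on the empty set.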
Experimental Results
We run our experiments using standard benchmark face databases to evaluate the
performance of the proposed approach. We used the Extended Cohn-Kanade (CK+) [31], FEI
[32], Extended Yale B [33] and FRGC v1 [34] face databases. All the images are cropped,
except those of the Extended Cohn-Kanade (CK+) database, and resized to 27×32.
In this paper, we chose to select the face images randomly for both the Gallery and Probe datasets.
A description of the databases used is given in Table 1, and examples of subjects (Gallery and Probe)
are shown in Figure 3.
Table 1. Description of the face databases used.

Database   Database size   Subject number
CK+        593             123
FEI        2800            200
Yale B     16128           38
FRGC v1    4003            152
We considered the problems of face coding, recognition and authentication. Results are provided
in the next subsections.
5.1 Face Coding
We use standard quality measurement tools, namely the Peak Signal to Noise Ratio
(PSNR) and the Normalized Cross Correlation (NCC). We compare SN with WN and with the
standard shearlet representation. We test with 3 face images of the same person, taken
randomly from the Yale database. The SN (Table 2) is found to outperform the two other
methods that we considered.
Table 2. PSNR (dB) and NCC for three randomly chosen face images of the same person from the Yale database.

Method      Test1 PSNR   NCC      Test2 PSNR   NCC      Test3 PSNR   NCC
Shearlet    27.1227      0.9913   26.1698      0.9853   26.1582      0.9861
WN          36.9600      0.9967   33.4986      0.9942   32.5192      0.9931
SN          37.7788      0.9973   36.2613      0.9970   35.8314      0.9962
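For reference, a minimal sketch of the two quality measures used in Table 2. Note that NCC has several conventions (with or without mean subtraction); the plain inner-product form is used here as an assumption, and the images below are synthetic toy data.

```python
import numpy as np

def psnr(ref, rec, peak=255.0):
    """Peak Signal to Noise Ratio in dB between a reference and a reconstruction."""
    mse = np.mean((ref.astype(float) - rec.astype(float)) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

def ncc(ref, rec):
    """Normalized Cross Correlation between two images (flattened).
    Plain inner-product convention, without mean subtraction (an assumption)."""
    a, b = ref.ravel().astype(float), rec.ravel().astype(float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(2)
ref = rng.integers(0, 256, size=(27, 32)).astype(float)  # toy 27x32 "face"
rec = ref + rng.normal(scale=2.0, size=ref.shape)        # mild reconstruction error
print(round(psnr(ref, rec), 1), round(ncc(ref, rec), 4))
```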
5.2 Recognition and Authentication
The following experiments illustrate our results regarding the authentication and recognition
tasks. In these experiments, similarity matrices and belief mass function matrices are
produced, rank-one recognition rates (Rank-1) are computed and Receiver Operating
Characteristic (ROC) curves are plotted. Table 3 summarizes the rank-one
recognition rates (Rank-1). SNPCA, the proposed approach, presents a Rank-1 rate which
outperforms the SN, BHDT [38] and PCA approaches on all the face databases.
Table 3. Rank-one recognition rates (%).

Method       Extended CK+   FEI     Yale B   FRGC v1
PCA          80.49          59.00   51.50    12.50
BHDT [41]    91.87          62.50   84.81    26.97
SN           95.94          79.50   97.37    34.87
SNPCA        96.75          85.50   97.37    43.50
ROC curves for these experiments are illustrated in Figure 6, one panel for each of the 4 face
databases considered. Each panel shows an ROC curve, where the x-axis represents the False
Accept Rate. The figures show that SNPCA consistently provides the best performance.
Fig. 6. ROC curves (False Accept Rate (%) vs. Probability of Rejection) comparing PCA, WN, SN and SNPCA on (a) the Extended Cohn-Kanade (CK+) database, (b) the FEI database, (c) the Extended Yale B database, (d) the FRGC v1 database.
Conclusion
This paper presents a novel sparse coding (SC) model for robust face recognition (FR) called
Shearlet Network (SN). This method is combined, via a fusion step, with a PCA-based approach
to provide more depth to the facial texture features, using a refined model of belief functions based
on the Dempster-Shafer rule. One important advantage of SNPCA is its robustness to various
types of challenging FR scenarios (pose, expression and illumination). This paper focused
on the challenging problem of a single training sample per subject. In order to illustrate and
validate our approach we used the Extended Cohn-Kanade (CK+), FEI, Extended YALE B
and FRGC v1 face databases. Our results show that our approach is very competitive in terms
of recognition rate with respect to other standard and state-of-the-art methods.
References
[1] Lu, J., Plataniotis, K.N., Venetsanopoulos, A.N.: Face recognition using LDA-based
algorithms, IEEE Trans. Neural Networks, 2003, 14, (1), pp. 195-200
[2] Lanitis, A., Taylor, C.J., Cootes, T.F.: Automatic Interpretation and Coding of Face
Images Using Flexible Models, IEEE Trans. Pattern Analysis and Machine Intelligence,
1997, 19, (7), pp. 743-756
[3] Ahonen, T., Hadid, A., Pietikainen, M.: Face description with local binary patterns:
Application to face recognition, IEEE Trans. Pattern Analysis and Machine Intelligence,
2006, 28, (12), pp. 2037-2041
[4] A. Leonardis and H. Bischof, Robust recognition using eigenimages, Computer Vision
and Image Understanding, vol. 78, no. 1, pp. 99-118, 2000.
[5] S. Chen, T. Shan, and B.C. Lovell, Robust face recognition in rotated eigenspaces, Proc.
Intl Conf. Image and Vision Computing New Zealand, 2007.
[6] A.M. Martinez, Recognizing Imprecisely localized, partially occluded, and expression
variant faces from a single sample per class, IEEE Trans. Pattern Analysis and Machine
Intelligence, vol. 24, no. 6, pp. 748-763, 2002.
[7] W. R. Schwartz and R. D. da Silva and L. S. Davis and H. Pedrini. "A Novel Feature
Descriptor Based on the Shearlet Transform". IEEE International Conference on Image
Processing, pp. 1033-1036, 2011.
[8] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, Robust face recognition via
sparse representation, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 31, no. 2,
pp. 210-227, 2009.
[9] M. Yang, L. Zhang, J. Yang and D. Zhang, Robust sparse coding for face recognition,
Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2011.
[10] K. I. Chang, K. W. Bowyer, and P. J. Flynn, An evaluation of multimodal 2D + 3D face
biometrics, IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 4, pp. 619-624, Apr. 2005.
[11] G. Pan and Z. Wu, 3D face recognition from range data, Int. J. Image Graph., vol. 5,
no. 3, pp. 573-593, 2005.
[12] B. Gökberk, A. A. Salah, and L. Akarun, Rank-based decision fusion for 3D shape-based
face recognition, in Proc. Audio- and Video-Based Biometric Person Authentication, T.
Kanade, A. Jain, and N. K. Ratha, Eds., vol. 3456, pp. 1019-1029, 2005.
[13] K. Chang, K. Bowyer, and P. Flynn, Adaptive rigid multi-region selection for handling
expression variation in 3D face recognition, in Proc. IEEE Workshop Face Recog. Grand
Challenge Experiments, p. 157, 2005.
[14] C. Xu, Y. Wang, T. Tan, and L. Quan, Automatic 3D face recognition combining global
geometric features with local shape variation information, in Proc. Int. Conf. Autom. Face
Gesture Recog., pp. 308-313, 2004.
[15] K. I. Chang, K. W. Bowyer, and P. J. Flynn, An evaluation of multimodal 2D + 3D face
biometrics, IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 4, pp. 619-624, Apr. 2005.
[16] C. Ben Abdelkader and P. A. Griffin, Comparing and combining depth and texture cues
for face recognition, Image Vis. Comput., vol. 23, no. 3, pp. 339-352, Mar. 2005.
[17] A. Mian, M. Bennamoun, and R. Owens, 2D and 3D Multimodal Hybrid Face
Recognition, Proc. ECCV, pp. 344-355, 2006.
[18] Y. Wang and C.-S. Chua, Robust face recognition from 2D and 3D images using
structural Hausdorff distance, Image Vis. Comput., vol. 24, no. 2, pp. 176-185, Feb. 2006.
[19] M. Hüsken, M. Brauckmann, S. Gehlen, and C. von der Malsburg, Strategies and
benefits of fusion of 2D and 3D face recognition, in Proc. IEEE Workshop Face Recog.
Grand Challenge Experiments, p. 174, 2005.
[20] Yager, R.R., Liu, L.: Classic works of the Dempster-Shafer theory of belief functions. In:
Studies in fuzziness and soft computing, vol. 219. Springer, Heidelberg, 2008.
[21] P. S. Negi and D. Labate, "3D discrete shearlet transform and video processing", IEEE
Trans. Image Process. 21(6) pp. 2944-2954, 2012.
[22] G. R. Easley, and D. Labate, "Critically sampled wavelets with composite dilations",
IEEE Trans. Image Process., 21 (2) pp. 550-561, 2012.
[23] S. Yi, D. Labate, G. R. Easley, and H. Krim, "A Shearlet approach to Edge Analysis and
Detection", IEEE Trans. Image Process., 18 (5) pp. 929-941, 2009.
[24] G. Kutyniok and D. Labate, "Resolution of the wavefront set using continuous shearlets",
Trans. Amer. Math. Soc. 361 pp. 2719-2754, 2009.
[25] C. Ben Amar, M. Zaied and M.A. Alimi, "Beta Wavelets. Synthesis and application to
lossy image compression", Journal of Advances in Engineering Software, Elsevier edition, 36
(7) pp. 459-474, July 2005.
[26] M. Zaied, O. Jemai and C. Ben Amar, "Training of the Beta wavelet networks by the
frames theory: Application to face recognition", the international Workshops on Image
Processing Theory, Tools and Applications, Tunisia, November 2008.
[27] M. Kirby and L. Sirovich, Application of the KL Procedure for the Characterization of
Human Faces, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, no. 1, pp.
103-108, Jan. 1990.
[28] P.S. Penev and L. Sirovich, The Global Dimensionality of Face Space, Proc. Fourth
IEEE Int'l Conf. Automatic Face and Gesture Recognition, pp. 264-270, 2000.