
Comparison of Thermal and Visual Facial Imagery for use in Sparse Representation based Facial Recognition System

Asif Raza Butt and Asim Baig


Department of Electrical and Computer Engineering, Muhammad Ali Jinnah University, Islamabad, Pakistan
Abstract-Face recognition is probably the biometric characteristic most commonly used by humans to recognize one another, which is one reason it has been the subject of intense research for roughly the past 30 years. During this time, considerable work has gone not only into developing stable, real-time facial recognition systems but also into acquiring different modalities of facial imagery for use with them. One of the most successful recent attempts at a robust real-time facial recognition system represents the whole system as an underdetermined sparse linear system and solves it accordingly. The two most widely used modalities of facial imagery, meanwhile, are thermal and visible images. In this paper, we compare the performance of a sparse representation based facial recognition system on both thermal and visible imagery, and we discuss and explain the results in detail.

I. INTRODUCTION

Recognition based on facial features is such an integral part of human nature that the brain has a dedicated process for it [1]. The face is also one of the most sought-after modalities in real-world security and safety applications such as surveillance, access control, information security and identity detection. It has the advantage of being universally accepted and can be acquired overtly or covertly. The ultimate goal of a robust facial recognition system is to provide accurate detection in the presence of variations such as illumination changes, aging and facial expression.

One of the biggest challenges in face recognition systems is the high dimensionality of the data space. A lot of work has been done in recent years to improve the speed, robustness and accuracy of these systems by reducing the dimensionality of the data. The idea is to map the high-dimensional facial data onto fewer, more discriminant dimensions. One of the earliest examples of this approach is the use of EigenFaces [2] and PCA [3] for facial recognition. Another commonly used approach is to train the recognition algorithm on only a small subset of more discriminating images in order to define a decision boundary, as in the Support Vector Machine based approaches of [4] and [5].

In recent years, approaches based on a sparse representation of the facial recognition system have become more common, as they allow for the development of a robust, accurate and real-time facial recognition system. Wright et al. [6] were the first to propose this approach. They proposed to represent the input image in terms of an over-complete set of features whose base elements are the enrollment and training images. This allows the whole system to be represented as an underdetermined sparse linear system, which can be solved as an l1-minimization problem. The experimental results in their paper show that the proposed approach is robust to the type of features selected and provides accurate results.

The other direction researchers are pursuing to improve the performance of facial recognition systems is to apply existing recognition techniques to different input modalities of facial images, such as visual imagery, thermal imagery, sketches and even fusions of multiple modalities. The aim is to use these different modalities to counteract the effects of illumination, pose, expression and aging in the input imagery. These modalities allow the face to be recognized both holistically and on the basis of finer features.

A number of studies have shown that thermal IR imagery offers a promising alternative to visible imagery for handling variations in facial appearance due to illumination [7, 8], facial expression [9, 10] and face pose [11]. Thermal IR imagery is nearly invariant to changes in ambient illumination and makes identification possible under extremely low lighting conditions such as total darkness [12]. On the other hand, it does not provide the finer facial features that visible imagery can provide for detection.
These properties make thermal imagery an ideal candidate for approaches that take a more holistic view of facial detection. In addition, any approach that utilizes only the key, most prominent features of the facial image should also perform reasonably well with thermal imagery.

In this paper we compare the performance of the sparse representation based facial recognition system presented in [6] on both thermal and visual imagery. To evaluate it, we used an enrolment database and a probe gallery that both contain thermal and visual facial images. The matching performance is evaluated for Thermal-to-Thermal, Thermal-to-Visual, Visual-to-Thermal and Visual-to-Visual matching. This detailed evaluation provides insight into how the sparse representation based approach views the data and which input format is best suited to this type of approach.

The rest of the paper is organized as follows: Section 2 briefly outlines the working of a sparse representation based facial recognition system. Section 3 outlines the experimental setup and the database being used. Section 4 discusses the results and comments on the system's performance. Section 5 provides conclusions and outlines future research directions.

II. OVERVIEW OF SPARSE REPRESENTATION BASED APPROACH

The main idea behind the sparse representation based approaches is a generalization of the nearest subspace (NS) [13] approaches. Nearest subspace classifiers are based on the best linear representation of a new image by the training samples of each class. The major difference between the two is that the NS approach takes only the training samples of each class as the face subspace, whereas the sparse representation approach takes the complete enrollment dataset as a linear span of training images for classification. This allows sparse representation based approaches to provide robustness against illumination and pose variations. The issue with this representation is that small variations between the faces of different users can cause misclassification. This is the reason the authors in [6] work with small input images, i.e. 12x12 or 15x15.

Broadly, the sparse representation based approaches work as follows. Given a sufficient set of training images for the ith user, the sample set $A_i$ can be written as

$A_i = [\, v_{i,1}, v_{i,2}, \ldots, v_{i,n_i} \,]$    (1)

where $v_{i,1}$, $v_{i,2}$, etc. are the training images of the ith user, each stacked as a column vector, and $n_i$ is the number of training images for that user. Then any new image from the same class will lie approximately on the same linear subspace and can be represented as

$y = \alpha_{i,1} v_{i,1} + \alpha_{i,2} v_{i,2} + \cdots + \alpha_{i,n_i} v_{i,n_i}$    (2)

where $y$ is the approximation of the new input image based on the existing training images and the $\alpha_{i,j}$ are scalar coefficients. It is interesting to note that the more training images exist, the better the representation of the new image. In a real scenario the membership of the new image is unknown. To handle that, a new matrix $A$ is defined that encompasses the entire enrolment database of $k$ users and can be represented as

$A = [\, A_1, A_2, \ldots, A_k \,]$    (3)

In this case $y$ can be written as

$y = A x_0$    (4)

where

$x_0 = [\, 0, \ldots, 0, \alpha_{i,1}, \alpha_{i,2}, \ldots, \alpha_{i,n_i}, 0, \ldots, 0 \,]^T$    (5)

represents a coefficient vector with all zero entries except for the ones associated with the ith user. Equation 4 then represents an underdetermined sparse linear system that can be solved for $x_0$ using approaches such as l1-minimization or least square minimization. Although least square minimization based approaches are not as accurate as l1-minimization based approaches, they tend to be simpler to implement, and MATLAB provides a built-in function with an optimized implementation. In this paper we work with a least square minimization based approach for the sake of simplicity.

III. THE EXPERIMENTAL SETUP

We required a standard, established database of thermal and visual images to properly evaluate the performance of the sparse representation based approach. The enrolment database and probe gallery were therefore generated from Dataset 02: IRIS Thermal/Visible Face Database, a subset of the Object Tracking and Classification Beyond the Visible Spectrum (OTCBVS) database [14], freely available for download at http://www.cse.ohio-state.edu/OTCBVS-BENCH/. For this paper 30 users were selected, and the probe gallery consisted of one thermal and one visual image for every user, i.e. 60 images in total (30 thermal and 30 visual). The enrollment gallery consists of 4 thermal and 4 visual images of each user. Only forward-facing images with slight variation in pose were selected, and no restrictions were placed on expression. The resulting enrollment database consists of 240 almost forward-looking images with expression variations. The faces were cropped from the images so that accidental matching of background or clothing would not bias the results. The code for the sparse representation based approach was written in MATLAB using the built-in LSQR function. The matching results were verified visually, and the results shown are for Rank Zero (0) matching only, i.e. only the highest-scoring enrolment image is compared visually with the probe gallery image and marked as match or non-match.
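The least-squares matching step described above can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the authors' MATLAB code: `numpy.linalg.lstsq` stands in for MATLAB's LSQR, the function name `match_probe` and its arguments are hypothetical, and both the unit-length scaling and the rule of taking the largest coefficient magnitude as the "highest-scoring" enrolment image are assumptions, since the paper does not spell out its scoring rule.

```python
import numpy as np


def match_probe(A, labels, y):
    """Match one probe image against an enrolment matrix (illustrative only).

    A      : (m, n) array, one vectorised enrolment image per column
    labels : length-n sequence, labels[j] is the user of column j
    y      : (m,) vectorised probe image
    Returns (user, column_index, coefficients).
    """
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)

    # Assumption: scale every column and the probe to unit length so that
    # coefficient magnitudes are comparable (requires no all-zero image).
    A_unit = A / np.linalg.norm(A, axis=0, keepdims=True)
    y_unit = y / np.linalg.norm(y)

    # Least-squares solution of y = A x. NumPy returns the minimum-norm
    # solution when the system is underdetermined.
    x, *_ = np.linalg.lstsq(A_unit, y_unit, rcond=None)

    # Assumed scoring rule: the column with the largest coefficient
    # magnitude is treated as the highest-scoring enrolment image.
    best = int(np.argmax(np.abs(x)))
    return labels[best], best, x
```

With the setup described here, `A` would have 240 columns (8 enrolment images for each of 30 users) and `labels` would repeat each user identifier 8 times. The returned column index identifies the enrolment image that would then be checked against the probe.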

To evaluate the effect of scaling on the matching process, the code is run multiple times, each time with enrollment and probe images of a different size. The approach is evaluated for 10 different sizes: 8x8, 12x12, 15x15, 20x20, 25x25, 30x30, 35x35, 40x40, 45x45 and 50x50. The results for each of these sizes and their analysis are provided in the next section.

IV. RESULTS AND ANALYSIS

Table 1 shows the matching score comparisons for different sizes of thermal and visual images as well as the overall matching scores.

TABLE 1. MATCHING RESULTS FOR DIFFERENT SIZE IMAGES

Pixel Size   Total Correct   Thermal vs   Visual vs   Thermal vs   Visual vs
             Matches         Thermal      Visual      Visual       Thermal
8x8          19              15           4           0            0
12x12        25              19           5           0            1
15x15        29              20           7           0            2
20x20        30              21           7           0            2
25x25        32              20           11          1            0
30x30        32              20           9           0            3
35x35        31              19           10          0            2
40x40        31              19           9           0            3
45x45        31              19           7           2            2
50x50        30              17           9           0            4

Two major observations are immediately obvious when these results are analyzed. First and foremost, as commented in [6], the correct match percentage increases with an increase in the size of the input images. It is interesting to note that this increase is not linear, and in fact the matching starts to decrease once the image size increases beyond a certain limit. In our experimentation that limit was 30x30. The reason for this reduction is that once the image size goes beyond a certain threshold, more and smaller local features become visible. The sparse representation based approaches are global matching approaches by nature and therefore work better when only the larger features such as eyes, nose, mouth and face shape are being utilized for matching. Once smaller features come into play, these approaches tend to become less accurate.

The second observation is that the thermal vs. thermal matching accuracy is always higher than in any other case. This is again due to the nature of the sparse representation based matching approaches. As mentioned above, these approaches work on a global scale and work best when only larger facial features are available for matching. In thermal images these global features are almost always more prominent than in visual imagery. Therefore, thermal vs. thermal image matching provides better matching results.

An interesting observation is that although the overall number of correct matches was lower for smaller image sizes such as 8x8 and 12x12, a majority of the correct matches were due to thermal vs. thermal matching. This phenomenon can easily be explained by the two observations above. The graph in Figure 3 shows this result more clearly. In addition, although the thermal vs. thermal correct matching percentage reduces as the image size increases, it is still greater than the visual vs. visual correct match percentage.

It is safe to say, based on the results and their analysis, that sparse representation based techniques are global feature matching techniques by nature and that it is better to use thermal imagery with them. In addition, the results show that the optimum size of probe and enrolment images lies between 20x20 and 30x30 when using a least square minimization based system. It would be interesting to evaluate the l1-minimization based approaches in the same way, and we are currently working towards this evaluation. It would also be interesting to look more closely at those images that were matched correctly in thermal vs. visual and visual vs. thermal matching to evaluate the reason behind these correct matches. We believe that this will provide deeper insight into the working of sparse representation based approaches in particular and global feature matching based approaches in general.

V. CONCLUSION

A comparison is provided between visual and thermal images as input and enrolment datasets for sparse representation based approaches. The results show not only that sparse representation based approaches can be considered global feature matching approaches but also that thermal imagery provides better accuracy than visual imagery for these techniques. In addition, the results show that the accuracy of these techniques drops once the image size increases beyond a certain threshold. Further comparisons should be performed using l1-minimization based approaches. Another interesting research direction is to analyze and evaluate the images that produce correct matches in thermal vs. visual and visual vs. thermal matching.
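The trends summarized here can be checked against the counts in Table 1 with a few lines of arithmetic. The sketch below is in Python. The counts are copied from Table 1, but the helper names are hypothetical, and it is an assumption that the thermal-thermal percentage is the thermal vs. thermal count divided by the total number of correct matches at each size, since the paper does not state the denominator.

```python
# Counts copied from Table 1, one entry per image size.
SIZES = ["8x8", "12x12", "15x15", "20x20", "25x25",
         "30x30", "35x35", "40x40", "45x45", "50x50"]
TOTAL_CORRECT = [19, 25, 29, 30, 32, 32, 31, 31, 31, 30]
THERMAL_THERMAL = [15, 19, 20, 21, 20, 20, 19, 19, 19, 17]
VISUAL_VISUAL = [4, 5, 7, 7, 11, 9, 10, 9, 7, 9]


def share_of_correct(counts, totals):
    """Fraction of all correct matches contributed by one match type.

    Assumed definition of the percentage; not stated in the paper.
    """
    return [c / t for c, t in zip(counts, totals)]


def best_sizes(sizes, totals):
    """Image sizes that reach the highest total number of correct matches."""
    top = max(totals)
    return [s for s, t in zip(sizes, totals) if t == top]


thermal_share = share_of_correct(THERMAL_THERMAL, TOTAL_CORRECT)
visual_share = share_of_correct(VISUAL_VISUAL, TOTAL_CORRECT)
peak_sizes = best_sizes(SIZES, TOTAL_CORRECT)
```

Under this definition, `thermal_share` is larger at 8x8 than at 50x50 and exceeds `visual_share` at every size, while `peak_sizes` gives the sizes at which the total number of correct matches is highest.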

Figure 3. Graph showing comparison between Pixel Size and Thermal Match %age (x-axis: pixel size, 8x8 to 50x50; y-axis: percentage of Thermal-Thermal matches, 0% to 90%)

REFERENCES
[1] A. K. Jain and S. Z. Li, Handbook of Face Recognition, Springer-Verlag New York, Inc., 2005, ISBN: 038740595X.
[2] M. Turk and A. Pentland, Eigenfaces for recognition, Journal of Cognitive Neuroscience, 3(1):71-86, 1991.
[3] A. d'Aspremont, L. E. Ghaoui, M. Jordan, and G. Lanckriet, A Direct Formulation of Sparse PCA Using Semidefinite Programming, SIAM Rev., vol. 49, pp. 434-448, 2007.
[4] V. Vapnik, The Nature of Statistical Learning Theory, Springer, 2000.
[5] R. Singh, M. Vatsa and A. Noore, Integrated multilevel image fusion and match score fusion of visible and infrared face images for robust face recognition, Pattern Recognition, vol. 41, pp. 880-893, 2008.
[6] J. Wright, A. Yang, A. Ganesh, S. Sastry and Y. Ma, Robust face recognition via sparse representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210-227, 2009.
[7] G. Bebis, A. Gyaourova, S. Singh and I. Pavlidis, Face recognition by fusing thermal infrared and visible imagery, Image and Vision Computing 24 (2006) 727-742.
[8] D. Socolinsky, A. Selinger, J. Neuheisel, Face recognition with visible and thermal infrared imagery, Computer Vision and Image Understanding (2003) 72-114.
[9] G. Friedrich, Y. Yeshurun, Seeing people in the dark: face recognition in infrared images, in: Second BMCV, 2003.
[10] A. Jain, R. Bolle, S. Pankanti, Biometrics: Personal Identification in Networked Society, Kluwer Academic Publishers, Dordrecht, 1999.
[11] I. Pavlidis, P. Symosek, The imaging issue in an automatic face/disguise detection system, in: IEEE Workshop on Computer Vision Beyond the Visible Spectrum, 2000, pp. 15-24.
[12] J. Park, T. Oh, S. Ahn, S. Lee, Glasses removal from facial image using recursive error compensation, IEEE Transactions on Pattern Analysis and Machine Intelligence 27 (5) (2005) 805-811.
[13] P. Belhumeur, J. Hespanha, and D. Kriegman, Eigenfaces versus Fisherfaces: Recognition Using Class Specific Linear Projection, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711-720, July 1997.
[14] IEEE OTCBVS WS Series Bench; DOE University Research Program in Robotics under grant DOE-DE-FG02-86NE37968; DOD/TACOM/NAC/ARC Program under grant R01-1344-18; FAA/NSSA grant R01-1344-48/49; Office of Naval Research under grant #N000143010022.
