Sie sind auf Seite 1von 38

A PROJECT REPORT

On

AUTOMATIC FACIAL EXPRESSION RECOGNITION


SYSTEM USING CNN
Submitted in partial fulfilment of the

Requirements for the award of the Degree of

Bachelor of Technology

In

Computer Science and Engineering

By

Batch no : 16NCGVS06-04

E.Sai Tejaswini 160030356

Under the Supervision of


Dr. B Sekhar Babu
Assoc Prof. DEPT of CSE

1|P age
K L University
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

DECLARATION

We hereby declare that this term paper report entitled “Automatic Facial Expression
Recognition Using CNN” is a record of bonafide work of E. Sai tejaswini (160030356
submitted in partial fulfilment for the award of B.Tech in Computer Science and Engineering
in K L University.

We also declare that this project based lab report is of our own effort and it has not been
submitted to any other university for the award of any degree.

2|P age
K L (Deemed To Be University)

DEPARTMENT OF COMPUTER SCIENCE AND


ENGINEERING

CERTIFICATE

This is to certify that this term paper report entitled “Automatic Facial Expression
Recognition Using CNN” is a record of bonafide work of E. Sai tejaswini (160030356)
partial fulfilment of the requirement for the award of degree in BACHELOR OF
TECHNOLOGY in Computer Science and Engineering during the academic year 2018-
2019 .

Signature of the Supervisor Signature of HOD


Dr.B Sekhar Babu V.Hari Kiran
Assoc Prof. DEPT of CSE HOD of CSE

Signature of the External Examiner

3|P age
ACKNOWLEDGEMENTS

My sincere thanks to DR. B. SEKHAR BABU in the class for their outstanding support

throughout the project for the successful completion of the work

We express our gratitude to Mr.HARI KIRAN VEGE, Head of the Department for

Computer Science and Engineering for providing us with adequate facilities, ways and means

by which we are able to complete this term paper work.

We express our heart full gratitude to our President DR. KONERU SATYANARAYANA,

for providing infrastructure facilities in pursuing this project.

We would like to place on record the deep sense of gratitude to the honourable Vice

Chancellor, K L University for providing the necessary facilities to carry the concluded

project work.

Last but not the least, we thank all Teaching and Non-Teaching Staff of our department and

especially my classmates and my friends for their support in the completion of our project

work.

4|P age
ABSTRACT

Facial expression recognition has been an active research area over the past few decades, and
it is still challenging due to its high intra-class variation. Most of these works perform
reasonably well on datasets of images captured in a controlled condition, but fail to perform
as good on more challenging datasets with more image variation and partial faces. In recent
years, several works proposed a framework for facial expression recognition, using deep
learning models. Despite the better performance of these works, there is still a room great for
improvement. In this work, we propose a deep learning approach which is based on
attentional convolutional network, which is able to focus on important parts of the face,
and achieves significant improvement over various datasets, including FER-2013, CK+,
FERG, and JAFFE. We also use a visualization technique which is able to find important
face regions for detecting different emotions, based on the users output. Through the
experimental results, we show that different emotions seems to be reactive to different parts
of the face.

5|P age
CONTENTS

S.No. Chatper Name Pg No.

1. Introduction 7

1.1 CNN 8

2 Literature Survey 11

2.1 2.1 Effective Semantic Features For Facial 12


Expression Recognition using SVM

2.2 2.2 Face Recognition System Using Artificial 13


Neural Networks Approach

3 System Analysis 15

3.1 Existing System 16

3.2 Block Diagram of the existing system 16

3.3 Proposed System 16

4 Python 17

5 Block Diagram 19

6 Implementation 20

7 Experiment Results 35

8 Conclusion 37

9 References 38

6|P age
1. INTRODUCTION

Emotional expressions are very important part in human interactions. As technology has
found everywhere in our society and is taking on coaching roles in education, emotion
recognition from facial expressions has become an important part in human-computer
interaction. However, human facial expressions change so insignificant that automatic
facial expression recognition has always been a challenging task.. . In this work, we
propose a deep learning approach based on attentional convolutional network, which is
able to focus on salient parts of the face and achieves significant improvement over
previous models on multiple datasets, including FER-2013, CK+, FERG, and JAFFE. We
also use a visualization technique which is able to find salient face regions for detecting
various emotions, based on the users output. Through experimental results, we show that
different emotions seems to be reactive to different parts of the face.

7|P age
1.1 Convolutional Neural Networks

Artificial Intelligence has been facing a monumental growth in bridging the gap between the

capabilities of humans and the machines. Researchers and work on numerous aspects of the

field to make amazing things happen. One of many such areas is the domain of Computer

Vision.

The main reason for this field is to enable machines to view the world as humans do, perceive

it in a similar manner and even use the knowledge for various tasks such as Image & Video

recognition, Image Analysis & Classification, Media Recreation, Recommendation Systems,

Natural Language Processing, etc. The advancements in Computer Vision with Deep Learning

has been constructed and perfected with time, basicly over one particular

algorithm Convolutional Neural Network.

A Convolutional Neural Network (ConvNet/CNN) is a Deep Learning algorithm which can

take in an input image, and give importance (learnable weights and biases) to various aspects

in the image and be able to differentiate one from the other. The pre-processing required in a

Conv Net is much lower as compared to other classification algorithms While in basic

methods filters are hand-engineered, with enough training, Conv Nets have the ability to

learn these characteristics.

8|P age
The structure of a Conv Net is parallel to that of the connectivity pattern of Neurons in the

Human Brain and was inspired by the organization of the Visual Cortex. Individual neurons

respond to stimuli only in a particular region of the visual field known as the Receptive Field.

A collection of such fields clashes to cover the entire visual area.

CNNs, like other neural networks, are made up of neurons with known weights and biases.

Each neuron receives several inputs, takes a weighted sum over them, pass it through an

activation function and gives an output. The whole network has a loss function and all the tips

that are developed for ANN will still apply to CNNs.

Benefits of CNN:

1. The usage of CNNs is initiated by the fact that they can capture the relevant features

from an image or video at different levels is similar to a human brain. This is called

feature learning. Conventional Neural networks (CNN) cannot do this.

2. For a completely new task CNNs are very good feature identifiers. This means that

you can extract useful attributes from an already trained CNN with its trained weights

by feeding your data on each level and tune the CNN with its trained weights by

feeding your data on each level and tune the CNN a bit for a specific task. For Eg.

Add a classifier after the last layer with labels specific to the task. This is also called

pre-training and CNNs are very efficient in such tasks compared to ANNs. Another

advantage of this pre-training is we avoid training of CNN and save memory time.

The only thing to train classifier is at the end of label.

9|P age
3. CNN take advantage of local spatial coherence in the input (often images), which

allow them to have fewer weights as some parameters are shared. This process, taking

the form of convolutions, makes them especially well-suited to extract relevant

information at a low computational cost.

Limitations of CNN:

Convolutional neural networks are computationally costly , like any other neural network

model. But this is more of an inconvenience than a weakness.

For better computer hardware such as Graphical User Interphase and Neuromorphic

processors, thus can be resolved.

So, what are the drawbacks of CNN?

Missing the theory, memory, learning without supervision. A hypothesis to understand why

and how this research in architecture is missing. For example, in understanding how much

information or how many layers are needed to get a certain result, this concept may be

important.

10 | P a g e
2. Literature Survey

2.1 Effective semantic features for facial expressions recognition using


SVM

Authors

Chen-Chiung Hsieh1 & Mei-Hua Hsih2 &


Meng-Kai Jiang1 & Yun-Maw Cheng1 & En-Hui Liang3

Abstract

The Most traditional facial expression-recognition systems track facial components such as
eyes, eyebrows, and mouth for feature extraction. Though some of the features can provide
solutions for expression recognition, other finer changes of the facial expressions can also
be deployed for classifying various facial expressions. This study locates facial components
by active shape model to extract seven dynamic face regions (eyebrows, nose wrinkle, two
nasolabial folds, lips, and mouth). Proposed semantic facial features could then be acquired
using directional gradient operators like Gabor filters and Laplacian of Gaussian. A support
vector machine (SVM) was trained to classify six facial expressions (neutral, happiness,
surprise, anger, disgust, and fear). The popular Cohn–Kanade database was tested and the
average recognition rate is 94.7 %.

Facial expressions analysis and proposed semantic facial features

According to research on facial expressions, the facial action coding system (FACS) [14],
which is a standard method for describing facial expression, was used in this study. Facial
expression in the FACS comprises variations of the upper face (i.e., the forehead, eyebrows,
and eyes) and those of the lower face (i.e., the mouth and nose). These varying facial
components, such as the eyebrows stretching upward or the eyes opening wide, are called
11 | P a g e
action units, of which 44 have been found. Figure 2 shows how these action units can be
combined to describe various facial expression.

Types N H S A D F Subtotal

Training 20 20 20 20 20 20 120

Testing 63 55 51 15 16 21 221
Subtotal 83 75 71 35 36 41 341

Conclusion

Previous studies have adopted entire facial images [6, 23, 48, 50, 51] or have manually set
feature points [19, 29, 34, 38] for classifying expressions. Although these methods have
generated good results, they are neither rapid nor automatic. Most of them are in the state of
the art adopt.

12 | P a g e
2.2 FACE DETECTION USING ARTIFICIAL NEURAL
NETWORK APPROACH

Abstract

A face detection system using artificial neural network is presented. In this The system used
integral image for image representation which allows fast computation of the features used.
The system also implements the AdaBoost learning algorithm to select a small number of
critical visual features from a very large set of potential features. Besides that, it also used to
combine the classifiers algorithm which allows background parts of the image to be quickly
discarded while spending more computation on face regions. Further , a set of experiments in
the domain of face detection is done. The system yields a good face detection performance.

SYSTEM OVERVIEW

The architecture of the proposed face detection system given. There are four (4) major
modules in the proposed face detection system, which are as follows .data acquisition, image
pre processing , feature extraction, and classification.

EXPERIMENTAL RESULTS

The purpose of the experiment is to compare the performance of the proposed face detection
using the two classifiers structures, the traditional and a modified structure of neural network,
and method of comparing the threshold. The traditional structure involves too many hidden
nodes, which increase the computation time for every hidden node. The modified structure
will decrease the computation time, and increase perception accuracy.

13 | P a g e
CONCLUSION

In this research , we have proposed a face detection system using the traditional and modified
ANN approaches. We use integral image to compute the features to decrease the computation
time. The resulting classifier is computationally efficient, since only a small number of
features need to be evaluated during run time. We also used the combining of classifiers,
which will reduce the computation time, and increase the detection accuracy. The cascade
presented is in similar structure, and has the advantage of making simple trade offs between
processing time and detection performance. The experimental results show that the improved
ANN, using voting approach increases the detection accuracy to 96% where as only 78%
using the traditional ANN approach.

14 | P a g e
3. SYSTEM ANALYSIS

3.1 EXISTING SYSTEM

Effective expression analysis hugely depends upon the accurate representation of facial
features. Facial Action Coding System (FACS) [3] represents face by measuring all visually
observable facial movements in terms of Action Units (AUs) and associates them with the
facial expressions. Accurate detection of AUs depends upon proper identification and
tracking of different facial muscles irrespective of pose, face shape, illumination, and image
resolution. According to Whitehill et al. [4], the detection of all facial fiducial points is even
more challenging than expression recognition itself. Therefore, most of the existing
algorithms are based on geometric and appearance based features. The models based on
geometric features track the shape and size of the face and facial components such as eyes,
lip corners, eyebrows etc., and categorize the expressions based on relative position of these
facial components. Moreover, the distance between facial landmarks vary from person to
person, thereby making the person independent expression recognition system less reliable.
Facial expressions involve change in local texture. The appearance-based methods generates
high dimensional vector which are further represented in lower dimensional subspace by
applying dimensionality reduction techniques, such as principal component analysis
(PCA), linear discriminant analysis (LDA) etc. Finally, the classification is performed in
learned subspace. Although the time and space costs are higher in appearance based methods,
the preservation of discriminative information makes them very popular.

15 | P a g e
Draw backs

Extraction of facial features by dividing the face region into several blocks achieves better
accuracy as reported by many researchers . How-ever, this approach fails with improper face
alignment and occlusions. Some earlier works on extraction of features from specific face
regions mainly determine the facial regions which contributes more toward discrimination of
expressions based on the training data.

In these approaches, the positions and sizes of the facial patches vary according to the
training data. Therefore, it is difficult to conceive a generic system using these approaches.

Proposed System
we propose a deep learning based framework for facial expression recognition, which takes
the above observation into account, and uses attention mechanism to focus on the salient part
of the face. We show that by using attentional convolutional network, even a network with
few layers (less than 10 layers) is able to achieve very high accuracy rate. More specifically,

this paper presents the following contributions


We propose an approach based on an attentional convolutional network, which can focus
on feature-rich parts of the face, and yet, outperform remarkable recent works in accuracy.

16 | P a g e
4. PYTHON

What is Python?

• Python is a open source, object-situated, abnormal state incredible programming


language.

• Developed by Guido van Rossum in the mid-1990s. Named after Monty Python

• Python keeps running on numerous Unix variations, on the Mac, and on Windows
2000 and later.

Python Program

• Python programs are made up of modules.

• Modules contain explanations.

• Statements are articulations.

• Expressions make and procedure objects

Features of Python

Open source: Python is freely accessible open source programming; any one can utilize
source code that doesn't cost anything.

Easy-to-learn: Popular (scripting/augmentation) language, clear and simple punctuation, no


sort revelations, programmed memory the executives, abnormal state information types and
activities, structure to peruse (increasingly English like grammar) and compose (shorter code
contrasted with C, C++, and Java) quick.

17 | P a g e
High-level Language: High-level language (closer to human) alludes to the larger
amount of idea from machine language (for instance low level computing constructs).
Python is a case of an abnormal state language like C, C++, Perl, and Java with low-level
advancement.

Portable: High dimension dialects are compact, which implies they can keep running over
all significant equipment and programming stages with few or no adjustment in source code.
Python is versatile and can be utilized on Linux, Windows, Macintosh, Solaris, FreeBSD,
OS/2, Amiga, AROS, AS/400 and some more.

Object-Oriented: Python is a full-highlighted object-arranged programming language,


with highlights, for example, classes, legacy, protests, and over-burdening.

Python is Interactive: Python has an intuitive comfort where you get a Python brief
(order line) and connect with the mediator legitimately to compose and test your projects.
This is helpful for numerical programming.

Interpreted: Python programs are deciphered, takes source code as information, and after
that accumulates (to convenient byte code) every announcement and executes it right away.
No compelling reason to accumulating or connecting.

18 | P a g e
5. BLOCK DIAGRAM

About Facial Recognization System

19 | P a g e
6. Implementation

We propose an end-to-end deep learning framework, based on attentional convolutional


network, to classify the underlying emotion in the face images. Often times, improving a
deep neural network relies on adding more layers/neurons, facilitating gradient flow in the
network . Given a face image, it is clear that not all parts of the face are important in
detecting a specific emotion, and in many cases, we only need to attend to the specific
regions to get a sense of the underlying emotion. Based on this observation, we add an
attention mechanism, through spatial

transformer network into our framework to focus on important face regions.

20 | P a g e
Description of the corpus The evaluation of the proposed method for facial
expression recognition was performed on two databases:

• The JAFFE database (The Japanese Female Facial Expression) [LBA99a]: is widely used
in the facial expressions research community. It is composed of 213 images of 10 Japanese
women displaying seven facial expressions: the six basic expressions and the neutral one.
Each subject has two to four examples for each facial expression.

• The KANADE database [KCT00]: is composed of 486 video sequences of people


displaying 23 facial expressions within the six basic facial expressions. Each sequence begins
by a neutral expression and finish with the maximum intensity of the expression. For fair
comparison between KANADE and JAFFE databases, we selected from the KANADE
database the first image (neutral expression) and the last three images (with the maximum
intensity of the expression) of 10 people chosen randomly. Moreover, we selected the six
basic facial expressions and the neutral one.

21 | P a g e
Facial Expression Recognition basically performed in three major steps:

• Face detection

• Feature Extraction

• Facial Expression Classification

The primary need of Face Expression Recognition system is Face Detection which is used to
detect the face.

22 | P a g e
The next phase is feature extraction which is used to select and extract relevant features
such as eyes, eyebrow, nose and mouth from face. It is very essential that only those features
should be extracted from image that have highly contribution in expression identification.

The final step is facial expression classification that classifies the facial expressions based
on extraction of relevant features.

23 | P a g e
Modules

Pre-Processing

Image pre-processing often takes in the form of signal conditioning with the segmentation,
location, or tracking of the face or facial parts. A low pass filtering condition is performing
using a 3x3 Gaussian mask and it remove noise in the facial images fo llowed by face
detection for face localization. Viola-Jones technique of Haar-like features is used for the
face detection. It has lower computational complexity and sufficiently accurate detection of
near -upright and near-frontal face images. Using integral image, it can detect face scale and
location in real time. The localized face images are extracted. Then localized face image
scaled to bring it to a common resolution and this made the algorithm shift invariant.

Eye and Nose Localization

The coarse regions of interests are eyes and nose selected using geometric position of face.
Coarse region of interests can also use to reduce the computational complexity and false
detection. Haar classifiers used to both eyes are detected separately and then haar classifier
trained for each eye. The classifiers are returns to the vertices of the rectangular area of
detected eyes. The centers of the both eyes are computed as the mean of these coordinates.
The position of eyes does not change with facial expression s. Similarly, Haar cascades are
used to detected nose position. In this case eyes or nose was not detected using Haar classifier
s. In our experiment, for more than 99.6 percent cases these parts were detected correctly.

Lip Corner Detection

24 | P a g e
Algorithm for Lip Corner Detection

1) Step 1. Select coarse lips ROI using face width and nose position.

2) Step 2. Apply 3x3 Gaussian mask to the lips ROI.

3) Step 3. Apply horizontal sobel operator for edge detection.

4) Step 4. Apply Otsu-thresholding.

5) Step 5. Apply morphological dilation operation.

6) Step 6. Find the connected components in images.

7) Step 7. Remove the spurious connected components using threshold technique to the
number of pixels.

8) Step 8 .Scan the image from the top and select the first connected component as upper lip
position.

9) Step 9.Locate the left and right most positions of connected component as lip corners.

Eyebrow Corner Detection

The coarse region interest is used to both eyebrows are selected. The eyebrows corner
detection following the same procedure as that of upper lip detection. An adaptive threshold
operation is applying before horizontal sobel operator it improved the accuracy of eyebrow
corner detection. The different stage of the process are shown fig

25 | P a g e
26 | P a g e
Extraction of Active Facial Patches

Facial expressions are usually formed by the local facial appearance variations. However, it
is more difficult to automatically localize these local active areas on a facial image. Firstly
the facial image is divided into N local patches, and then local binary pattern (LBP)
features are used to represent the local appearance of the patch. During an expression, the
local patches are extracted from the face image depending upon the position of active facial
muscles. In our experiment are used active facial patches as shown in Fig. 6. Patches are
assigned by number. The patches do not have very fixed position on the face image. The
position of patches varying with different expression. The location depends on the positions
of facial landmarks. The sizes of all facial patches are equal and it was approximately one-
eighth of the width of the face. In Fig 1P,4P,18P and 17P are directly extracted from the
positions of lip corners and inner eyebrows respectively. 15P was at the center of both the
eyes; and 16P was the patch occur just above 15P. 19P and 6P are located in the midway of
eye and nose. 3P was located just below 1P.9Pwas located at the center of position of 3P
and 8P. 14P and 11P are located just below eyes. 2P,10P, and 7Pare clubbed together and
located at one side of nose position. It is similar to 5P 12P, and 13P are located.

27 | P a g e
28 | P a g e
FEATURE EXTRACTION

Local binary pattern is used robust illumination of feature extraction. The original LBP
operator labels the pixels of an image with decimal numbers, called Local Binary Patterns
which encode the local texture around each pixel. It proceeds thus, as illustrated in

PRINCIPAL COMPONENT ANALYSIS FOR FACE RECOGNITION

Algorithm Description
The face is the primary focus of attention in our society. The ability of human beings to
remember and recognize faces is quite a difficult task. Automation of this process finds
practical application in various tasks such as criminal identification, security systems and
human-computer interactions. Various attempts to implement this process have been made
over the years. Eigen face recognition is one such technique that relies on the method of
principal components. This method is based on an information theoretical approach, which
treats faces as intrinsically 2 dimensional entities spanning the feature space. The method
works well under carefully controlled experimental conditions but is error-prone when used
in a practical situation. This method functions by projecting a face onto a multi-dimensional
29 | P a g e
feature space that spans the gamut of human faces. Any face in the feature space is then
characterized by a weight vector obtained by projecting it onto the set of basis images. When
a new face is presented to the system, its weight vector is calculated and a CNN based
classifier is used for classification.

CODE

from statistics import mode

import cv2

from keras.models import load_model

import numpy as np

from utils.datasets import get_labels

from utils.inference import detect_faces

from utils.inference import draw_text

from utils.inference import draw_bounding_box

from utils.inference import apply_offsets

from utils.inference import load_detection_model

from utils.preprocessor import preprocess_input

import warnings

warnings.filterwarnings("ignore")

# parameters for loading data and images

30 | P a g e
detection_model_path =
'trained_models/detection_models/haarcascade_frontalface_default.xml'

emotion_model_path = 'trained_models/emotion_models/fer2013_mini_XCEPTION.102-
0.66.hdf5'

emotion_labels = get_labels('fer2013')

# hyper-parameters for bounding boxes shape

frame_window = 10

emotion_offsets = (20, 40)

# loading models

face_detection = load_detection_model(detection_model_path)

emotion_classifier = load_model(emotion_model_path, compile=False)

# getting input model shapes for inference

emotion_target_size = emotion_classifier.input_shape[1:3]

# starting lists for calculating modes

emotion_window = []

# starting video streaming

cv2.namedWindow('window_frame')

video_capture = cv2.VideoCapture(0)

31 | P a g e
while True:

bgr_image = video_capture.read()[1]

gray_image = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)

rgb_image = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB)

faces = detect_faces(face_detection, gray_image)

for face_coordinates in faces:

x1, x2, y1, y2 = apply_offsets(face_coordinates, emotion_offsets)

gray_face = gray_image[y1:y2, x1:x2]

try:

gray_face = cv2.resize(gray_face, (emotion_target_size))

except:

continue

gray_face = preprocess_input(gray_face, True)

gray_face = np.expand_dims(gray_face, 0)

gray_face = np.expand_dims(gray_face, -1)

emotion_prediction = emotion_classifier.predict(gray_face)

emotion_probability = np.max(emotion_prediction)

emotion_label_arg = np.argmax(emotion_prediction)

emotion_text = emotion_labels[emotion_label_arg]

32 | P a g e
emotion_window.append(emotion_text)

if len(emotion_window) > frame_window:

emotion_window.pop(0)

try:

emotion_mode = mode(emotion_window)

except:

continue

if emotion_text == 'angry':

color = emotion_probability * np.asarray((255, 0, 0))

elif emotion_text == 'sad':

color = emotion_probability * np.asarray((0, 0, 255))

elif emotion_text == 'happy':

color = emotion_probability * np.asarray((255, 255, 0))

elif emotion_text == 'surprise':

color = emotion_probability * np.asarray((0, 255, 255))

else:

color = emotion_probability * np.asarray((0, 255, 0))

color = color.astype(int)

color = color.tolist()

33 | P a g e
draw_bounding_box(face_coordinates, rgb_image, color)

draw_text(face_coordinates, rgb_image, emotion_mode,

color, 0, -45, 1, 1)

bgr_image = cv2.cvtColor(rgb_image, cv2.COLOR_RGB2BGR)

cv2.imshow('window_frame', bgr_image)

if cv2.waitKey(1) & 0xFF == ord('q'):

break

video_capture.release()

cv2.destroyAllWindows()

34 | P a g e
EXPERIMENTAL RESULTS
OUTPUT :

35 | P a g e
Method Success rate Error rate

Traditional CNN 78% 22%

Modified 98% 4%
CNN

36 | P a g e
CONCLUSION

The face is the primary focus of attention in the society. The ability of human beings to
remember and recognize faces is quite robust. Automation of this process finds practical
application in various tasks such as criminal identification, security systems and human-
computer interactions. Various attempts to implement this process have been made over the
years. Eigen face recognition is one such technique that relies on the method of principal
components. This method is based on an information theoretical approach, which treats faces
as intrinsically 2 dimensional entities spanning the feature space. The method works well
under carefully controlled experimental conditions but is error-prone when used in a practical
situation. This method functions by projecting a face onto a multi-dimensional feature space
that spans the gamut of human faces. A set of basis images is extracted from the database
presented to the system by Eigenvalue-Eigenvector decomposition. Any face in the feature
space is then characterized by a weight vector obtained by projecting it onto the set of basis
images. When a new face is presented to the system, its weight vector is calculated and a
CNN based classifier is used for classification

37 | P a g e
References:

[1]Rowley, H., Baluja, S., and Kanade, T.


,‘‘Neural networkbased face detection’’.
IEEE Patt. Anal. Mach. Intell., 1998,

20: pp. 22--38

[2]Sung, K. and Poggio, T.,‘‘Example-


based learning for viewbased face
detection’’. IEEE Patt. Anal. Mach. Intell.,

1998, 20: pp. 39--51

[3]Osuna, E., Freund, R., and Girosi,


F.,’’Training support vector machines: an
application to face detection’’.
Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition.
1997.

[4]Freund, Y. and Schapire, R.E., ‘‘A


decision-theoretic generalization of on-
line learning and an application to
boosting’’. In Computational Learning
Theory: Eurocolt 95, Springer-Verlag,
1995, pp. 23--37

[6]Fleuret, F. and Geman, D., ‘‘Coarse-to-


fine face detection’’. Int.J. Computer
Vision, 2001, 41: pp. 85--107

38 | P a g e