Masks are rectangular windows, each of which contains a black-and-white pattern.
Detection algorithm
Masks are placed over different parts of the frame so that the software can determine whether the
frame contains a face. Applying a mask to a portion of a snapshot yields a numeric value that
measures how well that portion matches the mask. The software sums the brightness of all pixels
under the white part of the mask, sums the brightness of all pixels under the black part, and
calculates the difference between the two sums. The result is then compared to a threshold value.
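The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the software's actual code: the function names, the `(top, left, height, width)` rectangle layout, the pixel values, and the threshold are all assumptions made for the example.

```python
def region_sum(image, top, left, height, width):
    """Sum the brightness of all pixels inside one rectangle."""
    return sum(
        image[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    )


def mask_response(image, white_rect, black_rect):
    """Brightness under the white part minus brightness under the black part."""
    return region_sum(image, *white_rect) - region_sum(image, *black_rect)


def mask_fires(image, white_rect, black_rect, threshold):
    """Compare the mask response to a threshold value."""
    return mask_response(image, white_rect, black_rect) > threshold


# Hypothetical 4x4 grayscale patch: bright upper half, dark lower half.
patch = [
    [200, 200, 200, 200],
    [200, 200, 200, 200],
    [50, 50, 50, 50],
    [50, 50, 50, 50],
]
white = (0, 0, 2, 4)  # upper two rows
black = (2, 0, 2, 4)  # lower two rows

response = mask_response(patch, white, black)  # 1600 - 400 = 1200
fires = mask_fires(patch, white, black, threshold=1000)
```

Here the mask matches the patch well, so the response is large and exceeds the illustrative threshold.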
This approach is popular because the calculation is fast and simple: only three operations are
needed for each rectangular mask element.
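The text does not say how a rectangle sum is reduced to three operations. One common way, assumed here, is an integral image (summed-area table): once the table is built, any rectangle sum needs four lookups combined with three additions or subtractions. The function names and the sample patch below are illustrative.

```python
def integral_image(image):
    """Build a summed-area table padded with a leading row and column of zeros.

    table[r][c] holds the sum of all pixels above row r and left of column c.
    """
    rows, cols = len(image), len(image[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_total = 0
        for c in range(cols):
            row_total += image[r][c]
            table[r + 1][c + 1] = table[r][c + 1] + row_total
    return table


def rect_sum(table, top, left, height, width):
    """Rectangle sum from four table lookups and three arithmetic operations."""
    bottom, right = top + height, left + width
    return (
        table[bottom][right]
        - table[top][right]
        - table[bottom][left]
        + table[top][left]
    )


# Hypothetical 4x4 grayscale patch: bright upper half, dark lower half.
patch = [
    [200, 200, 200, 200],
    [200, 200, 200, 200],
    [50, 50, 50, 50],
    [50, 50, 50, 50],
]
table = integral_image(patch)
white_sum = rect_sum(table, 0, 0, 2, 4)
black_sum = rect_sum(table, 2, 0, 2, 4)
difference = white_sum - black_sum
```

The table is computed once per frame, so the cost of each mask no longer depends on the size of its rectangles.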
Face detector training
Human face images have several distinguishing characteristics:
1. Seen from the front, a human face has dark and light zones: the eyes and lips are
dark, while the forehead, cheeks, and chin are light.
2. Faces are similar to each other. They differ in details but, in general, human faces are of the
same type.
This means that you can select a set of masks (a Haar cascade) and create a classifier (an
algorithm that detects a particular object in a snapshot). This classifier takes these features into
account and detects faces as accurately as possible.
During mask selection, the classifier learns to improve its detection accuracy. The AdaBoost
algorithm is used to train the classifier and improve its performance. A training sample is created
for machine learning purposes. It includes a large number of pictures of people. Each of the
classifier's masks is applied to it in turn.
A positive training sample consists of a large number of pictures of people's faces.
There can be a huge number of masks with different variations of black and white patterns. Each
mask yields a certain value when it is compared with an image. If this value is above a threshold,
the mask indicates that a human face is present in the frame. Along with a positive training sample
containing images of human faces, a negative sample is created as well. The negative sample contains
no images of human faces and is also used to train the classifier. When a mask is compared with an
image from the negative sample, the returned value should be below the threshold.
If a mask makes a mistake on an image, the weight (importance) of that image increases for the
other masks.
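The reweighting step can be sketched with the standard discrete AdaBoost update. The text does not give the exact formula, so the update rule below is an assumption, and `reweight` is a hypothetical helper name.

```python
import math


def reweight(weights, correct):
    """Raise the weights of misclassified images and lower the rest.

    weights: current image weights, summing to 1.
    correct: one boolean per image, True if the mask classified it correctly.
    Returns the normalized new weights and the mask's vote weight (alpha).
    """
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    if not 0.0 < error < 1.0:
        raise ValueError("weighted error must be strictly between 0 and 1")
    alpha = 0.5 * math.log((1.0 - error) / error)
    updated = [
        w * math.exp(-alpha if ok else alpha)
        for w, ok in zip(weights, correct)
    ]
    total = sum(updated)
    return [w / total for w in updated], alpha


# Four training images with equal weights; the mask gets the last one wrong.
weights = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, True, False]
new_weights, alpha = reweight(weights, correct)
```

After the update, the misclassified image carries more weight than any correctly classified one, so the next mask is selected with more emphasis on getting that image right.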
Based on its comparisons with the positive and negative samples, a mask is placed into the
cascade classifier. It is stored with a ratio describing its face detection error and with the
proportion of images on which it made no mistake. Taking the masks' individual error ratios into
account, the face detection module combines the values of all masks in the cascade classifier and
compares the result with a threshold value. If the resulting value is greater than the threshold,
the face detector signals that a human face is present in the frame.
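A minimal sketch of how the final decision might combine the masks, assuming a weighted vote in which each mask's weight reflects how reliable it was during training. The `(mask_threshold, weight)` pair layout, the function names, and all numbers are illustrative assumptions, not values from the text.

```python
def combined_score(responses, masks):
    """Add up the weights of all masks whose response exceeds their threshold.

    responses: one numeric mask response per mask.
    masks: one (mask_threshold, weight) pair per mask.
    """
    return sum(
        weight
        for response, (mask_threshold, weight) in zip(responses, masks)
        if response > mask_threshold
    )


def face_present(responses, masks, decision_threshold):
    """Signal a face when the weighted vote exceeds the decision threshold."""
    return combined_score(responses, masks) > decision_threshold


# Three hypothetical masks: a reliable one and two weaker ones.
masks = [(1000, 1.0), (500, 0.5), (300, 0.25)]
responses = [1200, 450, 350]  # the first and third masks fire

# Assumed decision threshold: half of the total weight (1.75 / 2).
score = combined_score(responses, masks)
detected = face_present(responses, masks, decision_threshold=0.875)
```

Weighting the votes lets a mask with a low error ratio outweigh several less reliable masks that disagree with it.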
More often than not, a training sample contains frontal views of faces, which are the easiest to
detect. However, a classifier can be trained to detect faces in other positions by using an
appropriate training sample.