
Object Detection - Basics¹

Lecture 28

See Sections 10.1.1, 10.1.2, and 10.1.3 in:
Reinhard Klette: Concise Computer Vision. Springer-Verlag, London, 2014

¹ See last slide for copyright information.

Agenda

1 Localization, Classification, Evaluation

2 Descriptors, Classifiers, Learning

3 Performance of Object Detectors

4 Descriptor Example: Histogram of Oriented Gradients


Localization

Localization, classification, and evaluation are three basic steps of an object detection system

Object candidates are localized within a rectangular bounding box


Classification
Localized object candidates are mapped by classification either into detected objects or into rejected candidates

Face detection example: one false-positive and two false-negatives (not counting the side-view of a face)

Evaluation

A true-positive, also called a detection, is a correctly detected object

A false-positive, also called a false detection, occurs if we detect an object where there is none

A false-negative denotes a case where we miss an object

A true-negative describes the cases where non-object regions are correctly identified as non-object regions (typically not of interest)


Which one is a TP, FP, FN, or TN?



Descriptors

Classification assigns membership in pairwise-disjoint classes, which are subsets of R^n, where n > 0 is defined by the used descriptors

A descriptor x = (x_1, ..., x_n) is a point in the n-dimensional descriptor space R^n, representing measured or calculated property values in a given order

Two examples:

n = 128 for SIFT

n = 2 in the example below: the descriptor space is defined by the properties perimeter and area; e.g. descriptor x_1 = (621.605, 10940) for Segment 1


Example: 2D Descriptor Space

[Figure: Left: six labeled segments (1 to 6) in a segmented image. Right: the corresponding 2D descriptor space with axis perimeter (about 200 to 2,600) and axis area (about 10,000 to 80,000); each segment appears as one descriptor point, and a blue line separates the classes +1 and -1]

The blue line defines a binary classifier; it subdivides the descriptor space into two half-planes such that descriptors in one half-plane are assigned the value +1 (i.e. +1 is a class identifier), and descriptors in the other half-plane the value -1


Classifiers

A classifier (i.e. a partitioning of the descriptor space) assigns class numbers to descriptors

Training: using a given set {x_1, ..., x_m} of already-classified descriptors (the learning set) for defining the partitioning (i.e. the classifier)

Application: on descriptors generated for recorded data

General classifier: assigns class numbers 1, 2, ..., k for k > 1 classes, and 0 for not classified

Binary classifier: assigns class number -1 or +1


Weak or Strong Classifiers

A classifier is weak if it does not perform up to expectations (e.g., it might be just a bit better than random guessing)

Multiple weak classifiers can be combined into a strong classifier, aiming at a satisfactory solution of a classification problem

Weak or strong classifiers can be general-case (i.e. multi-class) classifiers or just binary classifiers; just being binary does not define weak

Example: AdaBoost defines a statistical combination of multiple weak classifiers into one strong classifier (see later)
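To make the idea of combining weak classifiers concrete, here is a minimal sketch, assuming the weak classifiers and their weights already come from some training procedure; the decision stumps and weights below are made up, and this shows only the combination step, not AdaBoost training itself:

```python
def strong_classify(x, weak_classifiers, alphas):
    """Weighted vote of weak binary classifiers (each returning -1 or +1);
    the sign of the weighted sum defines the strong classifier's output."""
    votes = sum(alpha * h(x) for h, alpha in zip(weak_classifiers, alphas))
    return 1 if votes >= 0 else -1

# Hypothetical weak classifiers: decision stumps on single descriptor components
weak = [lambda x: 1 if x[0] > 0.5 else -1,
        lambda x: 1 if x[1] > 0.2 else -1,
        lambda x: 1 if x[0] + x[1] > 1.0 else -1]
alphas = [0.4, 0.3, 0.8]   # made-up weights, e.g. from AdaBoost training

print(strong_classify((0.7, 0.1), weak, alphas))   # prints -1 for this input
```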


Example 1: Binary Classifier by Linear Separation

We define a binary classifier by constructing a hyperplane

Π : w^T x + b = 0

in R^n, for n ≥ 1

Vector w ∈ R^n is the weight vector

Real b ∈ R is the bias of Π

Example: for n = 2 or n = 3, w is the gradient or normal orthogonal to the defined line or plane Π, respectively


Example 1: Continued

[Figure: two distributions of 2D descriptors plotted over coordinates x_1 and x_2]

Left: Linearly separable distribution of descriptors, pre-classified to be either in class +1 (green descriptors) or -1 (red descriptors)

Right: Not linearly separable; the sum of the shown distances (black line segments) of misclassified descriptors defines the total error for Π

Example 1: Continued

h(x) = w^T x + b

h(x) ≥ 0: one side of the hyperplane (including the plane itself) defines value +1

h(x) < 0: the other side (not including the plane itself) defines value -1

A linear classifier defined by w and b can be calculated for a distribution of (pre-classified) training descriptors in the n-dimensional descriptor space

The error for a misclassified descriptor x is the perpendicular distance

d_2(x, Π) = |w^T x + b| / ||w||_2

to the hyperplane Π

Task: Calculate Π such that the total error over all misclassified training descriptors is minimized
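A minimal sketch of applying such a linear classifier and of evaluating the total error; the hyperplane (w, b) and the training descriptors below are made-up 2D values, not taken from the lecture:

```python
import numpy as np

def classify(x, w, b):
    """Binary linear classifier: +1 on the side with h(x) = w^T x + b >= 0, else -1."""
    return 1 if np.dot(w, x) + b >= 0 else -1

def distance_to_hyperplane(x, w, b):
    """Perpendicular distance d_2(x, Pi) = |w^T x + b| / ||w||_2."""
    return abs(np.dot(w, x) + b) / np.linalg.norm(w)

def total_error(descriptors, labels, w, b):
    """Sum of distances to Pi over all misclassified training descriptors."""
    return sum(distance_to_hyperplane(x, w, b)
               for x, y in zip(descriptors, labels)
               if classify(x, w, b) != y)

# Made-up pre-classified 2D training descriptors and a candidate hyperplane
X = [np.array([0.2, 0.9]), np.array([0.8, 0.3]), np.array([0.6, 0.4])]
y = [+1, -1, +1]
w, b = np.array([-1.0, 1.0]), 0.0
print(total_error(X, y, w, b))   # only the third descriptor is misclassified
```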

Example 2: Classification by Using a Binary Decision Tree

The classifier is defined by binary decisions (i.e. yes or no) at the split nodes of a tree

Each decision is formalized by a rule, and given input data can be tested whether they satisfy the rule or not

Accordingly, we proceed with the identified successor node in the tree

Each leaf node of the tree finally defines an assignment of the data arriving at this node into classes

Example: each leaf node identifies exactly one class in R^n; see the next slide for n = 2


Example 2: Continued
Left: Decision tree with split rules such as x_1 < 100, x_2 > 60, x_1 > 160, and x_1 + x_2 < 120, each answered by yes or no

Right: Resulting subdivision of the 2D descriptor space (coordinates x_1 and x_2, shown up to 200) into regions, one per leaf node

The tested rules in the shown tree define straight lines in the 2D descriptor space; descriptors arriving at one of the leaf nodes are in one of the shown subsets of R^2
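For illustration, a minimal sketch of a decision tree with rules like those in the figure; the exact tree layout and the class numbers returned at the leaves are hypothetical:

```python
def tree_classify(x1, x2):
    """Toy decision tree over a 2D descriptor (x1, x2); each split node tests
    a rule (a straight line in the descriptor space), each leaf returns a class."""
    if x1 < 100:
        return 1 if x2 > 60 else 2
    elif x1 > 160:
        return 3
    else:
        return 4 if x1 + x2 < 120 else 5

print(tree_classify(50, 80))    # x1 < 100 and x2 > 60            -> class 1
print(tree_classify(110, 5))    # 100 <= x1 <= 160, x1 + x2 < 120 -> class 4
```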

Trees, Forests, Cascades of Binary Classifiers

A single decision tree (defined by at least one split node) can be considered to be an example of a weak classifier

A set of decision trees, called a forest, can then be used for defining a strong classifier

Observation.

A single decision tree provides a way to partition a descriptor space into multiple regions (i.e. classes)

When applying binary classifiers defined by linear separation, we need to combine several of those (e.g. in a cascade) to achieve a similar partitioning of a descriptor space


Learning

Learning is the process of defining or training a classifier based on a set of descriptors

Classification is the actual application of the classifier

During classification we may also identify some misbehavior, and this can lead to another phase of learning

The set of descriptors used for learning may be pre-classified or not

Supervised learning: we have a mechanism for assigning class numbers to descriptors (e.g. manually, based on expertise such as "yes, the driver does have closed eyes in this image")

Unsupervised learning: we do not have prior knowledge about class memberships of descriptors, e.g. for randomly selected patches in an image: is this a typical patch for a pedestrian or not?


Unsupervised Learning: Two Examples

The data distribution in the learning set decides about the classifier

Clustering:
Apply a clustering algorithm to a given set of descriptors for identifying a separation of R^n into classes
Example: analyze the density of the distribution of the given descriptors in R^n; a region having a dense distribution defines a seed point of one class, and then we assign all descriptors to the identified seed points by applying, for example, the nearest-neighbor rule (see the sketch below)

Learn Rules at Split Nodes in a Decision Tree:
Learn decision rules at split nodes, e.g. by having a general scheme for how to define such rules, and optimise parameters by maximising the information gain at this split node (e.g. an equal number of training descriptors passing to either the left or the right successor)
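A minimal sketch of the clustering example just described, assuming the seed points have already been identified by some density analysis; the seeds and descriptors here are made-up 2D values:

```python
import numpy as np

def assign_to_seeds(descriptors, seeds):
    """Nearest-neighbor rule: assign each descriptor the index of the closest
    seed point; that index serves as the (unsupervised) class number."""
    return [int(np.argmin([np.linalg.norm(x - s) for s in seeds]))
            for x in descriptors]

# Hypothetical seed points from a density analysis of the learning set
seeds = [np.array([0.1, 0.2]), np.array([0.8, 0.7])]
descriptors = [np.array([0.15, 0.25]), np.array([0.9, 0.6]), np.array([0.4, 0.5])]
print(assign_to_seeds(descriptors, seeds))   # [0, 1, 0] for these values
```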

Positive (for Pedestrian) and Negative Class Examples


Combined Learning Approaches

There are also cases where we may combine supervised learning with
strategies known from unsupervised learning
Example
Supervised: Decide whether a given bounding box shows a pedestrian, or
decide for a patch, being a subwindow of a bounding box, whether it
possibly belongs to a pedestrian
Unsupervised: Generate a decision tree, e.g. by maximising information
gain at split nodes
Result: Assign class probabilities to a leaf node in the generated tree
according to percentages of pre-classified descriptors arriving at this leaf
node
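A minimal sketch of the last step, under the assumption that some routing function mapping a descriptor to a leaf identifier is already available; the routing function, descriptors, and labels below are made up:

```python
from collections import Counter, defaultdict

def leaf_class_probabilities(descriptors, labels, leaf_id):
    """Route each pre-classified descriptor to its leaf and turn the label
    counts arriving at each leaf into class probabilities."""
    counts = defaultdict(Counter)
    for x, y in zip(descriptors, labels):
        counts[leaf_id(x)][y] += 1
    return {leaf: {cls: n / sum(c.values()) for cls, n in c.items()}
            for leaf, c in counts.items()}

# Hypothetical routing function (stands in for a trained tree) and training data
leaf_id = lambda x: 'left' if x[0] < 0.5 else 'right'
X = [(0.2, 0.1), (0.3, 0.9), (0.7, 0.4), (0.8, 0.8)]
y = ['pedestrian', 'background', 'pedestrian', 'pedestrian']
print(leaf_class_probabilities(X, y, leaf_id))
# {'left': {'pedestrian': 0.5, 'background': 0.5}, 'right': {'pedestrian': 1.0}}
```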



Object Detector and Measures

An object detector is defined by applying a classifier to an object detection problem

We assume that any decision made can be evaluated as being either correct or false

Evaluations of designed object detectors are required to compare their performance under particular conditions

There are common measures in pattern recognition and information retrieval for the performance evaluation of classifiers


Basic Definitions

Let tp and fp denote the numbers of true-positives and false-positives, respectively

Let tn and fn denote the numbers of true-negatives and false-negatives, respectively

What are the numbers for the earlier face-detection example ("Which one is a TP, FP, FN, or TN?")?

Note: the image alone does not indicate how many non-object regions have been analyzed (and correctly identified as containing no faces); thus we cannot specify the number tn; we need to analyze the applied classifier for obtaining tn


PR, RC, MR, and FPPI


Precision is the ratio of true-positives to all detections

Recall (or sensitivity) is the ratio of true-positives to all potentially possible detections

PR = tp / (tp + fp)   and   RC = tp / (tp + fn)

PR = 1: no false-positive is detected

RC = 1: all visible objects are detected & there is no false-negative

Miss rate is the ratio of false-negatives to all objects

False-positives per image is the ratio of false-positives to all detected objects

MR = fn / (tp + fn) = 1 - RC   and   FPPI = fp / (tp + fp) = 1 - PR

MR = 0: all visible objects are detected

FPPI = 0: detected objects are correctly classified

TNR and AC

tn is not a common entry for performance measures, but, if available, then we also have TNR and AC:

True-negative rate (or specificity) is the ratio of true-negatives to all decisions in no-object regions

Accuracy is the ratio of correct decisions to all decisions

TNR = tn / (tn + fp)   and   AC = (tp + tn) / (tp + tn + fp + fn)
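A minimal sketch that computes the above measures from the four counts; the counts in the usage line are made up:

```python
def detection_measures(tp, fp, fn, tn=None):
    """PR, RC, MR, and FPPI from the counts of true-positives, false-positives,
    and false-negatives; TNR and AC are added only if tn is available."""
    measures = {
        'PR': tp / (tp + fp),
        'RC': tp / (tp + fn),
        'MR': fn / (tp + fn),
        'FPPI': fp / (tp + fp),
    }
    if tn is not None:
        measures['TNR'] = tn / (tn + fp)
        measures['AC'] = (tp + tn) / (tp + tn + fp + fn)
    return measures

# Made-up counts, e.g. from evaluating a face detector on a small test set
print(detection_measures(tp=8, fp=1, fn=2, tn=50))
```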


Detected?
How do we decide whether a detected object is a true-positive?

Assume: objects in images have been locally identified (e.g. manually) by bounding boxes, serving as the ground truth

Detected objects are matched with these ground-truth boxes by calculating the ratio of areas of overlapping regions

a_o = A(D ∩ T) / A(D ∪ T)

where A denotes the area of a region in the image, D is the detected bounding box of the object, and T is the matched ground-truth bounding box

If a_o is at least a given threshold, say 0.5, the detected object is taken as a true-positive

If there is more than one possible match for a detected bounding box, then use the one with the largest a_o value
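A minimal sketch of this overlap test for axis-aligned boxes given as (x, y, width, height); the box coordinates and the threshold default are illustrative assumptions:

```python
def overlap_ratio(d, t):
    """a_o = A(D ∩ T) / A(D ∪ T) for two axis-aligned boxes (x, y, width, height)."""
    dx, dy, dw, dh = d
    tx, ty, tw, th = t
    iw = max(0.0, min(dx + dw, tx + tw) - max(dx, tx))   # intersection width
    ih = max(0.0, min(dy + dh, ty + th) - max(dy, ty))   # intersection height
    inter = iw * ih
    union = dw * dh + tw * th - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(detected, ground_truth_boxes, threshold=0.5):
    """Match the detection against all ground-truth boxes and keep the largest
    a_o; the detection is a true-positive if that value reaches the threshold."""
    best = max((overlap_ratio(detected, t) for t in ground_truth_boxes), default=0.0)
    return best >= threshold

# Made-up example: one detection, two ground-truth boxes
print(is_true_positive((10, 10, 50, 100), [(15, 12, 50, 100), (200, 50, 40, 80)]))  # True
```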


Scanning an Image for Object Candidates

1 A window of the size of the expected bounding box scans through the image

2 The scan stops at potential object candidates

3 If a potential bounding box has been identified, a process for descriptor calculation starts

The histogram of oriented gradients (HoG) is a common way to derive a descriptor for the bounding box of an object candidate
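A minimal sketch of such a sliding-window scan; the window size, the step size, and the simple candidate test are hypothetical placeholders (a real detector would score each window with a classifier applied to, e.g., its HoG descriptor):

```python
import numpy as np

def scan_image(image, win_h, win_w, step, is_candidate):
    """Slide a window of the expected bounding-box size over the image and
    collect all window positions flagged as potential object candidates."""
    H, W = image.shape[:2]
    candidates = []
    for y in range(0, H - win_h + 1, step):
        for x in range(0, W - win_w + 1, step):
            window = image[y:y + win_h, x:x + win_w]
            if is_candidate(window):
                candidates.append((x, y, win_w, win_h))
    return candidates

# Hypothetical candidate test: enough average gradient energy in the window
gradient_energy = lambda w: np.abs(np.diff(w, axis=1)).mean() > 0.3
image = np.random.rand(128, 256)          # made-up gray-level image
boxes = scan_image(image, win_h=64, win_w=32, step=16, is_candidate=gradient_energy)
print(len(boxes), "candidate boxes")
```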


Bounding Box, Blocks, and Cells


A bounding box (here: of a pedestrian) is subdivided into blocks, and each
block into smaller cells for calculating the HoG

Yellow solid or dashed blocks are subdivided into red cells; a block moves
left to right, top down, through a bounding box
Right: Magnitudes of gradient vectors

Algorithm for Calculating the HoG Descriptor

1 Preprocessing. Intensity normalization and smoothing

2 Calculate an edge map. Gradient magnitudes and gradient angles for each pixel, generating a magnitude map I_m and an angle map I_a

3 Spatial binning.
  1 Group pixels into non-overlapping cells (e.g. 8 × 8)
  2 Accumulate the magnitude values of I_m into direction bins (e.g., nine bins for intervals of 20° each) to obtain a voting vector for each cell

4 Normalize voting values for generating a descriptor.
  1 Group cells (e.g., 2 × 2) into one block
  2 Normalize the voting vectors over each block, and combine them into one block vector

5 Concatenation. Augment all block vectors consecutively; this produces the final HoG descriptor
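A compact sketch of these steps under simplifying assumptions: unsigned gradient angles binned into nine 20° bins, 8 × 8 cells, non-overlapping 2 × 2-cell blocks with L2 normalization, and no preprocessing; real HoG implementations (e.g. Dalal and Triggs) use overlapping blocks and further refinements:

```python
import numpy as np

def hog_descriptor(gray, cell=8, block=2, bins=9):
    """Sketch of a HoG computation: gradient maps, per-cell orientation
    histograms, per-block normalization, and concatenation."""
    # Step 2: magnitude map Im and angle map Ia (unsigned angles in [0, 180))
    gy, gx = np.gradient(gray.astype(float))
    Im = np.hypot(gx, gy)
    Ia = np.rad2deg(np.arctan2(gy, gx)) % 180.0

    # Step 3: accumulate magnitudes into per-cell direction bins (voting vectors)
    ch, cw = gray.shape[0] // cell, gray.shape[1] // cell
    hist = np.zeros((ch, cw, bins))
    for i in range(ch):
        for j in range(cw):
            m = Im[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            a = Ia[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            b = np.minimum((a / (180.0 / bins)).astype(int), bins - 1)
            for k in range(bins):
                hist[i, j, k] = m[b == k].sum()

    # Steps 4-5: group cells into blocks, normalize each block vector, concatenate
    blocks = []
    for i in range(0, ch - block + 1, block):
        for j in range(0, cw - block + 1, block):
            v = hist[i:i+block, j:j+block].ravel()
            blocks.append(v / (np.linalg.norm(v) + 1e-6))
    return np.concatenate(blocks)

# Made-up 128 x 64 bounding-box image (e.g. a pedestrian candidate)
print(hog_descriptor(np.random.rand(128, 64)).shape)   # (1152,) with these settings
```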

Two Examples

The length of the vectors in nine different directions in each cell represents the accumulated magnitude of gradient vectors for one of those nine directions

Copyright Information

This slide show was prepared by Reinhard Klette with kind permission from Springer Science+Business Media B.V.

The slide show can be used freely for presentations. However, all the material is copyrighted.

R. Klette. Concise Computer Vision. © Springer-Verlag London, 2014.

In case of citation: just cite the book, that's fine.
