The conditional average risk (loss) of assigning a pattern x to class \omega_j is

r_j(\mathbf{x}) = \sum_{k=1}^{W} L_{kj}\, p(\omega_k \mid \mathbf{x})

where L_{kj} is the loss incurred if x actually came from \omega_k, but was assigned to \omega_j.

Using Bayes' rule, p(A|B)p(B) = p(B|A)p(A), this becomes

r_j(\mathbf{x}) = \frac{1}{p(\mathbf{x})} \sum_{k=1}^{W} L_{kj}\, p(\mathbf{x} \mid \omega_k) P(\omega_k)
Classifiers 20 H. H. Kha
Bayes classifier
Because 1/p(x) is positive and common to all the r_j(x), it can be dropped without affecting the comparison among them:

r_j(\mathbf{x}) = \sum_{k=1}^{W} L_{kj}\, p(\mathbf{x} \mid \omega_k) P(\omega_k)    (1)

The classifier assigns x to the class with the smallest average loss: the Bayes classifier. It assigns x to class \omega_i if r_i(x) < r_j(x), i.e. if

\sum_{k=1}^{W} L_{ki}\, p(\mathbf{x} \mid \omega_k) P(\omega_k) < \sum_{q=1}^{W} L_{qj}\, p(\mathbf{x} \mid \omega_q) P(\omega_q),   j = 1, 2, \ldots, W;\ j \neq i
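Equation (1) and the minimum-risk rule can be sketched numerically; the two-class likelihoods, priors, and loss matrix below are hypothetical values chosen for illustration:

```python
import numpy as np

# Conditional average risk r_j(x) = sum_k L[k, j] p(x|w_k) P(w_k)  (eq. 1)
# L[k, j] is the loss for assigning x to class j when it came from class k.

def bayes_decide(likelihoods, priors, loss):
    """Return the class index j minimizing the average loss r_j(x).

    likelihoods: p(x|w_k), k = 1..W, evaluated at the observed x
    priors:      P(w_k)
    loss:        W x W loss matrix L[k, j]
    """
    weighted = likelihoods * priors        # p(x|w_k) P(w_k) for each k
    risks = loss.T @ weighted              # r_j(x) for each candidate class j
    return int(np.argmin(risks))

# Hypothetical two-class example with a 0-1 loss matrix; the rule then
# reduces to picking the class with the larger p(x|w_j) P(w_j).
loss01 = np.array([[0.0, 1.0],
                   [1.0, 0.0]])
j = bayes_decide(np.array([0.2, 0.6]), np.array([0.5, 0.5]), loss01)
```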
The Loss Function (L_{ij})
Zero loss for a correct decision, and the same nonzero value (say 1) for any incorrect decision:

L_{ij} = 1 - \delta_{ij}    (2)

where \delta_{ij} = 1 if i = j and \delta_{ij} = 0 if i \neq j.
Bayes Classifier
Substituting (2) into (1) yields

r_j(\mathbf{x}) = \sum_{k=1}^{W} (1 - \delta_{kj})\, p(\mathbf{x} \mid \omega_k) P(\omega_k) = p(\mathbf{x}) - p(\mathbf{x} \mid \omega_j) P(\omega_j)

Since p(x) is common to all classes, it is dropped. The classifier assigns x to class \omega_i if, for all j \neq i,

p(\mathbf{x} \mid \omega_i) P(\omega_i) > p(\mathbf{x} \mid \omega_j) P(\omega_j),   j = 1, 2, \ldots, W;\ j \neq i
Decision Function
Using the Bayes classifier with a 0-1 loss function, the decision function for class \omega_j is

d_j(\mathbf{x}) = p(\mathbf{x} \mid \omega_j) P(\omega_j),   j = 1, 2, \ldots, W

Now the questions are:
How to get P(\omega_j)?
How to estimate p(\mathbf{x} \mid \omega_j)?
Using Gaussian Distribution
The most prevalent (assumed) form for p(\mathbf{x} \mid \omega_j) is the Gaussian probability density function. Consider a 1-D problem with two pattern classes (W = 2):

d_j(x) = p(x \mid \omega_j) P(\omega_j) = \frac{1}{\sqrt{2\pi}\,\sigma_j}\, e^{-\frac{(x - m_j)^2}{2\sigma_j^2}}\, P(\omega_j),   j = 1, 2

where m_j is the mean and \sigma_j^2 the variance of class \omega_j.
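A minimal sketch of this 1-D two-class decision function, assuming hypothetical class parameters m_1 = 0, m_2 = 3, \sigma_1 = \sigma_2 = 1, and equal priors (the boundary then sits at the midpoint x = 1.5):

```python
import math

def d_j(x, m, sigma, prior):
    """1-D Gaussian decision function d_j(x) = p(x|w_j) P(w_j)."""
    gauss = math.exp(-(x - m) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
    return gauss * prior

# Hypothetical classes: w1 ~ N(0, 1), w2 ~ N(3, 1), equal priors.
def classify(x):
    return 1 if d_j(x, 0.0, 1.0, 0.5) > d_j(x, 3.0, 1.0, 0.5) else 2
```

With equal priors and equal variances the larger likelihood wins, so the rule is equivalent to choosing the nearer mean.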
Example
With d_j(x) = p(x \mid \omega_j) P(\omega_j), where is the decision boundary if
1. P(\omega_1) = P(\omega_2)
2. P(\omega_1) > P(\omega_2)
3. P(\omega_1) < P(\omega_2)
N-D Gaussian
For the jth pattern class,

p(\mathbf{x} \mid \omega_j) = \frac{1}{(2\pi)^{n/2} |\mathbf{C}_j|^{1/2}}\, e^{-\frac{1}{2}(\mathbf{x} - \mathbf{m}_j)^T \mathbf{C}_j^{-1} (\mathbf{x} - \mathbf{m}_j)}

where

\mathbf{m}_j = \frac{1}{N_j} \sum_{\mathbf{x} \in \omega_j} \mathbf{x},   \mathbf{C}_j = \frac{1}{N_j} \sum_{\mathbf{x} \in \omega_j} \mathbf{x}\mathbf{x}^T - \mathbf{m}_j \mathbf{m}_j^T
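The two sample estimates can be sketched as follows; the four 3-D binary patterns are hypothetical inputs chosen only to exercise the formulas:

```python
import numpy as np

def estimate_class_params(X):
    """Sample estimates for one class from its N_j patterns (rows of X):
    m_j = (1/N_j) sum x,  C_j = (1/N_j) sum x x^T - m_j m_j^T."""
    N = X.shape[0]
    m = X.sum(axis=0) / N
    C = (X.T @ X) / N - np.outer(m, m)   # X.T @ X accumulates sum of x x^T
    return m, C

# Hypothetical 3-D binary patterns for one class.
X1 = np.array([[0, 0, 0],
               [1, 0, 0],
               [1, 0, 1],
               [1, 1, 0]], dtype=float)
m1, C1 = estimate_class_params(X1)
```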
N-D Gaussian
Working with the logarithm of the decision function:

d_j(\mathbf{x}) = \ln[p(\mathbf{x} \mid \omega_j) P(\omega_j)] = \ln P(\omega_j) - \frac{n}{2} \ln 2\pi - \frac{1}{2} \ln|\mathbf{C}_j| - \frac{1}{2}(\mathbf{x} - \mathbf{m}_j)^T \mathbf{C}_j^{-1} (\mathbf{x} - \mathbf{m}_j)

If all covariance matrices are equal (common covariance \mathbf{C}_j = \mathbf{C} for all j), then

d_j(\mathbf{x}) = \ln P(\omega_j) + \mathbf{x}^T \mathbf{C}^{-1} \mathbf{m}_j - \frac{1}{2} \mathbf{m}_j^T \mathbf{C}^{-1} \mathbf{m}_j
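A sketch of the equal-covariance (linear) decision function; the 2-D means, priors, and inverse covariance below are hypothetical:

```python
import numpy as np

def linear_decision(x, m, C_inv, prior):
    """Equal-covariance log decision function:
    d_j(x) = ln P(w_j) + x^T C^{-1} m_j - (1/2) m_j^T C^{-1} m_j"""
    return np.log(prior) + x @ C_inv @ m - 0.5 * (m @ C_inv @ m)

# Hypothetical setup: two classes sharing one (diagonal) covariance.
C_inv = np.array([[2.0, 0.0],
                  [0.0, 0.5]])
m1 = np.array([0.0, 0.0])
m2 = np.array([2.0, 2.0])
```

Note that d_j is linear in x, so with two classes the decision surface d_1(x) = d_2(x) is a line (a hyperplane in n-D).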
For C = I
If \mathbf{C} = \mathbf{I} (identity matrix) and P(\omega_j) = 1/W, we get

d_j(\mathbf{x}) = \mathbf{x}^T \mathbf{m}_j - \frac{1}{2} \mathbf{m}_j^T \mathbf{m}_j,   j = 1, 2, \ldots, W

which is the minimum distance classifier. Gaussian pattern classes satisfying these conditions are spherical clouds of identical shape in N-D.
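The minimum distance rule can be sketched directly; the class means below are hypothetical:

```python
import numpy as np

def min_distance_classify(x, means):
    """Pick the class maximizing d_j(x) = x^T m_j - (1/2) m_j^T m_j.
    Maximizing d_j is equivalent to minimizing ||x - m_j||, since
    ||x - m_j||^2 = x^T x - 2 d_j(x) and x^T x does not depend on j."""
    scores = [x @ m - 0.5 * (m @ m) for m in means]
    return int(np.argmax(scores))

means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]  # hypothetical means
```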
Example

\mathbf{m}_1 = \frac{1}{4} \begin{pmatrix} 3 \\ 1 \\ 1 \end{pmatrix},   \mathbf{m}_2 = \frac{1}{4} \begin{pmatrix} 1 \\ 3 \\ 3 \end{pmatrix},   \mathbf{C}_1 = \mathbf{C}_2 = \frac{1}{16} \begin{pmatrix} 3 & 1 & 1 \\ 1 & 3 & -1 \\ 1 & -1 & 3 \end{pmatrix}

Find the decision boundary.
Example
Assuming P(\omega_1) = P(\omega_2) = 1/2 and dropping \ln P(\omega_j), which is common to all classes:

d_j(\mathbf{x}) = \mathbf{x}^T \mathbf{C}^{-1} \mathbf{m}_j - \frac{1}{2} \mathbf{m}_j^T \mathbf{C}^{-1} \mathbf{m}_j,   \mathbf{C}^{-1} = \begin{pmatrix} 8 & -4 & -4 \\ -4 & 8 & 4 \\ -4 & 4 & 8 \end{pmatrix}

We get

d_1(\mathbf{x}) = 4x_1 - 1.5   and   d_2(\mathbf{x}) = -4x_1 + 8x_2 + 8x_3 - 5.5

The decision surface is

d_1(\mathbf{x}) - d_2(\mathbf{x}) = 8x_1 - 8x_2 - 8x_3 + 4 = 0
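The arithmetic in this example can be checked numerically; this sketch recomputes C^{-1} and both decision functions from the given means and common covariance:

```python
import numpy as np

# Means and common covariance matrix from the example.
m1 = np.array([3.0, 1.0, 1.0]) / 4
m2 = np.array([1.0, 3.0, 3.0]) / 4
C = np.array([[3.0, 1.0, 1.0],
              [1.0, 3.0, -1.0],
              [1.0, -1.0, 3.0]]) / 16
C_inv = np.linalg.inv(C)

def d(x, m):
    # d_j(x) = x^T C^{-1} m_j - (1/2) m_j^T C^{-1} m_j (equal priors dropped)
    return x @ C_inv @ m - 0.5 * (m @ C_inv @ m)
```

Evaluating d at a few points reproduces d_1(x) = 4x_1 - 1.5 and d_2(x) = -4x_1 + 8x_2 + 8x_3 - 5.5, and any point on the plane 8x_1 - 8x_2 - 8x_3 + 4 = 0 scores equally for both classes.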