Uncertainty
Let action A_t = leave for airport t minutes before flight.
Will A_t get me there on time?
Problems:
1. partial observability (road state, other drivers' plans, etc.)
2. noisy sensors (traffic reports)
3. uncertainty in action outcomes (flat tire, etc.)
4. immense complexity of modelling and predicting traffic
Hence a purely logical approach either
1) risks falsehood: "A_25 will get me there on time", or
2) leads to conclusions that are too weak for decision making:
"A_25 will get me there on time if there's no accident on the bridge and it doesn't rain and my tires remain intact …"
(A_1440 might reasonably be said to get me there on time, but I'd have to stay overnight in the airport.)
Handling Uncertain Knowledge
Consider trying to write diagnostic rules in first-order logic:
∀p Symptom(p, Toothache) ⇒ Disease(p, Cavity)
Not all patients with toothaches have cavities:
∀p Symptom(p, Toothache) ⇒ Disease(p, Cavity) ∨ Disease(p, GumDisease) ∨ Disease(p, Abscess) ∨ …
We have to add an almost unlimited list of possible causes.
Trying a causal rule instead:
∀p Disease(p, Cavity) ⇒ Symptom(p, Toothache)
Not all cavities cause pain.
Subjective or Bayesian probability relates propositions to one's own state of knowledge,
e.g., P(A_25 | no reported accidents) = 0.06.
These are not claims of some probabilistic tendency in the current situation (but might be learned from past experience of similar situations).
Probabilities of propositions change with new evidence:
e.g., P(A_25 | no reported accidents, 5 a.m.) = 0.15
Making Decisions under Uncertainty
Suppose I believe the following:
P(A_25 gets me there on time | …) = 0.04
P(A_90 gets me there on time | …) = 0.70
P(A_120 gets me there on time | …) = 0.95
P(A_1440 gets me there on time | …) = 0.9999
Which action to choose?
Depends on my preferences for missing flight vs. airport
cuisine, etc.
Utility theory is used to represent and infer preferences
Decision theory = probability theory + utility theory
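To make the choice concrete, here is a minimal expected-utility sketch in Python. The on-time probabilities are the ones above; the utility numbers (value of making the flight, cost of waiting at the airport) are invented purely for illustration.

    # Sketch: choosing a departure time by maximum expected utility.
    # On-time probabilities follow the slide; the utilities are invented.
    actions = {  # action: (P(on time), utility if on time, utility if late)
        "A_25":   (0.04,  100, -50),
        "A_90":   (0.70,   90, -50),
        "A_120":  (0.95,   85, -50),
        "A_1440": (0.9999, 20, -50),
    }

    def expected_utility(p, u_ontime, u_late):
        return p * u_ontime + (1 - p) * u_late

    for a, args in actions.items():
        print(a, round(expected_utility(*args), 2))
    best = max(actions, key=lambda a: expected_utility(*actions[a]))
    print("best:", best)   # A_120 under these invented utilities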
Probability Basics
Begin with a set Ω, the sample space; e.g., the 6 possible rolls of a die. ω ∈ Ω is a sample point / possible world / atomic event.
A probability space or probability model is a sample space with an assignment P(ω) for every ω ∈ Ω s.t.
0 ≤ P(ω) ≤ 1
Σ_ω P(ω) = 1
e.g., P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6
An event A is any subset of Ω:
P(A) = Σ_{ω ∈ A} P(ω)
E.g., P(die roll < 4) = P(1) + P(2) + P(3) = 1/6 + 1/6 + 1/6 = 1/2
Random Variables
A random variable is a function from sample points to some range, e.g., the reals or Booleans; e.g., Odd(1) = true.
P induces a probability distribution for any r.v. X:
P(X = x_i) = Σ_{ω: X(ω) = x_i} P(ω)
e.g., P(Odd = true) = P(1) + P(3) + P(5) = 1/6 + 1/6 + 1/6 = 1/2
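These definitions translate directly into code. A minimal Python sketch of the die model (my own illustration, not from the slides): a probability assignment over sample points, an event probability as a sum, and the distribution induced for a random variable.

    from fractions import Fraction

    # A probability model for one die roll: P(omega) = 1/6 per sample point.
    P = {omega: Fraction(1, 6) for omega in range(1, 7)}
    assert sum(P.values()) == 1          # the assignments sum to 1

    def prob(event):
        # P(A) = sum of P(omega) over the sample points in the event A.
        return sum(P[omega] for omega in event)

    print(prob({w for w in P if w < 4}))      # P(die roll < 4) = 1/2

    # A random variable is a function from sample points to some range;
    # P induces a distribution for it: P(Odd = true) = P(1)+P(3)+P(5).
    def odd(omega):
        return omega % 2 == 1

    print(prob({w for w in P if odd(w)}))     # P(Odd = true) = 1/2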
Propositions
Think of a proposition as the event (set of sample points) where the proposition is true.
Given Boolean random variables A and B:
event a = set of sample points where A(ω) = true
event ¬a = set of sample points where A(ω) = false
event a ∧ b = points where A(ω) = true and B(ω) = true
Often in AI applications, the sample points are defined
by the values of a set of random variables, i.e., the
sample space is the Cartesian product of the ranges of
the variables
With Boolean variables, sample point = propositional logic model, e.g., A = true, B = false, or a ∧ ¬b.
Proposition = disjunction of atomic events in which it is
true, e.g.,
(a ∨ b) ≡ (¬a ∧ b) ∨ (a ∧ ¬b) ∨ (a ∧ b)
⇒ P(a ∨ b) = P(¬a ∧ b) + P(a ∧ ¬b) + P(a ∧ b)
Why Use Probability?
The definitions imply that certain logically related events must have related probabilities.
E.g., P(a ∨ b) = P(a) + P(b) − P(a ∧ b)
de Finetti (1931): an agent who bets according to
probabilities that violate these axioms can be forced to
bet so as to lose money regardless of outcome
Syntax for Propositions
Propositional or Boolean random variables
e.g., Cavity (do I have a cavity?)
Cavity = true is a proposition, also written cavity
Discrete random variables (finite or infinite)
e.g., Weather is one of ⟨sunny, rain, cloudy, snow⟩
Weather = rain is a proposition
Values must be exhaustive and mutually exclusive
Continuous random variables (bounded or unbounded)
e.g., Temp = 21.6; also allow, e.g., Temp < 22.0
Arbitrary Boolean combinations of basic propositions
Prior Probability
Prior or unconditional probabilities of propositions,
e.g., P(Cavity = true) = 0.1 and P(Weather = sunny) = 0.72,
correspond to belief prior to arrival of new evidence
Probability distribution gives values for all possible assignments:
P(Weather) = ⟨0.72, 0.1, 0.08, 0.1⟩
(normalized, i.e., sums to 1)
Joint probability distribution for a set of r.v.s gives the probability of every atomic event on those r.v.s (i.e., every sample point):
P(Weather, Cavity) = a 4 × 2 matrix of values
Every question about a domain can be answered by the joint distribution because every event is a sum of sample points
Probability for Continuous Variables
Express the distribution as a parameterized function of value:
P(X = x) = U[18, 26](x) = uniform density between 18 and 26
[Figure: the uniform density, constant height 0.125 between 18 and 26]
Here P is a density; it integrates to 1.
P(X = 20.5) = 0.125 really means
lim_{dx→0} P(20.5 ≤ X ≤ 20.5 + dx) / dx = 0.125
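A quick numerical illustration (my own sketch) of reading a density: the interval probability divided by dx recovers the density value, and the density integrates to 1.

    # Uniform density U[18,26]: height 1/(26-18) = 0.125 on [18, 26].
    def density(x, lo=18.0, hi=26.0):
        return 1.0 / (hi - lo) if lo <= x <= hi else 0.0

    # "P(X = 20.5) = 0.125" means the interval probability per unit width
    # tends to 0.125 as the interval shrinks:
    for dx in (1.0, 0.1, 0.001):
        interval_prob = 0.125 * dx            # P(20.5 <= X <= 20.5 + dx)
        print(dx, interval_prob / dx)         # -> 0.125 each time

    # The density integrates to 1 (crude Riemann sum over [18, 26]):
    dx = 0.001
    print(sum(density(18 + i * dx) * dx for i in range(8000)))   # ~= 1.0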
Gaussian Density
P(x) = (1 / (σ√(2π))) e^{−(x−μ)² / (2σ²)}
[Figure: the bell-shaped normal density curve]
Conditional Probability
Conditional or posterior probabilities,
e.g., P(cavity | toothache) = 0.8,
i.e., given that toothache is all I know
NOT "if toothache then 80% chance of cavity"
Notation for conditional distributions:
P(Cavity | Toothache) = a 2-element vector of 2-element vectors
If we know more, e.g., cavity is also given, then we have
P(cavity | toothache, cavity) = 1
Note: the less specific belief remains valid after more evidence arrives, but is not always useful.
New evidence may be irrelevant, allowing simplification, e.g.,
P(cavity | toothache, 49ersWin) = P(cavity | toothache) = 0.8
This kind of inference, sanctioned by domain knowledge, is crucial.
Conditional Probability 2
Definition of conditional probability:
P(a | b) = P(a ∧ b) / P(b)  if P(b) ≠ 0
Product rule gives an alternative formulation:
P(a ∧ b) = P(a | b) P(b) = P(b | a) P(a)
A general version holds for whole distributions, e.g.,
P(Weather, Cavity) = P(Weather | Cavity) P(Cavity)
(View as a 4 × 2 set of equations, not matrix multiplication.)
The chain rule is derived by successive application of the product rule:
P(X_1, …, X_n) = P(X_1, …, X_{n−1}) P(X_n | X_1, …, X_{n−1})
= P(X_1, …, X_{n−2}) P(X_{n−1} | X_1, …, X_{n−2}) P(X_n | X_1, …, X_{n−1})
= …
= ∏_{i=1}^{n} P(X_i | X_1, …, X_{i−1})
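The chain rule is easy to check numerically. A small Python sketch (the joint values are made up, for illustration only):

    import itertools

    # An arbitrary joint over Booleans X1, X2, X3 (made-up numbers, sum to 1).
    values = [0.10, 0.05, 0.20, 0.15, 0.05, 0.10, 0.25, 0.10]
    joint = dict(zip(itertools.product([True, False], repeat=3), values))

    def marginal(*prefix):
        # P(X1 = prefix[0], ..., Xk = prefix[k-1]), summing out the rest.
        return sum(p for a, p in joint.items() if a[:len(prefix)] == prefix)

    # Chain rule: P(x1, x2, x3) = P(x1) P(x2 | x1) P(x3 | x1, x2)
    x1, x2, x3 = True, False, True
    chain = (marginal(x1)
             * marginal(x1, x2) / marginal(x1)
             * marginal(x1, x2, x3) / marginal(x1, x2))
    print(chain, joint[(x1, x2, x3)])   # equal, as the product rule promises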
Inference by Enumeration
Start with the joint distribution:

              toothache             ¬toothache
            catch   ¬catch        catch   ¬catch
  cavity     .108    .012          .072    .008
 ¬cavity     .016    .064          .144    .576

For any proposition φ, sum the atomic events where it is true:
P(φ) = Σ_{ω: ω ⊨ φ} P(ω)
E.g., P(toothache) = 0.108 + 0.012 + 0.016 + 0.064 = 0.2
catch catch catch catch
L
cavity .108 .012 .072 .008
cavity .016 .064 .144 .576
L
"
<% s
t
8%
4
P
P
O N
T
S
@M
@
S
&
Z
&
Z
&
Z
.
Can also compute conditional probabilities:
P(¬cavity | toothache) = P(¬cavity ∧ toothache) / P(toothache)
= (0.016 + 0.064) / (0.108 + 0.012 + 0.016 + 0.064) = 0.4
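These table lookups mechanize directly. A minimal Python sketch (my own, following the slide's "sum the atomic events" rule):

    # The joint distribution, indexed by (cavity, toothache, catch).
    joint = {
        (True,  True,  True):  0.108, (True,  True,  False): 0.012,
        (True,  False, True):  0.072, (True,  False, False): 0.008,
        (False, True,  True):  0.016, (False, True,  False): 0.064,
        (False, False, True):  0.144, (False, False, False): 0.576,
    }

    def prob(phi):
        # Sum the atomic events where the proposition phi holds.
        return sum(p for (cav, ache, catch), p in joint.items()
                   if phi(cav, ache, catch))

    print(prob(lambda c, t, k: t))          # P(toothache)           = 0.2
    print(prob(lambda c, t, k: c or t))     # P(cavity or toothache) = 0.28
    print(prob(lambda c, t, k: (not c) and t)
          / prob(lambda c, t, k: t))        # P(~cavity | toothache) = 0.4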
catch catch catch catch
L
cavity .108 .012 .072 .008
cavity .016 .064 .144 .576
L
P P
4
4
L
L
P
P
P
P
O2 N
O N
T
T
@M
@
S
@M
@
S
P P
^
P
L
L
P
P
P
P
O N
2O N
T
@
T
@
.
@M
@
S
4
S
S
@M
@
S
4
S
S
A
^Q
V
_
Z
&
.
Q
V
V
&
Z
J
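The same computation with an explicit normalization constant (a sketch, reusing the joint table above):

    # Joint table, indexed by (cavity, toothache, catch).
    joint = {
        (True,  True,  True):  0.108, (True,  True,  False): 0.012,
        (True,  False, True):  0.072, (True,  False, False): 0.008,
        (False, True,  True):  0.016, (False, True,  False): 0.064,
        (False, False, True):  0.144, (False, False, False): 0.576,
    }

    def normalize(vec):
        # Multiply by alpha = 1 / sum so the entries sum to 1.
        alpha = 1.0 / sum(vec)
        return [alpha * v for v in vec]

    # P(Cavity | toothache): fix toothache = true, sum out catch.
    unnormalized = [sum(joint[(cav, True, catch)] for catch in (True, False))
                    for cav in (True, False)]
    print(unnormalized)              # [0.12, 0.08]
    print(normalize(unnormalized))   # [0.6, 0.4]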
Inference by Enumeration, contd.
Let X be all the variables. Typically, we want the posterior joint distribution of the query variables Y given specific values e for the evidence variables E.
Let the hidden variables be H = X − Y − E.
Then the required summation of joint entries is done by summing out the hidden variables:
P(Y | E = e) = α P(Y, E = e) = α Σ_h P(Y, E = e, H = h)
The terms in the summation are joint entries because Y, E, and H together exhaust the set of random variables. (A generic code sketch follows the list below.)
Obvious problems:
1. Worst-case time complexity O(dⁿ), where d is the largest arity
2. Space complexity O(dⁿ) to store the joint distribution
3. How to find the numbers for O(dⁿ) entries?
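As referenced above, a generic sketch of this summation (my own illustration; the encoding of assignments as tuples of (variable, value) pairs is one possible layout):

    import itertools

    def enumerate_query(joint, variables, query_var, evidence):
        # P(query_var | evidence): fix the evidence, sum out the hidden
        # variables, then normalize. joint maps full assignments (tuples
        # of (variable, value) pairs, in the order of `variables`) to numbers.
        dist = {}
        for y in variables[query_var]:
            total = 0.0
            for key in itertools.product(
                    *[[(v, x) for x in dom] for v, dom in variables.items()]):
                a = dict(key)
                if a[query_var] == y and all(a[e] == val
                                             for e, val in evidence.items()):
                    total += joint[key]
            dist[y] = total
        alpha = 1.0 / sum(dist.values())       # normalization constant
        return {y: alpha * p for y, p in dist.items()}

    # The dentistry joint from the earlier table:
    variables = {"Cavity": [True, False], "Toothache": [True, False],
                 "Catch": [True, False]}
    keys = itertools.product(*[[(v, x) for x in dom]
                               for v, dom in variables.items()])
    joint = dict(zip(keys, [0.108, 0.012, 0.072, 0.008,
                            0.016, 0.064, 0.144, 0.576]))

    print(enumerate_query(joint, variables, "Cavity", {"Toothache": True}))
    # -> {True: 0.6, False: 0.4} (up to float rounding)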
Independence
A and B are independent iff
P(A | B) = P(A) or P(B | A) = P(B) or P(A, B) = P(A) P(B)
[Figure: the model over Cavity, Toothache, Catch, Weather decomposes into Cavity/Toothache/Catch plus a separate Weather node]
P(Toothache, Catch, Cavity, Weather) = P(Toothache, Catch, Cavity) P(Weather)
32 entries reduced to 12; for n independent biased coins, 2ⁿ → n
Absolute independence is powerful but rare.
Dentistry is a large field with hundreds of variables, none of which are independent. What to do?
Conditional Independence
P(Toothache, Cavity, Catch) has 2³ − 1 = 7 independent entries.
If I have a cavity, the probability that the probe catches in it doesn't depend on whether I have a toothache:
(1) P(catch | toothache, cavity) = P(catch | cavity)
The same independence holds if I haven’t got a cavity:
(2)
P(catch | toothache, ¬cavity) = P(catch | ¬cavity)
Catch is conditionally independent of Toothache given Cavity:
P(Catch | Toothache, Cavity) = P(Catch | Cavity)
Equivalent statements:
P(Toothache | Catch, Cavity) = P(Toothache | Cavity)
P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity)
Conditional Independence 2
Write out the full joint distribution using the chain rule:
P(Toothache, Catch, Cavity)
= P(Toothache | Catch, Cavity) P(Catch, Cavity)
= P(Toothache | Catch, Cavity) P(Catch | Cavity) P(Cavity)
= P(Toothache | Cavity) P(Catch | Cavity) P(Cavity)
I.e., 2 + 2 + 1 = 5 independent numbers
In most cases, the use of conditional independence reduces the size of the representation of the joint distribution from exponential in n to linear in n.
Conditional independence is our most basic and robust
form of knowledge about uncertain environments.
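Both independence statements, and the five-number factorization, can be verified against the joint table used earlier (a Python sketch of my own):

    # Joint table, indexed by (cavity, toothache, catch).
    joint = {
        (True,  True,  True):  0.108, (True,  True,  False): 0.012,
        (True,  False, True):  0.072, (True,  False, False): 0.008,
        (False, True,  True):  0.016, (False, True,  False): 0.064,
        (False, False, True):  0.144, (False, False, False): 0.576,
    }

    def p(pred):
        return sum(v for k, v in joint.items() if pred(*k))

    def cond(pred_a, pred_b):
        # P(a | b) = P(a and b) / P(b)
        return p(lambda *k: pred_a(*k) and pred_b(*k)) / p(pred_b)

    # (1): P(catch | toothache, cavity) = P(catch | cavity) -> 0.9 both
    print(cond(lambda c, t, k: k, lambda c, t, k: t and c),
          cond(lambda c, t, k: k, lambda c, t, k: c))
    # (2): P(catch | toothache, ~cavity) = P(catch | ~cavity) -> 0.2 both
    print(cond(lambda c, t, k: k, lambda c, t, k: t and not c),
          cond(lambda c, t, k: k, lambda c, t, k: not c))

    # Factorization: every joint entry equals P(t|c) P(k|c) P(c).
    for (c, t, k), v in joint.items():
        f = (cond(lambda c2, t2, k2: t2 == t, lambda c2, t2, k2: c2 == c)
             * cond(lambda c2, t2, k2: k2 == k, lambda c2, t2, k2: c2 == c)
             * p(lambda c2, t2, k2: c2 == c))
        assert abs(f - v) < 1e-9
    print("factorization holds for all 8 entries")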
Bayes' Rule
Product rule P(a ∧ b) = P(a | b) P(b) = P(b | a) P(a)
⇒ Bayes' rule: P(a | b) = P(b | a) P(a) / P(b)
or in distribution form
P(Y | X) = P(X | Y) P(Y) / P(X) = α P(X | Y) P(Y)
Useful for assessing diagnostic probability from causal
probability:
P(Cause | Effect) = P(Effect | Cause) P(Cause) / P(Effect)
E.g., let M be meningitis, S be stiff neck:
P(m | s) = P(s | m) P(m) / P(s) = (0.8 × 0.0001) / 0.1 = 0.0008
Note: posterior probability of meningitis still very small!
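The arithmetic, as a one-screen sketch:

    p_s_given_m = 0.8      # P(s | m): causal direction
    p_m = 0.0001           # prior P(m)
    p_s = 0.1              # prior P(s)

    p_m_given_s = p_s_given_m * p_m / p_s    # Bayes' rule, diagnostic direction
    print(p_m_given_s)     # 0.0008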
Bayes’ Rule - Cond. Independence
P(Cavity | toothache ∧ catch)
= α P(toothache ∧ catch | Cavity) P(Cavity)
= α P(toothache | Cavity) P(catch | Cavity) P(Cavity)
This is an example of a naive Bayes model:
P(Cause, Effect_1, …, Effect_n) = P(Cause) ∏_i P(Effect_i | Cause)
[Figure: Cavity with children Toothache and Catch; Cause with children Effect_1 … Effect_n]
The total number of parameters is linear in n.
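A sketch of the naive Bayes computation, with the conditionals implied by the earlier joint table (P(cavity) = 0.2, P(toothache | Cavity) = ⟨0.6, 0.1⟩, P(catch | Cavity) = ⟨0.9, 0.2⟩):

    # Naive Bayes: P(Cavity | toothache, catch)
    #   = alpha * P(toothache | Cavity) P(catch | Cavity) P(Cavity).
    p_cavity = {True: 0.2, False: 0.8}
    p_toothache_given = {True: 0.6, False: 0.1}   # P(toothache | Cavity=v)
    p_catch_given = {True: 0.9, False: 0.2}       # P(catch | Cavity=v)

    unnorm = {v: p_toothache_given[v] * p_catch_given[v] * p_cavity[v]
              for v in (True, False)}
    alpha = 1.0 / sum(unnorm.values())
    posterior = {v: alpha * p for v, p in unnorm.items()}
    print(posterior)   # {True: ~0.871, False: ~0.129}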
Wumpus World
[Figure: 4 × 4 wumpus grid; squares [1,1], [1,2], [2,1] visited and pit-free, with breezes observed in [1,2] and [2,1]]
P_{i,j} = true iff [i, j] contains a pit
B_{i,j} = true iff [i, j] is breezy
Include only B_{1,1}, B_{1,2}, B_{2,1} in the probability model
Specifying the Probability Model
The full joint distribution is P(P_{1,1}, …, P_{4,4}, B_{1,1}, B_{1,2}, B_{2,1})
Apply the product rule: P(B_{1,1}, B_{1,2}, B_{2,1} | P_{1,1}, …, P_{4,4}) P(P_{1,1}, …, P_{4,4})
(Do it this way to get P(Effect | Cause).)
First term: 1 if pits are adjacent to breezes, 0 otherwise
Second term: pits are placed randomly, probability 0.2 per square:
P(P_{1,1}, …, P_{4,4}) = ∏_{i,j} P(P_{i,j}) = 0.2ⁿ × 0.8^{16−n} for n pits
Observations and Query
We know the following facts:
b = ¬b_{1,1} ∧ b_{1,2} ∧ b_{2,1}
known = ¬p_{1,1} ∧ ¬p_{1,2} ∧ ¬p_{2,1}
Query is P(P_{1,3} | known, b)
Define Unknown = the P_{i,j}s other than P_{1,3} and Known
For inference by enumeration, we have
P(P_{1,3} | known, b) = α Σ_{unknown} P(P_{1,3}, unknown, known, b)
Grows exponentially with the number of squares!
Using Conditional Independence
Basic insight: observations are conditionally independent of other hidden squares given neighbouring hidden squares.
[Figure: the grid partitioned into the query square P_{1,3}, the Fringe squares adjacent to the known region, the Other squares, and the Known squares]
Define Unknown = Fringe ∪ Other
P(b | P_{1,3}, Known, Unknown) = P(b | P_{1,3}, Known, Fringe)
Manipulate the query into a form where we can use this!
Using Conditional Independence 2
P(P_{1,3} | known, b) = α Σ_{unknown} P(P_{1,3}, unknown, known, b)
= α Σ_{unknown} P(b | P_{1,3}, known, unknown) P(P_{1,3}, known, unknown)
= α Σ_{fringe} Σ_{other} P(b | known, P_{1,3}, fringe, other) P(P_{1,3}, known, fringe, other)
= α Σ_{fringe} Σ_{other} P(b | known, P_{1,3}, fringe) P(P_{1,3}, known, fringe, other)
= α Σ_{fringe} P(b | known, P_{1,3}, fringe) Σ_{other} P(P_{1,3}, known, fringe, other)
= α Σ_{fringe} P(b | known, P_{1,3}, fringe) Σ_{other} P(P_{1,3}) P(known) P(fringe) P(other)
= α P(known) P(P_{1,3}) Σ_{fringe} P(b | known, P_{1,3}, fringe) P(fringe) Σ_{other} P(other)
= α′ P(P_{1,3}) Σ_{fringe} P(b | known, P_{1,3}, fringe) P(fringe)
Using Conditional Independence 3
[Figure: the five pit configurations on the fringe squares [2,2] and [3,1] consistent with the observed breezes, three with P_{1,3} = true and two with P_{1,3} = false]
0.2 × 0.2 = 0.04   0.2 × 0.8 = 0.16   0.8 × 0.2 = 0.16   0.2 × 0.2 = 0.04   0.2 × 0.8 = 0.16
P(P_{1,3} | known, b) = α′ ⟨0.2 (0.04 + 0.16 + 0.16), 0.8 (0.04 + 0.16)⟩ ≈ ⟨0.31, 0.69⟩
P(P_{2,2} | known, b) ≈ ⟨0.86, 0.14⟩
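These posteriors can also be reproduced by brute-force enumeration over the 13 unknown squares. The sketch below is my own (the grid, the observed breezes, and the 0.2 pit prior follow the slides); it recovers ⟨0.31, 0.69⟩ and ⟨0.86, 0.14⟩.

    import itertools

    SQUARES = [(x, y) for x in range(1, 5) for y in range(1, 5)]
    KNOWN_NO_PIT = {(1, 1), (1, 2), (2, 1)}               # visited, pit-free
    BREEZE = {(1, 1): False, (1, 2): True, (2, 1): True}  # observed breezes b
    UNKNOWN = [s for s in SQUARES if s not in KNOWN_NO_PIT]

    def adjacent(a, b):
        return abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1

    def consistent(pits):
        # First term of the model: 1 if breezes match pit adjacency, else 0.
        return all(any(adjacent(sq, p) for p in pits) == breezy
                   for sq, breezy in BREEZE.items())

    def prior(pits):
        # Second term: pits placed independently with probability 0.2.
        n = len(pits)
        return 0.2 ** n * 0.8 ** (len(SQUARES) - n)

    def query(square):
        unnorm = {True: 0.0, False: 0.0}
        for bits in itertools.product([True, False], repeat=len(UNKNOWN)):
            pits = {s for s, has in zip(UNKNOWN, bits) if has}
            if consistent(pits):
                unnorm[square in pits] += prior(pits)
        alpha = 1.0 / sum(unnorm.values())
        return {v: round(alpha * p, 2) for v, p in unnorm.items()}

    print(query((1, 3)))   # {True: 0.31, False: 0.69}
    print(query((2, 2)))   # {True: 0.86, False: 0.14}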