Decisions
Chapter 16
Topics
Decision making under uncertainty
Expected utility
Utility theory and rationality
Utility functions
Multi-attribute utility functions
Preference structures
Decision networks
Value of information
Expected Utility
Random variable X with n values x1, …, xn and distribution (p1, …, pn)
X is the outcome of performing action A (i.e., the state reached
after A is taken)
Function U of X
U is a mapping from states to numerical utilities (values)
The expected utility of action A is then EU(A) = Σ_i p_i · U(x_i)
[Figures: one-shot decisions drawn as trees.
Action A1 leads to s1 (p = 0.2, U = 100), s2 (p = 0.7, U = 50), s3 (p = 0.1, U = 70):
EU(A1) = 0.2*100 + 0.7*50 + 0.1*70 = 62
Action A2 leads to s2 (p = 0.2, U = 50), s4 (p = 0.8, U = 80):
EU(A2) = 0.2*50 + 0.8*80 = 74
So A2 has the higher expected utility and is the better choice.]
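A minimal sketch of the computation behind those numbers; the (probability, utility) pairs simply transcribe the two figures, and the function name is mine.

```python
# Expected utility of an action: sum over its outcome states of
# P(state | action) * U(state).
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

A1 = [(0.2, 100), (0.7, 50), (0.1, 70)]   # outcomes s1, s2, s3
A2 = [(0.2, 50), (0.8, 80)]               # outcomes s2, s4
print(expected_utility(A1))  # 62.0
print(expected_utility(A2))  # 74.0
```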
MEU Principle
Decision theory: A rational agent should choose the action that maximizes the agent's expected utility
Maximizing expected utility (MEU) is a normative criterion
for rational choices of actions
Must have complete model of:
Actions
Utilities
States
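With such a model, the MEU choice reduces to an argmax over per-action expected utilities. A minimal sketch reusing the A1/A2 outcome distributions from the earlier figures (names are mine):

```python
# MEU principle: pick the action whose expected utility is largest,
# given a model mapping each action to its outcome distribution.
def meu_action(model):
    """model: dict of action -> list of (probability, utility) pairs."""
    return max(model, key=lambda a: sum(p * u for p, u in model[a]))

model = {
    "A1": [(0.2, 100), (0.7, 50), (0.1, 70)],  # EU = 62
    "A2": [(0.2, 50), (0.8, 80)],              # EU = 74
}
print(meu_action(model))  # A2
```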
Decision networks
Extend Bayesian nets to handle actions and utilities
a.k.a. influence diagrams
[Figures: "Airport example" and "Airport example II" — example decision networks (diagrams not reproduced).]
Umbrella example
[Figure: decision network with chance node Weather, decision node Umbrella, deterministic node Lug umbrella, chance node Forecast, and utility node Happiness.]
Weather: P(rain) = 0.4
Umbrella: decision node (take / ~take)
Lug umbrella: P(lug | take) = 1.0, P(~lug | ~take) = 1.0
Happiness (utility):
U(lug, rain) = -25
U(lug, ~rain) = 0
U(~lug, rain) = -100
U(~lug, ~rain) = 100
Forecast CPT, P(f | w):
f      w        P(f | w)
sunny  rain     0.3
rainy  rain     0.7
sunny  no rain  0.8
rainy  no rain  0.2
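A minimal sketch of evaluating this network by enumeration, assuming the numbers above; the dictionaries and helper names are my own.

```python
# Evaluate the umbrella network: for each possible decision, average the
# Happiness utility over Weather, then keep the best decision.
P_RAIN = 0.4
P_F_GIVEN_W = {("rainy", "rain"): 0.7, ("sunny", "rain"): 0.3,
               ("rainy", "no rain"): 0.2, ("sunny", "no rain"): 0.8}
UTILITY = {("lug", "rain"): -25, ("lug", "no rain"): 0,
           ("no lug", "rain"): -100, ("no lug", "no rain"): 100}

def eu(action, p_rain):
    """Expected Happiness of lugging / not lugging the umbrella."""
    return p_rain * UTILITY[(action, "rain")] + (1 - p_rain) * UTILITY[(action, "no rain")]

def best(p_rain):
    """(expected utility, action) of the best decision given a belief P(rain)."""
    return max((eu(a, p_rain), a) for a in ("lug", "no lug"))

def p_rain_given(forecast):
    """Posterior P(rain | forecast) from Bayes' rule and the Forecast CPT."""
    joint_rain = P_F_GIVEN_W[(forecast, "rain")] * P_RAIN
    joint_dry = P_F_GIVEN_W[(forecast, "no rain")] * (1 - P_RAIN)
    return joint_rain / (joint_rain + joint_dry)

print(best(P_RAIN))                 # (20.0, 'no lug'): with no forecast, leave it home
print(best(p_rain_given("rainy")))  # ~(-17.5, 'lug'): take it if the forecast says rain
print(best(p_rain_given("sunny")))  # ~(60.0, 'no lug'): leave it if the forecast is sunny
```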
The value of the new best action after observing the value of X is:
EU(α_X | E, X) = max_A Σ_i U(Result_i(A)) · P(Result_i(A) | E, X, Do(A))
But we don't know the value of X yet, so we have to sum over its possible values
The value of perfect information for X is therefore:
VPI(X) = ( Σ_k P(xk | E) · EU(α_xk | xk, E) ) − EU(α | E)
where
P(xk | E) is the probability of each value of X,
EU(α_xk | xk, E) is the expected utility of the best action given that value of X, and
EU(α | E) is the expected utility of the best action if we don't know X (i.e., currently).
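As a concrete check, plugging the umbrella-network numbers from the sketch above into this formula gives the value of observing the Forecast:

```python
# VPI(Forecast) for the umbrella network, using values from the sketch above:
#   P(rainy forecast) = 0.7*0.4 + 0.2*0.6 = 0.4,  P(sunny forecast) = 0.6
#   EU(best action | rainy) = -17.5,  EU(best action | sunny) = 60
#   EU(best action with no forecast) = 20
vpi_forecast = 0.4 * (-17.5) + 0.6 * 60 - 20
print(vpi_forecast)  # 9.0: seeing the forecast is worth 9 units of utility
```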
An information-gathering agent requests the evidence variable Ej with the highest VPI(Ej) (or, when observations have costs, the highest VPI(Ej) − Cost(Ej)).
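A rough sketch of that greedy, myopic selection loop; vpi, cost, and observe are hypothetical callbacks standing in for problem-specific routines, not functions defined in these slides.

```python
# Greedy (myopic) information gathering: keep requesting the observation
# whose value of information most exceeds its cost, then stop.
def gather_information(candidates, evidence, vpi, cost, observe):
    """candidates: observable variables; evidence: dict of values already observed.
    vpi(e, evidence), cost(e), observe(e) are hypothetical problem-specific callbacks."""
    while candidates:
        e = max(candidates, key=lambda c: vpi(c, evidence) - cost(c))
        if vpi(e, evidence) - cost(e) <= 0:
            break                      # no remaining observation is worth its cost
        evidence[e] = observe(e)       # request / measure that variable
        candidates.remove(e)
    return evidence
```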