Sie sind auf Seite 1von 18

Dynamic Programming

Richard de Neufville
Professor of Engineering Systems
and of
Civil and Environmental Engineering
MIT

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 1 of 36

Objective
z To develop formally “dynamic programming”,
the method used in lattice valuation of options
— Assumptions: Separability and Monotonicity
— Its optimization procedure: “Implicit Enumeration”

z To show its wide applicability


— Options analysis
— Sequential problems: routing and logistics;

inventory plans; replacement policies; reliability


— Non-sequential problems: investments

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 2 of 36

Page 1
Outline
1. Why in this course?
2. Basic concept: Implicit Enumeration
— Motivational Example
3. Key Assumptions
— Independence (Separability) and Monotonicity
4. Mathematics
— Recurrence Formulas
5. Example
6. Types of Problems DP can solve
7. Summary

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 3 of 36

Why in this course?


z DP is used to find optimum exercise of
options in lattice, that has non-convex
feasible region (see comments in that
presentation)

z This presentation gives general method,


so you understand it at deeper level

z DP used in lattice is simple version


— only 2 states considered compared at any time

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 4 of 36

Page 2
Motivational Example

Consider Possible Investments in 3 Projects


PR OJEC T 1 PROJECT 2 PR O JEC T 3

7 7
7

6 6
6

5 5
5

4 4 4

3 3 3

2 2 2

1 1
1

0
0
0
0 0.5 1 1.5 2 2.5 3 3.5 0 1 2 3 4
0 0.5 1 1.5 2 2.5 3 3.5
I N V E ST M E N T I N VES TM EN T I N V E ST M E N T

What is best investment of 1st unit? P3 Î +3


Of 2nd? 3rd? P1 or P3 Î +2, +2 Total = 7
Engineering Systems Analysis for Design Richard de Neufville ©
Massachusetts Institute of Technology Dynamic Programming Slide 5 of 36

Motivational Example: Best Solution


PR OJEC T 1 PROJECT 2 PR O JEC T 3

7 7
7

6 6
6

5 5
5

4 4 4

3 3 3

2 2 2

1 1
1

0
0
0
0 0.5 1 1.5 2 2.5 3 3.5 0 1 2 3 4
0 0.5 1 1.5 2 2.5 3 3.5
I N V E ST M E N T I N VES TM EN T I N V E ST M E N T

Optimum Allocation is Actually (0, 2, 1) Î 8


Marginal Analysis misses this….
because Feasible Region is not convex
Engineering Systems Analysis for Design Richard de Neufville ©
Massachusetts Institute of Technology Dynamic Programming Slide 6 of 36

Page 3
Point of Example
z Non-convexity of feasible region can “hide”
optimum solution
z Marginal analysis ,“hill climbing” methods
to search for optimum not appropriate in
these cases (for lattice models in particular)

z We need to search entire space of


possibilities
z This is what “Dynamic Programming” does
to define optimum solution
Engineering Systems Analysis for Design Richard de Neufville ©
Massachusetts Institute of Technology Dynamic Programming Slide 7 of 36

Semantic Note
z “Dynamic” Programming so named because
— Originally associated with movements through
time and space (e.g., aircraft gaining altitude,
thus “dynamic”)
— “programming” by analogy to “linear

programming” and other forms of optimization

z This approach can be used in many cases


that are not in fact “dynamic”

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 8 of 36

Page 4
Basic Solution Strategy
z Enumeration is basic concept
— This means evaluating “all” the possibilities
— Checking “all” possibilities, we must find best

z No assumptions about regularity of


Objective Function
z Means that DP can optimize over
— Non-Convex Feasible Regions
— Discontinuous, Integer Functions

— Which other optimization techniques cannot do

z HOWEVER…
Engineering Systems Analysis for Design Richard de Neufville ©
Massachusetts Institute of Technology Dynamic Programming Slide 9 of 36

Curse of Dimensionality
z Number of Possible Designs very large
z Example: a simple development of 2 sites,
for 4 sizes of operations over 3 periods
— Number of Combinations in 1 period = 42 = 16
3
— Possibilities over 3 periods = 16 = 4096

z In general, size of design is exponential


= [ (Size)locations] periods
— Actual enumeration is impractical

z In lattice model…. See next page

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 10 of 36

Page 5
The Curse -- in lattice model
OUTCOME LATTICE

100.00 120.00 144.00 172.80 207.36 248.83 298.60


80.00 96.00 115.20 138.24 165.89 199.07
64.00 76.80 92.16 110.59 132.71
51.20 61.44 73.73 88.47
40.96 49.15 58.98
32.77 39.32
26.21

z End states = N
z Total States ~ Order of only N2 /2
z Number of paths ~ Order of 2N…
— To reach each state at last stage =
1 + 6 +13 +16 + 13 + 6 +1 = 46 paths

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 11 of 36

Implicit Enumeration
z IE considers all possibilities in principle
z Exploits features of problem to
— Identify classes of dominated possibilities
— Reject these classes

— Vastly reduce dimensionality of enumeration

z Size of numeration for DP


— Order of [(Size) Locations] Periods
— Multiplicative size, not exponential

— This analysis computationally practical

z Examples will illustrate what this means


Engineering Systems Analysis for Design Richard de Neufville ©
Massachusetts Institute of Technology Dynamic Programming Slide 12 of 36

Page 6
Demonstration of I E

z Select a “dynamic” problem – logistic


movement from Seattle to Washington DC
z Suppose that
— there are 4 days to take trip…
— Can go through several cities

— There is a cost for the movement between any city

and possible city in next stage


z What is the minimum cost route?

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 13 of 36

Possible routes through a node


z Many routes, with link costs as in diagram
z Consider Omaha
— 3 routes to get there, as shown
— 3 routes from there => 9 routes via Omaha

Seattle 100 Boise 500 Fargo Detroit

200 250

Salt Lake 200 Omaha Memphis DC

300 150

Phoenix 200 Houston Atlanta

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 14 of 36

Page 7
Instead of Costing all Routes… IE
z We find best cost to Omaha (350 via Boise)
z Salt Lake (400), Phoenix (450) routes dominated,
“pruned” – we drop routes with those segments
z Thus don’t look at all Seattle to DC routes
Seattle 100 Boise 500 Fargo Detroit

200 250

Salt Lake 200 Omaha Memphis DC

300 150

Phoenix 200 Houston Atlanta

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 15 of 36

Result: Fewer Combinations


z Total Routes:
= 3 to Omaha + 3 after = 6 = 3 x 2, not 9 = 32
z Savings not dramatic here, illustrate idea

Seattle 100 Boise 500 Fargo Detroit

200 250

Salt Lake 200 Omaha Memphis DC

300 150

Phoenix 200 Houston Atlanta

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 16 of 36

Page 8
Some nomenclature definitions
z In Dynamic Programming:
z The Objective Function is G(X)
z Where X = (X1,…. XN ) is the set of states at
each of N “stages” --
z Selection of all Xi defines optimum policy
z giXi are the “return functions” that provide
the effect of an Xi state at the ith stage
z giXi denotes the functional form; Xi the
different states at the ith stage
z So that G(X) = [g1X1,…. gNXN]
Engineering Systems Analysis for Design Richard de Neufville ©
Massachusetts Institute of Technology Dynamic Programming Slide 17 of 36

Useful Concepts -- Stages

z Each giXi “return function” is associated


with a “stage” in the problem
— Example: 1st Stage is from Seattle to Boise, etc
— Thus g1X1 are costs from Seattle to Boise, etc

z “Stages” may
— have a physical meaning, as in example,
— or be conceptual (as the investments in later

example) where a “stage” represents the next


project or “knob” for system that we address

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 18 of 36

Page 9
Useful Concepts -- States
z Each giXi takes on different “states”, that is,
possible situations at a stage
z Examples:
— For cross-country shipment, there are 3 states
(of system, not as “states” of USA) for 1st stage,
Boise, Salt Lake and Phoenix
— For plane accelerating to altitude, a state might

be defined by (speed, altitude) vector


— For investments, states might be $ invested

— If stage is “knob” we manipulate on system,

“state” is the setting of the knob

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 19 of 36

Stages and States


z “Stages” are associated with each move along trip
z Stage 1 consists of set of endpoints Boise, Salt Lake and
Phoenix, Stage 2 the set of Fargo, Omaha and Houston; etc.
z “States” are possibilities in each Stage: Boise, Salt Lake, etc...

Seattle 100 Boise 500 Fargo Detroit

200 250

Salt Lake 200 Omaha Memphis DC

300 150

Phoenix 200 Houston Atlanta

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 20 of 36

Page 10
Solution depends on Decomposition
z Must be able to “decompose” objective
function G(X) into functions of individual
“stages” Xi:
G(X) = [g1X1,…. gNXN]
— Example: cost of Seattle to DC trip can be

decomposed into cost of 4 segments of which


Seattle to Boise, Salt Lake or Phoenix is first
z Necessary conditions for decomposition
— Separability
— Monotonicity

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 21 of 36

Separability
z Objective Function is separable if all giXi are
independent of gJXJ for all J not equal to I

z In example, it is reasonable to assume that


the cost of driving between any pair of cities
is not affected by that between another pair
z However, not always so…

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 22 of 36

Page 11
Monotonicity
z Objective Function is monotonic if:
improvements in each giXi lead to
improvements in Objective Function, that is if
z given G(X) = [giXi , G’(X’) ] where X = [Xi , X’]
z for all giX’i > giX”i where X’i ,X”i different Xi
● It is true that [giX’i , G’(X) ] > [giX”i , G’(X) ]

z Additive functions always monotonic


z Multiplicative functions monotonic only if giXi
are non-negative, real
Engineering Systems Analysis for Design Richard de Neufville ©
Massachusetts Institute of Technology Dynamic Programming Slide 23 of 36

Solution Strategy
z Two Steps
— Partial optimization at each stage
— Repetition of process for all stages

z This is the process used to value options


through the lattice
— At each stage (period), for each state
(possible outcome for system)
— The process chooses better of exercising

option or not

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 24 of 36

Page 12
Cumulative Return Function
z Result of Optimization at each stage and state
is the “cumulative return function”

z fS (K) denotes best value for being in state K,


having passed through previous S stages
z Example: f2 (Omaha) = 350

z Defined in terms of best over previous stages

and return functions for this stage:


fS (K) = Max or Min of [giXi , fS-1 (K) ]
(note: K understood to be a variable)
Engineering Systems Analysis for Design Richard de Neufville ©
Massachusetts Institute of Technology Dynamic Programming Slide 25 of 36

Mathematics: Recurrence formulas


z Transition from one stage to next is via a
“recurrence formula” (or equivalent analysis)
z Formally, we seek the best we can obtain to
any specified level K*, by finding the best
combination of possible giXi and fS-1 (K)
z This is what we did for the valuation of
options in the lattice:
= NPV[r, Max[EV(mine open), cost of closing]]
z Where value of “mine open” is immediate
value (giXi) and later stages, fS-1 (K)
Engineering Systems Analysis for Design Richard de Neufville ©
Massachusetts Institute of Technology Dynamic Programming Slide 26 of 36

Page 13
Application of Recurrence formulas

z For Example: Consider the Maximization


investments in independent projects
— Each project is a “stage”
— Amount of Investment in each is its “state”

— Objective Function Is Additive:

Value = Σ (value each project)


— Recurrence formula: fi (K) = Max[giXi + fi-1 (K- Xi ) ]

— that is: optimum for investing K over “i” stages

= maximum of all combinations of investing level Xi


in stage “i” and (K- Xi ) in previous stages

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 27 of 36

Application to Investment Example


z 3 Projects, 4 Investment levels (0, 1, 2, 3)
z Objective: Maximum for investing 3 units
z Stages = projects ; States = investment levels
7
6
gi(Xi) return
5 function I=1
4 gi(Xi) return
3 function I =2
2 gi(Xi) return
function I=3
1
0

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 28 of 36

Page 14
Dynamic Programming Analysis (1)
z At 1st stage the cumulative return function
identically equals return for X1
z That is, f1(X1 ), the best way to allocate
resource over only one stage ≡ g1X1
— There is no other choice

z So f1 (0) = 0
z f1(1) = 2 ; f1 (2) = 4 ; f1 (3) = 6

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 29 of 36

Dynamic Programming Analysis (2)


z At 2nd stage, best way to spend:
0 : is 0 on both 1st and 2nd stage (= 0) = f2 (0)
1 : either: 0 on 1st and 1 on 2nd stage (= 1)
or: 1 on 1st and 0 on 2nd stage (= 2) BEST = f2 (1)
2 : 2 on 1st , and 0 on 2nd stage (= 4)
1 on 1st, and 1 on 2nd stage (= 3)
0 on 1st, and 2 on 2nd stage (= 5) BEST = f2 (2)
3: 4 Choices, Best allocation is (1,2) Î 7 = f2 (3)

These results, and the corresponding allocations,


shown on next figures…
Engineering Systems Analysis for Design Richard de Neufville ©
Massachusetts Institute of Technology Dynamic Programming Slide 30 of 36

Page 15
Dynamic Programming Analysis (3)
z LH Column: 0 in no Project f0(0) = 0
z 2nd Column: 0…4 in 1st project, e.g.: f1(2) = 4
A B F
f 0 (0 )= 0 f 1 (0 )= 0 f 2 (0 )= 0

C G
f 1 (1 )= 2 f 2 (1 )= 2

D H
f 1 (2 )= 4 f 2 (2 )= 5

E I M
f 1 (3 )= 6 f 2 (3 )= 7 f 3 (3 )= 8

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 31 of 36

Dynamic Programming Analysis (4)


z For 3rd stage (all 3 projects) we want optimum
allocation of all 4 units: Î (0,2,1) Î f3(4) = 8
A B F
f 0 (0 )= 0 f 1 (0 )= 0 f 2 (0 )= 0

C G
f 1 (1 )= 2 f 2 (1 )= 2

D H
f 1 (2 )= 4 f 2 (2 )= 5

E I M
f 1 (3 )= 6 f 2 (3 )= 7 f 3 (3 )= 8

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 32 of 36

Page 16
Contrast DP and Marginal Analysis
z Marginal Analysis:
— reduces calculation burden by only looking at
“best” slopes towards goal, discards others
— Misses opportunities to take losses for later gains

— approach Î 7

z Dynamic Programming :
— Looks at “all” possible positions
— But cuts out combinations that are dominated

— Using independence return functions (value from

a state does not depend on what happened before)

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 33 of 36

Classes of Problems suitable for DP


z Sequential, “Dynamic” Problems
-- aircraft flight paths to maximize speed, altitude
-- movement across territory (example used)
z Schedule, Inventory (Management over time)
z Reliability -- Multiplicative example, see text
z Options analysis!

z Non-Sequential: Investment Maximizations


— Nothing Dynamic. Key is separability of projects

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 34 of 36

Page 17
Formulation Issues
z No standard (“canonical”) form
z Careful formulations required (see text)
z DP assumes discrete states
— thus easily handles integers, discontinuity
— in practice does not handle continuous variables

z DP handles constraints in formulation


— Thus certain paths not defined or allowed
z Sensitivity analysis is not automatic

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 35 of 36

Dynamic Programming Summary

z The method used to deal with lattices


z Solution by implicit enumeration
z Approach requires
— separability, monotonicity
z Careful formulation needed
z Useful for wide range of issues
-- in particular for options analyses!

Engineering Systems Analysis for Design Richard de Neufville ©


Massachusetts Institute of Technology Dynamic Programming Slide 36 of 36

Page 18