In the last chapter, we saw that greedy algorithms are efficient solutions to certain optimization problems. However, there are optimization problems for which no greedy algorithm exists. In this chapter, we will examine a more general technique, known as dynamic programming, for solving optimization problems.

Dynamic programming is a technique of implementing a top-down solution using bottom-up computation. We have already seen several examples of how top-down solutions can be implemented bottom-up. Dynamic programming extends this idea by saving the results of many subproblems in order to solve the desired problem. As a result, dynamic programming algorithms tend to be more costly, in terms of both time and space, than greedy algorithms. On the other hand, they are often much more efficient than straightforward recursive implementations of the top-down solution. Thus, when greedy algorithms are not possible, dynamic programming algorithms are often the most appropriate.
CHAPTER 12. OPTIMIZATION II: DYNAMIC PROGRAMMING 393
Consider the problem of making change: given coin denominations d1 < d2 < · · · < dk, where d1 = 1, and a target value n, we wish to find a minimum-sized collection of coins with total value exactly n. An obvious greedy strategy is to choose at each step the largest coin that does not cause the total to exceed n. For some sets of coin denominations, this strategy will result in the minimum number of coins for any n. However, suppose n = 30, d1 = 1, d2 = 10, and d3 = 25. The greedy strategy first takes 25. At this point, the only denomination that does not cause the total to exceed n is 1. The greedy strategy therefore gives a total of six coins: one 25 and five 1s. This solution is not optimal, however, as we can produce 30 with three 10s.
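This behavior can be checked with a short Python sketch. The function names below are illustrative, not from the text; the second function tabulates a best collection for every total from 1 to n, previewing the bottom-up style this chapter develops.

```python
def greedy_change(n, denominations):
    """Repeatedly take the largest coin that does not overshoot n."""
    coins = []
    for d in sorted(denominations, reverse=True):
        while n >= d:
            n -= d
            coins.append(d)
    return coins

def fewest_coins(n, denominations):
    """Minimum-sized collection of coins totaling exactly n,
    tabulated bottom-up (assumes 1 is among the denominations)."""
    best = {0: []}
    for total in range(1, n + 1):
        candidates = [best[total - d] + [d] for d in denominations if d <= total]
        best[total] = min(candidates, key=len)
    return best[n]

print(greedy_change(30, [1, 10, 25]))  # → [25, 1, 1, 1, 1, 1] (six coins)
print(fewest_coins(30, [1, 10, 25]))   # → [10, 10, 10] (three coins)
```

On this instance the greedy strategy returns six coins while three suffice, matching the discussion above.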
Let us consider a more direct top-down solution. If k = 1, then dk = 1,
so the only solution contains n coins. Otherwise, if dk > n, we can reduce
the size of the problem by removing dk from the set of denominations, and
the solution to the resulting problem is the solution to the original problem.
Finally, suppose dk ≤ n. There are now two possibilities: the optimal solution either contains dk or it does not. (These two possibilities are not exclusive: there could be one optimal solution that contains dk and another that does not.) In what follows, we consider these two cases separately.

Let us first consider the case in which the optimal solution does not contain dk. In this case, we do not change the optimal solution if we remove dk from the set of denominations. We therefore have reduced the original problem to a smaller problem instance.
Now suppose the optimal solution contains dk . Suppose we remove one
dk coin from this optimal solution. What remains is an optimal solution
to the instance with the same set of denominations and a target value of
n − dk . Now working in the other direction, if we have the optimal solution
to the smaller instance, we can obtain an optimal solution to the original
instance by adding a dk coin. Again, we have reduced the original problem
to a smaller problem instance.
To summarize, when dk ≤ n, the optimal solution can be obtained from
the optimal solution to one of two smaller problem instances. We have no
way of knowing in advance which of these smaller instances is the right
one; however, if we obtain both of them, we can compare the two resulting
candidate solutions. The one with fewer coins is the optimal solution. In
fact, if we could quickly determine which of these smaller instances would
yield fewer coins, we could use this test as the selection criterion for a greedy
algorithm. Therefore, let us focus for now on the more difficult aspect of
this problem — that of determining the minimum number of coins in an
optimal solution.
Based on the above discussion, the following recurrence gives the minimum number of coins needed to obtain a value of n from the denominations d1, . . . , dk:

           ⎧ n                                     if k = 1
C(n, k) =  ⎨ C(n, k − 1)                           if k > 1, dk > n        (12.1)
           ⎩ min(C(n, k − 1), C(n − dk, k) + 1)    if k > 1, n ≥ dk.
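Recurrence (12.1) can be transcribed almost verbatim into code. The following Python sketch (the function name and 0-indexed array are my own conventions, not from the text) evaluates C(n, k) by direct recursion; as discussed below, this direct implementation repeats work and is inefficient for larger inputs.

```python
def C(n, k, d):
    """Minimum number of coins totaling n using the first k denominations
    of d, where d is sorted increasing and d[0] == 1 (d[i-1] is the
    text's d_i)."""
    if k == 1:
        return n                                # only 1-cent coins: n of them
    if d[k - 1] > n:
        return C(n, k - 1, d)                   # d_k is too large to use
    return min(C(n, k - 1, d),                  # optimal solution omits d_k
               C(n - d[k - 1], k, d) + 1)       # optimal solution uses a d_k coin

print(C(30, 3, [1, 10, 25]))  # → 3
```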
n − k ≥ k² − k
      = k(k − 1)
      ≥ (k − 1)²

2. C(n, k) also requires C(n − k, k), which requires C(n − k, k − 1), which requires C(n − k, k − 2), which requires C(n − 2k + 2, k − 2).
By saving the results of the subproblems and computing them bottom-up, we can compute each C(i, j) in constant time. All (n + 1)k of these values can therefore be computed in Θ(nk) time. Once all of these values have been computed, the optimal collection of coins can be constructed in a greedy fashion, as suggested above. The algorithm is shown in Figure 12.1. This algorithm is easily seen to use Θ(nk) time and space.
A characteristic of this problem that is essential in order for the dynamic programming approach to work is that it is possible to decompose a large problem instance into smaller problem instances in a way that optimal solutions to the smaller instances can be used to produce an optimal solution to the larger instance. This is, of course, one of the main principles of the top-down approach. However, this characteristic may be stated succinctly for optimization problems: for any optimal solution, any portion of that solution is itself an optimal solution to a smaller instance. This principle is known as the principle of optimality. It applies to the change-making problem because any sub-collection of an optimal collection of coins is itself an optimal collection for the value it yields; otherwise, we could replace the sub-collection with a smaller sub-collection yielding the same value, and obtain a better solution to the original instance.
The principle of optimality usually applies to optimization problems, but not always in a convenient way. For example, consider the problem of finding a longest simple path in a graph from a given vertex u to a given vertex v. (A simple path in a graph is a path in which each vertex appears at most once.) If we take a portion of the longest path, say from x to y, this subpath is not necessarily the longest simple path from x to y in the original graph. However, it is guaranteed to be the longest simple path from x to y in the subgraph consisting of only those vertices on that subpath and all edges between them in the original graph. Thus, a subproblem consists of a start vertex, a final vertex, and a subset of the vertices. Because a graph with n vertices has 2ⁿ subsets of vertices, there are an exponential number of subproblems to solve. Thus, in order for dynamic programming to be an effective design technique, the principle of optimality must apply in a way that yields relatively few subproblems.
One characteristic that often leads to relatively few subproblems, while at the same time causing direct recursive implementations to be quite expensive, is that the top-down solution results in overlapping subproblems. As we have already discussed, the top-down solution for the change-making problem can result in two subproblems which have a subproblem in common. This overlap results in redundant computation in the direct recursive implementation. On the other hand, it reduces the total number of subproblems, so that the dynamic programming approach is more efficient.
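The effect of this overlap can be observed directly. The sketch below (my own illustration, not from the text) counts the recursive calls made when evaluating recurrence (12.1) with and without saving subproblem results; the memoized version never re-evaluates a (n, k) pair it has already solved.

```python
def change_calls(n, k, d, memoize):
    """Evaluate C(n, k) for denominations d (0-indexed, increasing, d[0] == 1),
    returning (value, number of recursive calls made)."""
    calls = 0
    cache = {}
    def C(n, k):
        nonlocal calls
        calls += 1
        if memoize and (n, k) in cache:
            return cache[(n, k)]
        if k == 1:
            result = n
        elif d[k - 1] > n:
            result = C(n, k - 1)
        else:
            result = min(C(n, k - 1), C(n - d[k - 1], k) + 1)
        if memoize:
            cache[(n, k)] = result
        return result
    value = C(n, k)
    return value, calls

naive_val, naive_calls = change_calls(100, 3, [1, 10, 25], memoize=False)
memo_val, memo_calls = change_calls(100, 3, [1, 10, 25], memoize=True)
print(naive_val, memo_val)       # both 4 (four 25-cent coins)
print(naive_calls > memo_calls)  # True: the naive version recomputes shared subproblems
```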
Figure 12.1 Algorithm for computing the minimum number of coins needed
to achieve a given value
Precondition: d[1..k] is an array of Ints such that 1 = d[1] < d[2] < · · · <
d[k], and n is a Nat.
Postcondition: Returns an array A[1..k] such that A[i] gives the number
of coins of denomination d[i] in a minimum-sized collection of coins with
value n.
Change(d[1..k], n)
C ← new Array[0..n, 1..k]; A ← new Array[1..k]
for i ← 0 to n
C[i, 1] ← i
for i ← 0 to n
for j ← 2 to k
if i < d[j]
C[i, j] ← C[i, j − 1]
else
C[i, j] ← Min(C[i, j − 1], C[i − d[j], j] + 1)
for j ← 1 to k
A[j] ← 0
i ← n; j ← k
// Invariant: Σ(l=1 to k) A[l]d[l] = n − i, and there is an optimal solution
// that includes all of the coins in A[1..k], but no additional coins from
// d[j + 1..k].
while j > 1
if i < d[j] or C[i, j − 1] < C[i − d[j], j] + 1
j ←j−1
else
A[j] ← A[j] + 1; i ← i − d[j]
A[1] ← i
return A[1..k]
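The pseudocode of Figure 12.1 translates almost line for line into Python. The sketch below uses 0-indexed arrays, so C[i][j] here corresponds to the figure's C[i, j + 1]; otherwise it follows the same bottom-up table fill and greedy reconstruction.

```python
def change(d, n):
    """d: increasing denominations with d[0] == 1.  Returns count, where
    count[i] is the number of coins of denomination d[i] in a
    minimum-sized collection with total value n."""
    k = len(d)
    # C[i][j] = minimum number of coins totaling i using d[0..j]
    C = [[0] * k for _ in range(n + 1)]
    for i in range(n + 1):
        C[i][0] = i                          # 1-cent coins only
    for i in range(n + 1):
        for j in range(1, k):
            if i < d[j]:
                C[i][j] = C[i][j - 1]
            else:
                C[i][j] = min(C[i][j - 1], C[i - d[j]][j] + 1)
    # Greedy reconstruction, guided by the table.
    count = [0] * k
    i, j = n, k - 1
    while j > 0:
        if i < d[j] or C[i][j - 1] < C[i - d[j]][j] + 1:
            j -= 1                           # no more d[j] coins are needed
        else:
            count[j] += 1                    # take one d[j] coin
            i -= d[j]
    count[0] = i                             # remainder in 1-cent coins
    return count

print(change([1, 10, 25], 30))  # → [0, 3, 0]
```

On the earlier counterexample the table steers the reconstruction away from the 25-cent coin, yielding three 10s.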
Suppose we wish to compute a matrix product

M1 M2 · · · Mn.

Recall that multiplying a p × q matrix by a q × r matrix in the straightforward way requires pqr scalar multiplications and yields a p × r matrix. Suppose, for example, that n = 3 and the matrices have the following dimensions:

• M1: 2 × 3;
• M2: 3 × 4; and
• M3: 4 × 1.

Computing the product as (M1 M2)M3 requires

2 · 3 · 4 + 2 · 4 · 1 = 32

scalar multiplications, whereas computing it as M1(M2 M3) requires only

3 · 4 · 1 + 2 · 3 · 1 = 18.
Thus, the way in which the matrices are parenthesized can affect the number of scalar multiplications performed in computing the matrix product. This fact motivates an optimization problem: given a sequence of positive integer dimensions d0, . . . , dn, determine the minimum number of scalar multiplications needed to compute the product M1 · · · Mn, assuming Mi is a di−1 × di matrix for 1 ≤ i ≤ n, and that the number of scalar multiplications required to multiply two matrices is as described above.
Various greedy strategies might be applied to this problem, but none can guarantee an optimal solution. Let us therefore look for a direct top-down solution to the problem of finding the minimum number of scalar multiplications for a product Mi · · · Mj. Let us focus on finding the last matrix multiplication. This multiplication will involve the products Mi · · · Mk and Mk+1 · · · Mj for some k, i ≤ k < j. The sizes of these two matrices are di−1 × dk and dk × dj. Therefore, once these two matrices are computed, an additional di−1 dk dj scalar multiplications must be performed.
of optimality clearly holds for this problem, as a better way of computing
either sub-product results in fewer total scalar multiplications. Therefore,
the following recurrence gives the minimum number of scalar multiplications
needed to compute Mi · · · Mj :
           ⎧ 0                                                          if i = j
m(i, j) =  ⎨                                                                        (12.2)
           ⎩ min over i ≤ k < j of (m(i, k) + m(k + 1, j) + di−1 dk dj) if i < j.
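Recurrence (12.2) can be evaluated bottom-up by increasing chain length, just as (12.1) was evaluated bottom-up by increasing value. The following Python sketch (my own, not from the text) does so and recovers 18 for the three-matrix example above, where the dimension sequence is (2, 3, 4, 1).

```python
def matrix_chain(d):
    """d[i-1] x d[i] is the size of matrix M_i.  Returns the minimum number
    of scalar multiplications needed to compute M_1 ... M_n."""
    n = len(d) - 1
    # m[i][j] = minimum scalar multiplications for M_i ... M_j (1-indexed)
    m = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):           # length of the subchain
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = min(m[i][k] + m[k + 1][j] + d[i - 1] * d[k] * d[j]
                          for k in range(i, j))
    return m[1][n]

print(matrix_chain([2, 3, 4, 1]))  # → 18, matching the parenthesization M1(M2 M3)
```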
The greedy strategy that works for the fractional knapsack problem does not extend to the so-called 0-1 knapsack problem, the variation in which the items cannot be broken. Specifically, in this variation we are given a set of n items, each having a positive weight wi ∈ ℕ and a positive value vi ∈ ℕ, and a weight bound W ∈ ℕ. We wish to find a subset S ⊆ {1, . . . , n} that maximizes

    Σ(i∈S) vi

subject to the constraint that the total weight Σ(i∈S) wi is at most W.

Let V denote the total value of all n items. Let us then compute the minimum weight required to achieve each possible value v ≤ V. The largest value v yielding a minimum weight no larger than W is then our optimal value.
Taking this approach, we observe that item n is either in the set of items for which value v can be achieved with minimum weight, or it isn't. If it is, then the minimum weight can be computed by removing item n, finding the minimum weight needed to achieve a value of v − vn, and adding wn to the result. Otherwise, the minimum weight can be computed by removing item n. The following recurrence therefore gives the minimum weight Wi(j) needed to achieve a value of exactly j from the first i items, for 0 ≤ i ≤ n, 0 ≤ j ≤ V:

          ⎧ 0                                     if j = 0
          ⎪ ∞                                     if i = 0, j > 0
Wi(j) =   ⎨ Wi−1(j)                               if i > 0, 0 < j < vi      (12.5)
          ⎩ min(Wi−1(j), Wi−1(j − vi) + wi)       otherwise.
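The following Python sketch (names mine) evaluates recurrence (12.5) row by row and then reads off the optimal value as described above: the largest j for which Wn(j) ≤ W.

```python
def knapsack_by_value(w, v, W):
    """0-1 knapsack via recurrence (12.5): tabulate the minimum weight
    needed to achieve each total value, then return the maximum total
    value achievable within weight bound W."""
    n, V = len(w), sum(v)
    INF = float("inf")
    weight = [0] + [INF] * V                 # row W_0: only value 0 achievable
    for i in range(1, n + 1):
        new = [0] * (V + 1)                  # row W_i
        for j in range(1, V + 1):
            new[j] = weight[j]                              # item i not used
            if j >= v[i - 1]:                               # item i used
                new[j] = min(new[j], weight[j - v[i - 1]] + w[i - 1])
        weight = new
    return max(j for j in range(V + 1) if weight[j] <= W)

print(knapsack_by_value([3, 4, 5], [4, 5, 6], 7))  # → 9 (take items 1 and 2)
```

Only the previous row is kept, so this sketch uses O(V) space while still running in Θ(nV) time.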
12.5 Summary

Dynamic programming algorithms provide more power for solving optimization problems than do greedy algorithms. Efficient dynamic programming algorithms can be found when the following conditions apply:

• The principle of optimality can be applied to decompose the problem into subinstances of the same problem.
12.6 Exercises
Exercise 12.1 Prove by induction on n + k that C(n, k), as defined in
recurrence (12.1), gives the minimum number of coins needed to give a
value of exactly n if the denominations are d1 < d2 < · · · < dk and d1 = 1.
Exercise 12.2 Prove that Change, shown in Figure 12.1, meets its specification. You do not need to focus on the first half of the algorithm; i.e., you can assume that C(i, j), as defined in recurrence (12.1), is assigned to C[i, j]. Furthermore, you may use the result of Exercise 12.1 in your proof.
a. Prove that for denominations d1 < d2 < · · · < dk , where k > 1, if the
greedy algorithm fails for some value, then it must fail for some value
n < dk + dk−1 .
Exercise 12.6
a. Modify Floyd's algorithm (Figure 12.3) so that it returns an array S[0..n − 1, 0..n − 1] such that for i ≠ j, S[i, j] gives the vertex k such that (i, k) is the first edge in a shortest path from i to j. If there is no path from i to j, or if i = j, then S[i, j] should be −1.
b. Give an algorithm that takes the array S[0..n − 1, 0..n − 1] defined
above, along with i and j such that 0 ≤ i < n and 0 ≤ j < n, and
prints the vertices along a shortest path from i to j. The first vertex
printed should be i, followed by the vertices in order along the path,
until the last vertex j is printed. If i = j, only i should be printed. If
there is no path from i to j, a message to that effect should be printed.
Your algorithm should run in O(n) time.
Exercise 12.7 Give an algorithm for the 0-1 knapsack problem that runs
in O(nW ) time and space, where n is the number of items and W is the
weight bound. Your algorithm should use dynamic programming to compute
recurrence (12.4) for 0 ≤ i ≤ n and 0 ≤ j ≤ W , then use these values to
guide a greedy algorithm for selecting the items to put into the knapsack.
Your algorithm should return an array selected[1..n] of booleans such that
selected[i] is true iff item i is in the optimal packing.
Exercise 12.8 Repeat Exercise 12.7 using recurrence (12.5) instead of (12.4).
Your algorithm should use Θ(nV ) time and space, where n is the number
of items and V is the total value of all the items.
Exercise 12.10 Let A[1..m] and B[1..n] be two arrays. An array C[1..k] is a common subsequence of A and B if there are two sequences of indices ⟨i1, . . . , ik⟩ and ⟨j1, . . . , jk⟩ such that

a. Give a recurrence for L(i, j), the length of the longest common subsequence of A[1..i] and B[1..j].
Exercise 12.11 A palindrome is a string that reads the same from right to left as it does from left to right ("abcba", for example). Give a dynamic programming algorithm that takes a String (see Figure 4.17 on page 144) s as input, and returns a longest palindrome contained as a substring within s. Your algorithm should operate in O(n²) time, where n is the length of s. You may use the results of Exercise 4.13 (page 143) in analyzing your algorithm. [Hint: For each pair of indices i ≤ j, determine whether the substring from i to j is a palindrome.]
* Exercise 12.15 A chain is a rooted tree with exactly one leaf. We are
given a chain representing a sequence of n pipelined processes. Each node i
in the chain represents a process and has a positive execution time ei ∈ N.
Each edge (i, j) has a positive communication cost cij ∈ N. For edge (i, j),
if processes i and j are executed on separate processors, the time needed to
send data from process i to process j is cij ; if the processes are executed on
the same processor, this time is 0. We wish to assign processes to processors
such that each processor has total weight no more than a given value B ∈ N.
The weight of a processor is given by the sum of the execution times of the
processes assigned to that processor, plus the sum of the communication
costs of edges between tasks on that processor and tasks on other processors (see Figure 12.5). The communication cost of an assignment is the sum of the communication costs of edges that connect nodes assigned to different processors.

[Figure 12.5: an example assignment of a chain of processes to two processors, with B = 20; Processor 1 has weight 8, Processor 2 has weight 17, and the communication cost of the assignment is 2.]
Give a dynamic programming algorithm that finds the minimum communication cost of any assignment of processes to processors such that each processor has weight no more than B. Note that we place no restriction on the number of processors used. Your algorithm should run in O(n²) time. Prove that your algorithm is correct.
Exercise 12.16 Given two strings x and y, we define the edit distance from
x to y as the minimum number of operations required to transform x into
y, where the operations are chosen from the following:
• insert a character;
• delete a character; or
• change a character.
We say that a binary search tree containing these keys is optimal if the
expected cost of a look-up in this tree is minimum over the set of all binary
search trees containing these keys.
* d. Using the above result, improve your algorithm to run in O(n²) time.

Your algorithm should run in O(n²) time and use O(n) space. Prove that your algorithm is correct.
including a single blank character between adjacent words on the same line.
Furthermore, we wish to minimize a “sloppiness” criterion. Specifically, we
wish to minimize the following objective function:
    Σ(i=1 to k−1) f(m − ci),