Lecture 1-1
Introduction
COMP3111/3811 Algorithms
Two one-hour lectures:
Wednesday (11am-12pm, Carslaw Theatre 275)
Thursday (11am-12pm, Carslaw Theatre 275)
One one-hour tutorial per week
http://www.it.usyd.edu.au/~comp3111.html
Contact: comp3111@it.usyd.edu.au
Dr. Seokhee Hong
Consultation: Thursday 12-1pm, Madsen G86A.
1. Course Aims
Strategies for solving search and optimisation
problems in graphs will be presented, including
network flow methods.
The unit will also provide a survey of algorithmic
approaches for which traditional analyses are not
appropriate.
These will include randomisation, online algorithms
and competitive analysis, and parallel and distributed
algorithms.
Problems drawn from such areas as networks,
systems and databases will be used to illustrate these
algorithmic approaches; for these, students will
design algorithms and analyse their correctness and efficiency.
An introduction to intractable problems, NP-hardness,
and heuristics will also be given.
2. Learning Outcomes
Be familiar with a collection of core
algorithms
Be fluent in algorithm design paradigms:
divide & conquer, greedy algorithms,
dynamic programming.
Be able to analyze the correctness and
runtime performance of a given algorithm
Be familiar with the inherent complexity
(lower bounds & intractability) of some
problems
Be familiar with advanced data structures
Be able to apply techniques in practical
problems
Course Aims
Primary aim: Develop thinking ability
problem solving skills
(algorithm design and application)
formal thinking
(proof techniques & analysis)
Secondary aim: have fun with algorithms
3. Assumed Knowledge
Assumed knowledge: MATH 2009.
Prerequisite: COMP (2111 or 2811 or 2002 or 2902)
and MATH (1004 or 1904 or 2009 or 2011)
and MATH (1005 or 1905).
Prohibition: May not be counted with
COMP (3811 or 3001 or 3901).
COMP2111 Algorithm 1
A formal introduction to the analysis of algorithms.
Commonly used data structures such as
lists, stacks, queues, priority queues,
search trees, hash tables and graphs are
all analysed according to a notion of
asymptotic complexity.
Design principles such as the greedy
strategy, divide and conquer, and dynamic
programming are covered, as well as
efficient techniques for searching within
graphs.
4. Assessment
5. School Policies
6. Topics Covered
http://www.it.usyd.edu.au/current_ugrad/handbook2003/policies.html#acadhonesty
Week1: introduction
Week2: sorting
Week3: divide and conquer
Week4: greedy algorithms
Week5: dynamic programming
6. Topics Covered (continued)
Week7: graph algorithms
Week8: network flow
Week9: advanced data structures
Week10: amortized time complexity
Week11: randomized algorithms
Week12: NP-completeness / approximation algorithms
Week13: review
Algorithm Analysis
Asymptotic Notation
Recurrence Relations
Proof Techniques
Inherent Complexity
Greedy Algorithms
Huffman Codes
Activity selection
Minimum Spanning Trees
Shortest Paths
GOAL:
Know when to use greedy algorithms and their
essential characteristics.
Be able to prove the correctness of a greedy
algorithm in solving an optimization problem.
Understand where minimum spanning trees and
shortest path computations arise in practice.
Divide & Conquer
Merge sort
Quick sort
Closest pair
Selection
GOAL:
Know when the divide-and-conquer paradigm is
an appropriate one, and the general structure of
such algorithms.
Be able to characterize their complexity using
techniques for solving recurrences.
Memorize the common case solutions for
recurrence relations.
Dynamic Programming
Longest common subsequences
Matrix chain multiplication
Optimal binary search tree
GOAL
Know what problem characteristics make it
appropriate to use dynamic programming and
how it differs from divide-and-conquer.
Be able to move systematically from one to
the other.
Graph Algorithms
Advanced Topics
Randomized algorithm
NP-completeness
Approximation algorithm
If time permits,
Parallel algorithm
Online algorithm
Tutorial
Each student must attend one tutorial per week, as
allocated by the University timetable system.
Tutorials commence in week 2.
You should have read and answered the "prework" before you come to the tutorial.
All tutorial activity is done in groups of up to 3
people.
It is important to be able to explain your ideas to
others and to contribute effectively to a
collaborative solution.
The tutor will discuss the solution and comment
on issues raised by the exercise.
You must submit a one-page result to your tutor (10%).
Communications
COMP3111/3811 Algorithms 2
Lecture 1-2
Goal
Asymptotic notations
motivation
Θ, O, Ω, o, ω
formal definition
know the difference
Growth of Functions
Running Time
Worst case: the longest running time for any input of size n
e.g. insertion sort: O(n²), when tⱼ = j for j = 2, 3, …, n
Asymptotic Notation
Θ, O, Ω, o, ω
Used to describe the running times of algorithms
Instead of exact running times, we use asymptotic notation:
a simple characterization of an algorithm's efficiency
a way to compare the relative performance of algorithms
We are concerned with how the running time of an algorithm increases with the size of the input in the limit
Θ-notation
For a given function g(n), we denote by Θ(g(n)) the set of functions
Θ(g(n)) = {f(n): there exist positive constants c₁, c₂ and n₀ such that 0 ≤ c₁g(n) ≤ f(n) ≤ c₂g(n) for all n ≥ n₀}
A function f(n) belongs to Θ(g(n)) if there exist such constants c₁, c₂.
Example
To show that (1/2)n² - 3n = Θ(n²),
we need to determine positive constants c₁, c₂, n₀
such that c₁n² ≤ (1/2)n² - 3n ≤ c₂n² for all n ≥ n₀.
Dividing by n²: c₁ ≤ 1/2 - 3/n ≤ c₂.
1/2 - 3/n ≤ c₂: holds for n ≥ 1 by choosing c₂ ≥ 1/2.
c₁ ≤ 1/2 - 3/n: holds for n ≥ 7 by choosing c₁ ≤ 1/14.
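A quick numeric sanity check of these constants in Python (a finite check over a range of n, not a proof):

def f(n):
    # f(n) = (1/2)n^2 - 3n, the function bounded above
    return 0.5 * n * n - 3 * n

c1, c2, n0 = 1 / 14, 1 / 2, 7
assert all(c1 * n * n <= f(n) <= c2 * n * n for n in range(n0, 10000))
print("0 <= c1*g(n) <= f(n) <= c2*g(n) holds for 7 <= n < 10000")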
Example
To show that 6n³ ≠ Θ(n²), we use contradiction:
if there were a constant c₂ with 6n³ ≤ c₂n² for all n ≥ n₀, then n ≤ c₂/6 for all n ≥ n₀, which is impossible.
Example
10n² - 3n = Θ(n²)
f(n) = an² + bn + c = Θ(n²), where a, b, c are constants and a > 0.
To compare orders of growth, look at the leading term.
O-notation
For a given function g(n), we denote by O(g(n)) the set of functions
O(g(n)) = {f(n): there exist positive constants c and n₀ such that 0 ≤ f(n) ≤ cg(n) for all n ≥ n₀}
f(n) is a member of the set O(g(n)).
Example
To prove 7n - 2 = O(n),
we need to show that there exist positive constants c and n₀ such that 7n - 2 ≤ cn for all n ≥ n₀.
A possible choice: n₀ = 1, c = 7.
Example
f(n) = an² + bn + c = O(n²), where a, b, c are constants and a > 0.
Ω-notation
For a given function g(n), we denote by Ω(g(n)) the set of functions
Ω(g(n)) = {f(n): there exist positive constants c and n₀ such that 0 ≤ cg(n) ≤ f(n) for all n ≥ n₀}
f(n) is a member of the set Ω(g(n)).
Relations Between Θ, Ω, O
f(n) = Θ(g(n)) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n)).
o-notation
For a given function g(n), we denote by o(g(n)) the set of functions
o(g(n)) = {f(n): for any positive constant c > 0, there exists a constant n₀ > 0 such that 0 ≤ f(n) < cg(n) for all n ≥ n₀}
f(n) becomes insignificant relative to g(n) as n approaches infinity: lim_{n→∞} f(n)/g(n) = 0.
ω-notation
For a given function g(n), we denote by ω(g(n)) the set of functions
ω(g(n)) = {f(n): for any positive constant c > 0, there exists a constant n₀ > 0 such that 0 ≤ cg(n) < f(n) for all n ≥ n₀}
f(n) becomes arbitrarily large relative to g(n) as n approaches infinity: lim_{n→∞} f(n)/g(n) = ∞.
Example: n²/2 = ω(n), but n²/2 ≠ ω(n²).
Comparison of Functions
f g a b : real number
f (n) = O(g(n)) a b
f (n) = (g(n)) a b
f (n) = (g(n)) a = b
f (n) = o(g(n)) a < b
f (n) = (g(n)) a > b
Conclusion
Asymptotic notations: Θ, O, Ω, o, ω
definition
difference
usage
Your homework
Answer tutorial questions
Attend tutorial (Tue 2-3pm, Wed 12-1pm)
Suggested reading: chapters 6, 7, 8
COMP3111/3811 Algorithms 2
Lecture 2-1
Sorting
1. Sorting algorithms
2. Lower bound for sorting problem
3. Linear time sorting algorithms
1. Sorting Algorithms
1.1 Insertion sort: Θ(n²) worst case, in place
1.2 Merge sort: Θ(n lg n), asymptotically optimal, not in place
1.3 Heap sort: O(n lg n), in place
1.4 Quick sort: Θ(n²) worst case, Θ(n lg n) average case
Others: bubble sort, shell sort, fun sort
See [Knuth], Sorting and Searching.
(Figure: quicksort splits. A good split partitions the n-1 remaining elements into halves of size about (n-1)/2 at Θ(n) cost per level, giving Θ(n lg n); a bad split produces subproblem sizes n-1, n-2, … and costs Θ(n²). Average-case splitting behaves like the good case.)
Decision tree lower bound
A path from the root to a leaf is one execution of the sorting algorithm; each leaf is a permutation such as ⟨a₃, a₂, a₁⟩.
A binary tree of height h has at most 2^h leaves, so n! ≤ 2^h, i.e. h ≥ lg(n!).
By Stirling's approximation, n! > (n/e)ⁿ, so
h ≥ lg(n!) ≥ lg(n/e)ⁿ = n lg n - n lg e = Ω(n lg n).
Counting Sort
(Figure: worked example. Step 1: C[i] = number of elements equal to i. Step 2: C[i] = number of elements ≤ i. Step 3: C[A[j]] gives the position of A[j] in the output array B; decrement C[A[j]] after each placement.)
Counting-Sort(A, B, k)
1. for i ← 1 to k                          // Θ(k)
2.     do C[i] ← 0
3. for j ← 1 to length[A]                  // Θ(n)
4.     do C[A[j]] ← C[A[j]] + 1            // C[i]: # of elements = i
5. for i ← 1 to k                          // Θ(k)
6.     do C[i] ← C[i] + C[i-1]             // C[i]: # of elements ≤ i
7. for j ← length[A] downto 1              // Θ(n)
8.     do B[C[A[j]]] ← A[j]
9.        C[A[j]] ← C[A[j]] - 1
Total: Θ(n+k). If k = O(n), then the worst case is Θ(n).
(Figure: the arrays A, B and C of the worked example after each of the three steps.)
Algorithm Analysis
The overall time is Θ(n+k).
When k = O(n), the worst case is Θ(n).
Stable, but not in place.
No comparisons are made: counting sort uses the actual values of the elements to index into an array.
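A runnable Python sketch of the pseudocode above (0-indexed arrays; values assumed to be integers in 0..k):

def counting_sort(A, k):
    C = [0] * (k + 1)
    for a in A:                # C[i] = # of elements equal to i
        C[a] += 1
    for i in range(1, k + 1):  # C[i] = # of elements <= i
        C[i] += C[i - 1]
    B = [None] * len(A)
    for a in reversed(A):      # right-to-left scan keeps the sort stable
        B[C[a] - 1] = a
        C[a] -= 1
    return B

print(counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5))  # [0, 0, 2, 2, 3, 3, 3, 5]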
(Figure: radix sort of the numbers 392, 356, 446, 928, 631, 532, 495: one stable pass per digit, least significant digit first, ending in sorted order 356, 392, 446, 495, 532, 631, 928.)
Radix-Sort(A, d)
1. for i ← 1 to d
2.     do use a stable sort to sort array A on digit i    // Θ(n+k) per pass with counting sort
Total: Θ(d(n+k))
Algorithm Analysis
Each pass over n d-digit numbers takes Θ(n+k) time.
There are d passes, so the total time for radix sort is Θ(d(n+k)).
When d is a constant and k = O(n), radix sort runs in linear time.
If radix sort uses counting sort as the intermediate stable sort, it does not sort in place.
If primary memory storage is an issue, quicksort or another in-place sort may be preferable.
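A small Python sketch of radix sort on decimal digits; the per-digit stable pass is done with bucket lists (a counting-sort-style stable pass), which is one possible choice of intermediate sort:

def radix_sort(A, d):
    for i in range(d):                      # d passes, least significant digit first
        buckets = [[] for _ in range(10)]   # stable distribution by digit value 0..9
        for a in A:
            buckets[(a // 10 ** i) % 10].append(a)
        A = [a for b in buckets for a in b]
    return A

print(radix_sort([392, 356, 446, 928, 631, 532, 495], 3))
# [356, 392, 446, 495, 532, 631, 928]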
Algorithm Analysis
Bucket-Sort(A)
1. n ← length[A]
2. for i ← 1 to n
3.     do insert A[i] into list B[⌊n·A[i]⌋]
4. for i ← 0 to n-1
5.     do sort list B[i] with insertion sort
6. concatenate the lists B[0], B[1], …, B[n-1] together in order
Conclusion
COMP3111/3811 Algorithms
Lecture 2-2
Order Statistics
Recurrences
1. Minimum(A)
1. min ← A[1]
2. for i ← 2 to length[A]
3.     do if min > A[i]
4.         then min ← A[i]
5. return min
2. Selection in Worst-Case Linear Time
It finds the desired element(s) by recursively partitioning the input array.
Basic idea: generate a good split when the array is partitioned, using a modified partition algorithm from quicksort.
Algorithm Analysis
At least half of the medians found in step 2 are greater than or equal to the median-of-medians x.
Thus, at least half of the ⌈n/5⌉ groups contribute 3 elements that are greater than x, except for the one group that has fewer than 5 elements and the one group containing x itself.
The number of elements > x is at least 3(⌈(1/2)⌈n/5⌉⌉ - 2) ≥ 3n/10 - 6.
Similarly, the number of elements < x is at least 3n/10 - 6.
In the worst case, SELECT is called recursively on at most 7n/10 + 6 elements.
(2) Recurrences
Describe functions in terms of their values
on smaller inputs
Arise from Recursive call
Arise from Divide and Conquer
T(n) = Θ(1)                        if n ≤ c
T(n) = a T(n/b) + D(n) + C(n)      otherwise
Solution Methods
1. Substitution Method
2. Recursion Tree Method
3. Master Method
Example
To solve: T(n) = 2T(⌊n/2⌋) + n
Guess: T(n) = O(n lg n)
We need to prove T(n) ≤ c·n lg n for an appropriate choice of the constant c > 0.
Assume that this bound holds for ⌊n/2⌋, i.e. T(⌊n/2⌋) ≤ c⌊n/2⌋ lg(⌊n/2⌋).
Substituting into the recurrence:
T(n) ≤ 2c⌊n/2⌋ lg(⌊n/2⌋) + n
     ≤ c·n lg(n/2) + n
     = c·n lg n - c·n lg 2 + n
     = c·n lg n - cn + n
     ≤ c·n lg n : true as long as c ≥ 1
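A Python check (a finite numeric experiment, not the induction proof) that this recurrence indeed stays below c·n lg n with c = 2 for n ≥ 2:

from functools import lru_cache
from math import log2

@lru_cache(maxsize=None)
def T(n):
    # T(1) = 1; T(n) = 2T(floor(n/2)) + n
    return 1 if n <= 1 else 2 * T(n // 2) + n

assert all(T(n) <= 2 * n * log2(n) for n in range(2, 5000))
print("T(n) <= 2 n lg n for 2 <= n < 5000")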
1. Substitution Method
Step 1. Guess the form of solution
Step 2. Use mathematical induction to
show that the solution works.
Works well when the solution is easy to
guess
No general way to guess the correct
solution
Can be used to establish upper or lower bounds on the recurrence
Subtleties
When the math doesn't quite work out in the induction, try to revise your guess by subtracting a lower-order term. For example:
T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + 1
We guess T(n) = O(n) and try to show T(n) ≤ cn for an appropriate choice of the constant c > 0:
T(n) ≤ c⌊n/2⌋ + c⌈n/2⌉ + 1 = cn + 1, which does not imply T(n) ≤ cn.
New guess: T(n) ≤ cn - b, where b ≥ 0:
T(n) ≤ (c⌊n/2⌋ - b) + (c⌈n/2⌉ - b) + 1
     = cn - 2b + 1
     ≤ cn - b, as long as b ≥ 1.
Avoiding Pitfalls
Be careful not to misuse asymptotic notation. For example:
we can falsely "prove" T(n) = O(n) by guessing T(n) ≤ cn for T(n) = 2T(⌊n/2⌋) + n:
T(n) ≤ 2c⌊n/2⌋ + n
     ≤ cn + n
     = O(n)   Wrong!
We haven't proved the exact form of the induction hypothesis, T(n) ≤ cn.
Changing Variables
Use algebraic manipulation to turn an unknown recurrence into one similar to what you have seen before.
Consider T(n) = 2T(⌊n^(1/2)⌋) + lg n.
Rename m = lg n and we have T(2^m) = 2T(2^(m/2)) + m.
Set S(m) = T(2^m) and we have S(m) = 2S(m/2) + m, so S(m) = O(m lg m).
Changing back from S(m) to T(n), we have
T(n) = T(2^m) = S(m) = O(m lg m) = O(lg n · lg lg n).
Analysis (merge sort)
Divide: computing the middle takes Θ(1)
Conquer: solving 2 subproblems takes 2T(n/2)
Combine: merging n elements takes Θ(n)
Total:
T(n) = Θ(1)             if n = 1
T(n) = 2T(n/2) + Θ(n)   if n > 1
T(n) = Θ(n lg n) by the master theorem.
T(n) = 2T(n/2) + cn
Recursion Tree
(Figure: for T(n) = 2T(n/2) + cn, the per-level cost is cn at every level: cn at the root, 2·c(n/2) = cn at level 1, 4·c(n/4) = cn at level 2, and so on. There are lg n + 1 levels, so the total is cn lg n + cn, giving T(n) = Θ(n lg n).)
(Second figure: for T(n) = 3T(n/4) + cn², level i costs (3/16)^i · cn², the tree has log₄n levels and Θ(n^(log₄3)) leaves; the geometric series sums to O(n²).)
COMP3111/3811 Algorithms
Lecture 3-1
Recurrences

3. Master Method
Provides a cookbook method for solving recurrences of the form
T(n) = a T(n/b) + f(n)
Assumptions: a ≥ 1 and b > 1 are constants, and f(n) is an asymptotically positive function.
Case 1: if f(n) = O(n^(log_b a - ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
Case 2: if f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) lg n).
Case 3: if f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and a·f(n/b) ≤ c·f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).
Example
T(n) = 3T(n/4) + n lg n
a = 3, b = 4, thus n^(log_b a) = n^(log₄3) = O(n^0.793)
f(n) = n lg n = Ω(n^(log₄3 + ε)) where ε ≈ 0.2, so case 3 applies.
Therefore T(n) = Θ(f(n)) = Θ(n lg n).
T(n) = 2T(n/2) + n lg n
a = 2, b = 2, f(n) = n lg n, and n^(log_b a) = n^(log₂2) = n
f(n) is asymptotically larger than n^(log_b a), but not polynomially larger: the ratio f(n)/n^(log_b a) = n lg n / n = lg n is asymptotically less than n^ε for any positive ε.
Thus, the master theorem doesn't apply here.
Example
T(n) = 16T(n/4) + n
a = 16, b = 4, thus n^(log_b a) = n^(log₄16) = Θ(n²)
f(n) = n = O(n^(log₄16 - ε)) where ε = 1, so case 1 applies.
Therefore T(n) = Θ(n^(log_b a)) = Θ(n²).
T(n) = T(3n/7) + 1
a = 1, b = 7/3, and n^(log_b a) = n^(log_{7/3} 1) = n⁰ = 1
f(n) = 1 = Θ(n^(log_b a)), so case 2 applies.
Therefore T(n) = Θ(n^(log_b a) lg n) = Θ(lg n).
Solving Recurrence (SELECT)
Steps 1, 2 and 4 take O(n) time. Step 3 takes time T(⌈n/5⌉) and step 5 takes time at most T(7n/10 + 6).
T(n) = Θ(1), if n ≤ 140
T(n) ≤ T(⌈n/5⌉) + T(7n/10 + 6) + O(n), if n > 140
Substitution method: guess T(n) ≤ cn.
T(n) ≤ c⌈n/5⌉ + c(7n/10 + 6) + an
     ≤ cn/5 + c + 7cn/10 + 6c + an
     = 9cn/10 + 7c + an
     = cn + (-cn/10 + 7c + an)
     ≤ cn if -cn/10 + 7c + an ≤ 0,
i.e. if c ≥ 10a·(n/(n-70)) when n > 70.
COMP3111/3811 Algorithms 2
Lecture 3-2
Divide & Conquer
Closest Pair
Tree Drawing
1. Closest Pair
D&C Algorithm
If |P| > 3 then
Divide:
find a vertical line l that bisects P into PL and PR s.t. |PL| = ⌈n/2⌉ and |PR| = ⌊n/2⌋
X (Y) is divided into XL (YL) and XR (YR)
Some points may lie on the line
D&C Algorithm
Conquer:
δL = CP(PL, XL, YL)
δR = CP(PR, XR, YR)
δ = min(δL, δR)
Combine:
The closest pair is either the pair at distance δ, or a pair of points with one in PL and the other in PR.
If there is a pair of points with distance less than δ, both points must reside in the 2δ-wide vertical strip centered at line l.
Combine Algorithm
1. Y′ = Y with all points not in the 2δ-wide vertical strip removed (Y′ remains sorted).
2. For each point p in Y′:
   find the points p′ in Y′ that are within δ units of p (only the 7 points of Y′ that follow p need to be considered);
   compute the distance from p to each such p′;
   keep the minimum δ′.
3. Return min(δ, δ′).
Correctness
Why do we only need to consider the 7 points following each point p in array Y′?
Suppose the closest pair is pL in PL and pR in PR (with δ′ < δ):
the pair lies within a δ × 2δ rectangle centered at line l;
pL (pR) must be on or to the left (right) of l and less than δ units away;
pL and pR are within δ units of each other vertically.
At most 8 points can lie in the δ × 2δ rectangle centered at line l:
at most 4 points of PL (PR) can reside in the δ × δ square forming the left (right) half of the rectangle
(as points in PL are at least δ apart from one another).
Implementation
Ensure XL, XR, YL, YR, Y: sorted properly
when they are passed to recursive calls.
We wish to form a sorted subset of a sorted
array.
Divide X into XL, XR is easy.
Running time
Presorting before the first recursive call: O(n log n)
T(n) = O(1)            if n ≤ 3
T(n) = 2T(n/2) + O(n)  if n > 3
T(n) = O(n log n)
Simple Method
inorder traversal
layered grid drawing
two flaws:
too wide: width n-1
parent vertex is not centered with
respect to the children
Implementation
Given a subset P and the array Y, partition P into PL and PR:
we need to form YL and YR, sorted by y-coordinates, in linear time.
Method: the opposite of the MERGE procedure in merge sort.
Split a sorted array into two sorted arrays:
examine the points in Y in order;
if a point Y[i] is in PL, append it to the end of YL; otherwise append it to the end of YR.
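A minimal Python sketch of this linear-time split (the point set and the dividing line x = 2 are illustrative, not from the slides):

def split_sorted(Y, in_PL):
    YL, YR = [], []
    for p in Y:                  # Y is sorted by y; YL and YR stay sorted
        (YL if in_PL(p) else YR).append(p)
    return YL, YR

Y = [(3, 0), (1, 1), (4, 2), (2, 5)]           # points sorted by y-coordinate
YL, YR = split_sorted(Y, lambda p: p[0] <= 2)  # hypothetical dividing line x = 2
print(YL, YR)  # [(1, 1), (2, 5)] [(3, 0), (4, 2)]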
D&C Algorithm
Divide
recursively apply the algorithm to draw the
left and right subtrees of T.
Conquer
move the drawings of subtrees until their
horizontal distance equals 2.
place the root r vertically one level above
and horizontally half way between its
children.
If there is only one child, place the root at
horizontal distance 1 from the child.
Implementation
Two traversals
step 1. postorder traversal
For each vertex v, recursively computes the
horizontal displacement of the left & right
children of v with respect to v.
step 2. preorder traversal
Computes x-coordinates of the vertices by
accumulating the displacements on the
path from each vertex to the root.
Postorder Traversal
Processing v: scan the right contour of the left subtree T′ and the left contour of the right subtree T″;
accumulate the displacements of the vertices on the left & right contours;
keep the maximum cumulative displacement at any depth.
Construction of the contour list for v with subtrees T′, T″:
case 1: height(T′) = height(T″)
case 2: height(T′) < height(T″)
case 3: height(T′) > height(T″)
Postorder Traversal
It is necessary to travel down the contours of the two subtrees T′ and T″ only as far as the height of the subtree of lesser height.
Hence the time spent processing vertex v in the postorder traversal is proportional to the minimum of the heights of T′ and T″.
Postorder Traversal
Left (right) contour: the sequence of vertices vᵢ such that vᵢ is the leftmost (rightmost) vertex of T with depth i.
In the conquer step, we need to follow the right contour of the left subtree and the left contour of the right subtree.
After we process v, we maintain the left & right contours of the subtree rooted at v as linked lists.
Postorder Traversal
L(T) (R(T)): left (right) contour of T; v has subtrees T′ (left) and T″ (right).
case 1: height(T′) = height(T″)
L(T) = v + L(T′)
R(T) = v + R(T″)
case 2: height(T′) < height(T″)
R(T) = v + R(T″)
L(T) = v + L(T′) + {the part of L(T″) starting from w}
    h: height of T′
    w: the vertex on L(T″) whose depth = h + 1
case 3: height(T′) > height(T″): similar to case 2
COMP3111/3811 Algorithms 2
Lecture 4-1
Greedy Algorithms
(Chapter 16)
Motivation
Greedy Algorithms
Greedy Algorithms
Greedy algorithms normally consist of:
Set (list) of candidates
Two other sets: chosen & rejected
Function that checks whether a particular set of
candidates provides a solution to the problem
Function that checks if a set of candidates is feasible
Selection function indicating at any time which is the
most promising candidate not yet used
Objective function giving the value of a solution; this is
the function we are trying to optimize
Generic greedy schema:
1. S ← ∅                                   // the solution set
2. while S is not a solution and C ≠ ∅
3.     do x ← select(C)
4.        C ← C \ {x}
5.        if feasible(S ∪ {x}) then S ← S ∪ {x}
6. return S
Analysis
The selection function is usually based on the objective
function; they may be identical.
But, often there are several plausible ones.
At every step, the procedure chooses the best morsel it
can swallow, without worrying about the future.
It never changes its mind: once a candidate is included
in the solution, it is there for good; once a candidate is
excluded, it's never considered again.
Greedy algorithms do NOT always yield optimal
solutions, but for some problems they do.
More formally
The 0-1 knapsack problem:
A thief must choose among n items, where the i-th item is worth vᵢ dollars and weighs wᵢ pounds.
Carrying at most W pounds, maximize the value taken.
Note: assume vᵢ, wᵢ, and W are all integers.
0-1: each item must be taken or left in its entirety.
The fractional knapsack problem:
The thief can take fractions of items.
Think of items in the 0-1 problem as gold ingots, and in the fractional problem as buckets of gold dust.
Choose amounts x₁, …, xₙ (0-1: xᵢ ∈ {0,1}; fractional: 0 ≤ xᵢ ≤ 1) such that
Σ_{i=1}^{n} xᵢwᵢ ≤ W and Σ_{i=1}^{n} xᵢvᵢ is maximized.
Example (capacity W = 100):
w    10   20   30   40   50
v    20   30   66   40   60
v/w  2.0  1.5  2.2  1.0  1.2
Greedy strategies (selected xᵢ, and resulting value):
Max vᵢ      0  0  1  0.5  1    146
Min wᵢ      1  1  1  1    0    156
Max vᵢ/wᵢ   1  1  1  0    0.8  164
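A Python sketch of the max vᵢ/wᵢ greedy for the fractional knapsack, reproducing the last row of the table (capacity W = 100 is inferred from the table's arithmetic):

def fractional_knapsack(items, W):
    # items: list of (weight, value); take highest value-per-weight first
    items = sorted(items, key=lambda wv: wv[1] / wv[0], reverse=True)
    total = 0.0
    for w, v in items:
        take = min(w, W)          # whole item, or the fraction that still fits
        total += v * take / w
        W -= take
        if W == 0:
            break
    return total

items = [(10, 20), (20, 30), (30, 66), (40, 40), (50, 60)]
print(fractional_knapsack(items, 100))  # -> 164.0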
Optimal Substructure
Both variations exhibit optimal substructure
An optimal solution to the problem contains
optimal solutions to subproblems.
To show this for the 0-1 problem, consider the
most valuable load weighing at most W pounds
If we remove item j from the load, what do we
know about the remaining load?
A: the remaining load must be the most valuable load weighing at most W - wⱼ that the thief could take from the museum, excluding item j.
Optimality Proof
[Theorem] This greedy algorithm is always optimal.
(Proof) Let I = (i₁, …, iₙ) be any permutation of the integers {1, 2, …, n}.
If customers are served in the order I, the total time passed in the system by all the customers is
T = t_{i₁} + (t_{i₁} + t_{i₂}) + (t_{i₁} + t_{i₂} + t_{i₃}) + …
  = n·t_{i₁} + (n-1)·t_{i₂} + (n-2)·t_{i₃} + …
  = Σ_{k=1}^{n} (n - k + 1)·t_{i_k}
An exchange argument on this sum shows it is minimized by serving customers in nondecreasing order of service time.
Designing Algorithm
Imagine an algorithm that builds the optimal schedule
step by step.
Suppose that after serving customers i₁, …, iₘ we add customer j. The increase in T at this stage is
t_{i₁} + … + t_{iₘ} + tⱼ
To minimize this increase, we need only to minimize tj.
This suggests a simple greedy algorithm: at each
step, add to the end of schedule the customer
requiring the least service among those remaining.
COMP3111/3811 Algorithms 2
Lecture 4- 2
Greedy Algorithms
(Chapter 16)
Optimal Substructure
An optimal solution to the problem contains optimal
solutions to the subproblems
An optimal solution to the subproblem
+ the greedy choice
= an optimal solution to the original problem
4. Activity-Selection Problem
Example: get your money's worth out of a carnival
Buy a wristband that lets you onto any ride
Lots of rides, each starting and ending at different
times
Your goal: ride as many rides as possible
Formally:
Given a set S of n activities
sᵢ = start time of activity i
fᵢ = finish time of activity i
Find a max-size subset A of mutually compatible activities.
(Figure: timeline of 6 example activities.)
A Greedy Algorithm
So actual algorithm is simple:
Sort the activities by finish time
Schedule the first activity
Then schedule the next activity in
sorted list which starts after previous
activity finishes
Repeat until no more activities
Intuition is even simpler:
always pick the compatible activity that finishes earliest.
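A runnable Python sketch of this greedy schedule (the activity list is illustrative):

def select_activities(activities):         # list of (start, finish) pairs
    schedule, last_finish = [], float("-inf")
    for s, f in sorted(activities, key=lambda a: a[1]):  # sort by finish time
        if s >= last_finish:                # starts after previous one finishes
            schedule.append((s, f))
            last_finish = f
    return schedule

acts = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 8), (5, 9), (6, 10), (8, 11)]
print(select_activities(acts))  # -> [(1, 4), (5, 7), (8, 11)]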
Optimal Substructure
Let k be the minimum activity in A (i.e., the one with the earliest finish time).
Then A - {k} is an optimal solution to S′ = {i ∈ S : sᵢ ≥ f_k}.
In other words: once activity #1 is selected, the problem reduces to finding an optimal solution for activity selection over the activities in S compatible with #1.
Proof: if we could find an optimal solution B to S′ with |B| > |A - {k}|, then B ∪ {k} is compatible and |B ∪ {k}| > |A|, contradicting the optimality of A.
Iterative greedy algorithm
Example: merging sorted lists L1, L2, L3 of sizes 30, 20, 10.
Merge L1 & L2: 30 + 20 = 50 comparisons, resulting in a list of size 50;
then merge that list & L3: 50 + 10 = 60 comparisons.
Total number of comparisons: 50 + 60 = 110.
Alternatively, merge L2 & L3: 20 + 10 = 30 comparisons, giving a list of size 30;
then merge that list with L1: 30 + 30 = 60 comparisons.
Total number of comparisons: 30 + 60 = 90.
(Figure: iterative construction of the merge tree by repeatedly merging the two smallest lists; e.g. iteration 2 merges sizes 5 and 5 into 10, and iteration 4 merges sizes 10 and 16 into 26.)
Algorithm:
(1) create a min-heap T[1..n] based on the n initial sizes.
(2) while (the heap size ≥ 2) do
    (2.1) delete from the heap the two smallest values, call them a and b; create a parent node of size a + b for the nodes corresponding to these two values
    (2.2) insert the value (a + b) into the heap; it corresponds to the node created in step (2.1)
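A runnable Python sketch of this min-heap algorithm, returning the total comparison cost:

import heapq

def optimal_merge_cost(sizes):
    heap = list(sizes)
    heapq.heapify(heap)                  # step (1)
    cost = 0
    while len(heap) >= 2:                # step (2)
        a = heapq.heappop(heap)          # two smallest values
        b = heapq.heappop(heap)
        cost += a + b                    # parent node of size a + b
        heapq.heappush(heap, a + b)      # step (2.2)
    return cost

print(optimal_merge_cost([30, 20, 10]))  # -> 90, matching the example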
(Figure: optimal merge tree for sizes 30, 20, 10: first merge 20 and 10 into 30, then 30 and 30 into 60. Cost = 30·1 + 20·2 + 10·2 = 90.)
6. Huffman codes
Widely used and effective technique for compressing
data: savings of 20% to 90% are typical depending on
file characteristics
Motivation: suppose we wish to save a text (ASCII) file on disk, or to transmit it through a network, using an encoding scheme that minimizes the number of bits required.
Without compression, characters are typically
encoded by their ASCII codes with 8 bits per character.
We can do better if we have the freedom to design our
own encoding.
Example
Given a text file that uses only 5 different
letters (a, e, i, s, t), the space character, and
the newline character.
Since there are 7 different characters, we could use 3 bits per character, because that allows 8 bit patterns ranging from 000 through 111 (so we still have one pattern to spare).
The following table shows the encoding of
characters, their frequencies, and the size of
encoded (compressed) file.
Character  Frequency  Fixed code  Total bits  Variable code  Total bits
a          10         000         30          001            30
e          15         001         45          01             30
i          12         010         36          10             24
s          3          011         9           00000          15
t          4          100         12          0001           16
space      13         101         39          11             26
newline    1          110         3           00001          5
Total      58                     174                        146
Fixed-length encoding: 174 bits. Variable-length encoding: 146 bits.
Huffman codes
Total # of bits of the encoded file = freq₁·length(code₁) + freq₂·length(code₂) + … + freq_k·length(code_k)
Huffman code: an optimal prefix code constructed by a greedy algorithm.
An optimal code for a file is always represented by a full binary tree, in which every nonleaf node has two children.
Idea: start with |C| leaves and perform a sequence of |C| - 1 merging operations to create the final tree.
Greedy property: the smaller the frequency, the longer the code, to improve the compression.
A priority queue can be used to find the two least-frequent objects to merge together.
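A Python sketch of the greedy tree construction with heapq as the priority queue; a counter breaks frequency ties so tuples never compare trees. Individual code assignments may differ from the table, but the total bit count agrees:

import heapq

def huffman_codes(freqs):                        # freqs: {symbol: frequency}
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    n = len(heap)
    while len(heap) > 1:                         # |C| - 1 merge operations
        f1, _, left = heapq.heappop(heap)        # two least-frequent subtrees
        f2, _, right = heapq.heappop(heap)
        n += 1
        heapq.heappush(heap, (f1 + f2, n, (left, right)))
    codes = {}
    def walk(node, code):
        # leaves are symbols; internal nodes are (left, right) pairs
        if isinstance(node, tuple):
            walk(node[0], code + "0")
            walk(node[1], code + "1")
        else:
            codes[node] = code or "0"
    walk(heap[0][2], "")
    return codes

freqs = {"a": 10, "e": 15, "i": 12, "s": 3, "t": 4, "space": 13, "newline": 1}
codes = huffman_codes(freqs)
print(sum(freqs[c] * len(codes[c]) for c in freqs))  # -> 146 total bits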
(Figure: the Huffman tree for the example, built by successive merges, with edges labelled 0/1. The leaves s (3), newline (1), t (4), a (10), i (12), space (13), e (15) receive the codes 00000, 00001, 0001, 001, 10, 11, 01. A tree in which some node x has only one child y cannot be optimal: merging x and y reduces the total size. Running time: O(n log n).)
Optimal Substructure
[Lemma 16.3] C: an alphabet in which each c in C has frequency f(c).
x, y: two characters in C with minimum frequency.
C′: the alphabet C′ = C - {x, y} ∪ {z}, with f(z) = f(x) + f(y).
T′: any tree representing an optimal prefix code for C′.
Then T, obtained from T′ by replacing the leaf node z with an internal node having x and y as children, represents an optimal prefix code for C.
<Proof> First we show that B(T) can be expressed in terms of B(T′).
For each c in C - {x, y}, d_T(c) = d_{T′}(c), hence f(c)·d_T(c) = f(c)·d_{T′}(c).
Since d_T(x) = d_T(y) = d_{T′}(z) + 1,
we have f(x)·d_T(x) + f(y)·d_T(y) = (f(x) + f(y))·(d_{T′}(z) + 1) = f(z)·d_{T′}(z) + (f(x) + f(y)).
We conclude B(T) = B(T′) + f(x) + f(y), or B(T′) = B(T) - f(x) - f(y).
We now prove the lemma by contradiction.
Suppose that T does not represent an optimal prefix code for C.
Then there is a tree T″ such that B(T″) < B(T).
W.l.o.g. (by Lemma 16.2), T″ has x and y as siblings.
Let T‴ be the tree T″ with the common parent of x and y replaced by a leaf z with f(z) = f(x) + f(y).
Then B(T‴) = B(T″) - f(x) - f(y)
            < B(T) - f(x) - f(y)
            = B(T′),
a contradiction, as T′ represents an optimal prefix code for C′.
Thus, T must represent an optimal prefix code for C.
COMP3111/3811 Algorithms 2
Motivation
Lecture 5-1/5-2
Optimization Problems
In which a set of choices must be made
in order to arrive at an optimal (min/max)
solution, subject to some constraints.
(There may be several solutions to
achieve an optimal value.)
Dynamic Programming
(Chapter 15)
Dynamic Programming
Like divide-and-conquer, it breaks problems down into smaller problems that are solved recursively.
In contrast, DP is applicable when the sub-problems are not independent, i.e. when sub-problems share sub-sub-problems.
It solves every sub-sub-problem just once and saves the result in a table to avoid duplicated computation.
Elements of DP Algorithms
Sub-structure: decompose the problem into smaller sub-problems; express the solution of the original problem in terms of solutions for the smaller problems.
Table-structure: store the answers to the sub-problems in a table, because sub-problem solutions may be used many times.
Bottom-up computation: combine solutions of smaller sub-problems to solve larger sub-problems, and eventually arrive at a solution to the complete problem.
Dynamic Programming vs. Divide & Conquer
Divide & Conquer: independent subproblems.
Dynamic Programming: subproblems are not independent (subproblems share subproblems).
The algorithm solves every subproblem just once and then saves its answer in a table, avoiding recomputation.
Applied to optimization problems: we want to find an optimal solution with the optimal (minimum or maximum) value (there may be several such solutions).
1. Assembly-Line Scheduling
4 steps
1. Characterize the structure of an optimal
solution
2. Recursively define the value of the optimal
solution
3. Compute the value of an optimal solution in
a bottom-up fashion
4. Construct an optimal solution from
computed information (can be omitted)
(Figure: two assembly lines. Line i has stations S_{i,1} … S_{i,n} with assembly times a_{i,j}; the chassis enters with entry time e_i, may transfer between lines after each station with transfer time t_{i,j}, and the completed auto exits with exit time x_i.)
Fastest time to any given station = min( fastest time through the previous station on the same line, fastest time through the previous station on the other line + the time it takes to switch lines ).
f1[1] = e1 + a_{1,1}
f1[j] = min( f1[j-1] + a_{1,j}, f2[j-1] + t_{2,j-1} + a_{1,j} )   for j ≥ 2
f2[1] = e2 + a_{2,1}
f2[j] = min( f2[j-1] + a_{2,j}, f1[j-1] + t_{1,j-1} + a_{2,j} )   for j ≥ 2
li[j]: the line number (1 or 2) whose station j-1 is used in a fastest way through station S_{i,j}
l*: the line whose station n is used in a fastest way through the entire factory
(Figure: worked example with 6 stations per line.
j      1   2   3   4   5   6
f1[j]  9   18  20  24  32  35
f2[j]  12  16  22  25  30  37
f* = 38, l* = 1.
Sample computations: f1[3] = min{18+3, 16+1+3} = 20 and f2[3] = min{16+6, 18+3+6} = 22.)
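A Python sketch of these recurrences; the instance below is chosen to be consistent with the f-values quoted in the figure (an assumption, since the slide data is only partially recoverable):

def fastest_way(a, t, e, x):
    # a[i][j]: assembly time at station j of line i; t[i][j]: transfer time
    # away from line i after station j; e, x: entry and exit times
    n = len(a[0])
    f1, f2 = e[0] + a[0][0], e[1] + a[1][0]
    for j in range(1, n):
        f1, f2 = (min(f1 + a[0][j], f2 + t[1][j - 1] + a[0][j]),
                  min(f2 + a[1][j], f1 + t[0][j - 1] + a[1][j]))
    return min(f1 + x[0], f2 + x[1])

a = [[7, 9, 3, 4, 8, 4], [8, 5, 6, 4, 5, 7]]
t = [[2, 3, 1, 3, 4], [2, 1, 2, 2, 1]]
e, x = [2, 4], [3, 2]
print(fastest_way(a, t, e, x))  # -> 38 (= f*)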
2. Matrix-Chain Multiplication
Given a sequence of matrices A₁A₂…Aₙ and dimensions p₀, p₁, …, pₙ, where Aᵢ is of dimension p_{i-1} × pᵢ, determine the multiplication sequence that minimizes the number of operations.
This algorithm does not perform the multiplications; it just figures out the best order in which to perform them.
COMP3111/3811 Algorithms 2
Lecture 6-1/6-2
Dynamic Programming
(Chapter 15)
Matrix Multiplication
Multiplying a p × q matrix A by a q × r matrix B gives a p × r matrix C where, for 1 ≤ i ≤ p and 1 ≤ j ≤ r,
C[i, j] = Σ_{k=1}^{q} A[i, k]·B[k, j]
Observe that there are p·r total entries in C, and each takes O(q) time to compute; thus the total time to multiply the two matrices is p·q·r.
Example
Consider 3 matrices: A₁ is 5×4, A₂ is 4×6, and A₃ is 6×2.
Mult[((A₁A₂)A₃)] = (5·4·6) + (5·6·2) = 180
Mult[(A₁(A₂A₃))] = (4·6·2) + (5·4·2) = 88
Even for this small example, considerable savings can be achieved by reordering the evaluation sequence.
(Figure: splitting the product A₁A₂…A₉ at position k; e.g. m[3, 7] considers A₃…A_k times A_{k+1}…A₇.)
m[i, j] = min over k of: m[i, k] (cost of the first part) + m[k+1, j] (cost of the second part) + p_{i-1}·p_k·p_j (cost of multiplying the two parts)
(Figure: the m and s tables for the 6-matrix example A₁…A₆ with dimensions 30×35, 35×15, 15×5, 5×10, 10×20, 20×25; m[1, 6] = 15125 and the optimal parenthesization is ((A₁(A₂A₃))((A₄A₅)A₆)).)
m[2,5] = min over k of
{ m[2,2] + m[3,5] + p₁p₂p₅ = 0 + 2500 + 35·15·20 = 13000,
  m[2,3] + m[4,5] + p₁p₃p₅ = 2625 + 1000 + 35·5·20 = 7125,
  m[2,4] + m[5,5] + p₁p₄p₅ = 4375 + 0 + 35·10·20 = 11375 }
= 7125
Matrix-Chain-Order(p)
1.  n ← length[p] - 1
2.  for i ← 1 to n                    // initialization: O(n) time
3.      do m[i, i] ← 0
4.  for L ← 2 to n                    // L = length of sub-chain
5.      do for i ← 1 to n - L + 1
6.          do j ← i + L - 1
7.             m[i, j] ← ∞
8.             for k ← i to j - 1
9.                 do q ← m[i, k] + m[k+1, j] + p_{i-1}·p_k·p_j
10.                   if q < m[i, j]
11.                       then m[i, j] ← q
12.                            s[i, j] ← k
13. return m and s
Analysis
The array s[i, j] is used to extract the actual sequence (see next).
There are 3 nested loops and each can iterate at most n times, so the total running time is Θ(n³).
Example
The initial set of dimensions are <5, 4, 6, 2, 7>:
we are multiplying A1 (5x4) times A2 (4x6) times
A3 (6x2) times A4 (2x7).
Optimal sequence is (A1 (A2A3 )) A4.
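A Python sketch of Matrix-Chain-Order, run on the dimensions above to verify the optimal sequence:

def matrix_chain_order(p):
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for L in range(2, n + 1):              # L = length of sub-chain
        for i in range(1, n - L + 2):
            j = i + L - 1
            m[i][j] = float("inf")
            for k in range(i, j):
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j], s[i][j] = q, k
    return m, s

def paren(s, i, j):
    # reconstruct the optimal parenthesization from the s table
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return f"({paren(s, i, k)}{paren(s, k + 1, j)})"

m, s = matrix_chain_order([5, 4, 6, 2, 7])
print(m[1][4], paren(s, 1, 4))  # -> 158 ((A1(A2A3))A4)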
Common Subsequence
Given two sequences X and Y, we say that a sequence Z is a common subsequence of X and Y if Z is a subsequence of both X and Y.
X = <A,B,C,B,D,A,B>, Y = <B,D,C,A,B,A>
The subsequence <B,C,A> is a common subsequence of X and Y, but not a longest one.
<B,C,B,A> and <B,D,A,B> are each an LCS of X and Y.
Longest Common Subsequence (LCS) problem
Given two sequences X = <x₁, x₂, …, x_m> and Y = <y₁, y₂, …, yₙ>, find a maximum-length common subsequence of X and Y.
The recurrence for the table c, where c[i, j] = length of an LCS of x₁…xᵢ and y₁…yⱼ:
c[i, j] = 0                              if i = 0 or j = 0
c[i, j] = c[i-1, j-1] + 1                if i, j > 0 and xᵢ = yⱼ
c[i, j] = max(c[i-1, j], c[i, j-1])      if i, j > 0 and xᵢ ≠ yⱼ
(Figure: the c table for X = <A,B,C,B,D,A,B>, Y = <B,D,C,A,B,A>; tracing the arrows back from c[7, 6] yields LCS = <B,C,B,A>.)
Filling the table: O(mn) time. Reconstructing an LCS from it: O(m+n) time.
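A Python sketch of the table fill and the O(m+n) trace-back:

def lcs(X, Y):
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):              # O(mn) table fill
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    out, i, j = [], m, n                   # O(m+n) trace-back
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1]); i -= 1; j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))

print(lcs("ABCBDAB", "BDCABA"))  # an LCS of length 4, e.g. "BCBA"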
Improvement
O(nm) time and O(nm) space DP algorithm
Running time of a DP algorithm = (# of subproblems) × (# of choices per subproblem).
Steps: a bottom-up approach (fill the table from the smallest subproblems upward).
Alternatively: top-down with memoization.
4. Minimum-Weight Triangulation of
Convex Polygon
Motivation: computational geometry (graphics)
A polygon is a piecewise linear closed curve in
the plane.
We form a cycle by joining line segments
end to end.
The line segments are called the sides of the
polygon and the endpoints are called the
vertices.
A polygon is simple if it does not cross itself,
i.e. if the edges do not intersect one another
except for two consecutive edges sharing a
common vertex.
(Figure: example polygons, showing interior, exterior, and an intersection of edges in a non-simple polygon.)
Triangulations
A triangulation of a convex polygon is a maximal set T of nonintersecting chords:
every chord that is not in T intersects the interior of some chord in T.
Such a set of chords subdivides the interior of the polygon into a set of triangles.
The dual graph of the triangulation is a graph whose vertices are the triangles, and in which two vertices share an edge if the triangles share a common chord.
NOTE: the dual graph is a free tree.
In general, there are many possible triangulations.
(Figure: a chord; example triangulations and their dual trees.)
Lemma
A triangulation of a simple polygon has n-2 triangles and n-3 chords.
(Proof)
The result follows directly from the previous figure.
Each internal node of the dual tree corresponds to one triangle, and each edge between internal nodes corresponds to one chord of the triangulation.
For an n-vertex polygon, the tree has n-1 leaves, and thus n-2 internal nodes (triangles) and n-3 edges between internal nodes (chords).
DP Solution
(Figure: convex polygon v₀, v₁, …, v₆, with the subproblem t[2, 5] highlighted as the minimum-weight triangulation of a subpolygon.)
For the basis case, the weight of the trivial 2-sided "polygon" is zero, implying that t[i, i] = 0 (the line segment (v_{i-1}, vᵢ)).
DP Solution (making change: denominations d₁ = 1, d₂ = 4, d₃ = 6, so i = 1, 2, …, n with n = 3; amounts j = 0, 1, …, N with N = 8)
Table c[1..n, 0..N], where c[i, j] = minimum number of coins for amount j using only the first i denominations:
Amount j   0  1  2  3  4  5  6  7  8
d1=1       0  1  2  3  4  5  6  7  8
d2=4       0  1  2  3  1  2  3  4  2
d3=6       0  1  2  3  1  2  1  2  2
c[i, 0] = 0 for i = 1 to n; if i = 1 and j < d[1] then c[i, j] ← ∞;
otherwise c[i, j] = min(c[i-1, j], 1 + c[i, j - dᵢ]).
Recovering the coins: if c[i, j] = c[i-1, j], move up to c[i-1, j]; otherwise take one coin of denomination dᵢ and move to c[i, j - dᵢ].
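A Python sketch of this change-making table, reproducing the d₃ = 6 row:

def make_change(d, N):
    # c[i][j] = min coins for amount j using the first i denominations of d
    n = len(d)
    INF = float("inf")
    c = [[INF] * (N + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        c[i][0] = 0
    for i in range(1, n + 1):
        for j in range(1, N + 1):
            best = c[i - 1][j]             # don't use denomination d[i]
            if j >= d[i - 1]:
                best = min(best, 1 + c[i][j - d[i - 1]])
            c[i][j] = best
    return c

c = make_change([1, 4, 6], 8)
print(c[3])  # -> [0, 1, 2, 3, 1, 2, 1, 2, 2], matching the last row above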
COMP3111/3811 Algorithms 2
Lecture 7-1/7-2
Graph Algorithms
(Chapter 22)
1. Graphs
2. BFS
3. DFS
4. Topological Sort
5. Strongly Connected Component
Digraphs/ Graphs
Directed Graph (or digraph) G = (V, E) consists
of a finite set V, called vertices or nodes, and E,
a set of ordered pairs, called edges of G. E is a
binary relation on V.
Self-loops are allowed.
Multiple edges are not allowed, though (v, w)
and (w, v) are distinct edges.
Basic Terminology
Vertex w is adjacent to vertex v if there is an edge (v,w).
Given an edge e = (u,v) in an undirected graph, u and v are
the endpoints of e and e is incident on u (or on v).
In a digraph, u & v are the origin and destination. e leaves
u and enters v.
A digraph or graph is weighted if its edges are labeled
with numeric values.
In a digraph,
Out-degree of v: number of edges coming out of v
In-degree of v: number of edges coming in to v
In a graph, degree of v: no. of incident edges to v
Connectivity
Combinatorial Facts
In a graph: 0 ≤ e ≤ C(n,2) = n(n-1)/2 = O(n²); Σ_{v∈V} deg(v) = 2e.
In a digraph: 0 ≤ e ≤ n²; Σ_{v∈V} in-deg(v) = Σ_{v∈V} out-deg(v) = e.
A graph is said to be sparse if e = O(n), and dense otherwise.
History on Cycles/Paths
Eulerian cycle is a cycle (not
necessarily simple) that visits every
edge of a graph exactly once.
Hamiltonian cycle (path) is a cycle (path
in a directed graph) that visits every
vertex exactly once.
Graph Representations
Let G = (V, E) be a digraph with n = |V| and e = |E|.
Adjacency Matrix: an n × n matrix where, for 1 ≤ v, w ≤ n, A[v, w] = 1 if (v, w) ∈ E and 0 otherwise.
If the digraph has edge weights, store them in the matrix.
Dense graphs: O(V²) memory.
(Figure: free tree, forest, DAG.)
2. Breadth-First-Search (BFS)
Breadth-First-Search (BFS)
Given:
G = (V, E)
A distinguished source vertex s
BFS(G, s)
 1. for each vertex u in V[G] - {s}
 2.     do color[u] ← WHITE
 3.        d[u] ← ∞
 4.        π[u] ← NIL
 5. color[s] ← GRAY
 6. d[s] ← 0
 7. π[s] ← NIL
 8. Q ← ∅
 9. enqueue(Q, s)
10. while Q ≠ ∅
11.     do u ← dequeue(Q)
12.        for each v in Adj[u]
13.            do if color[v] = WHITE
14.                then color[v] ← GRAY
15.                     d[v] ← d[u] + 1
16.                     π[v] ← u
17.                     enqueue(Q, v)
18.        color[u] ← BLACK
white: undiscovered; gray: discovered; black: finished.
Q: a queue of discovered vertices; color[v]: color of v; d[v]: distance from s to v; π[v]: predecessor of v.
Analysis of BFS
Initialization: O(V).
Traversal loop:
after initialization, each vertex is enqueued and dequeued at most once, and each operation takes O(1), so the total time for queuing is O(V);
the adjacency list of each vertex is scanned at most once, and the sum of the lengths of all adjacency lists is Θ(E).
Total running time: O(V+E).
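A compact Python sketch of BFS; here d doubles as the "discovered" marker, so explicit colors are not needed:

from collections import deque

def bfs(adj, s):                        # adj: {vertex: [neighbors]}
    d, pi = {s: 0}, {s: None}
    Q = deque([s])
    while Q:
        u = Q.popleft()
        for v in adj[u]:
            if v not in d:              # v is white: discover it
                d[v], pi[v] = d[u] + 1, u
                Q.append(v)
    return d, pi

adj = {"s": ["a", "b"], "a": ["s", "c"], "b": ["s", "c"], "c": ["a", "b"]}
print(bfs(adj, "s")[0])  # -> {'s': 0, 'a': 1, 'b': 1, 'c': 2}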
Shortest Paths
(Figure: BFS from source S labels each vertex with its distance 1, 2, 3, …; shading distinguishes finished, discovered, and undiscovered vertices.)
3. Depth-First-Search (DFS)
Explore edges out of the most recently
discovered vertex v
When all edges of v have been explored,
backtrack to explore edges leaving the
vertex from which v was discovered (its
predecessor)
Search as deep as possible first
Whenever a vertex v is discovered during a scan of the adjacency list of an already discovered vertex u, DFS records this event by setting the predecessor π[v] to u.
DFS(G)
1. for each vertex u ∈ V[G]
2.     do color[u] ← WHITE
3.        π[u] ← NIL
4. time ← 0
5. for each vertex u ∈ V[G]
6.     do if color[u] = WHITE
7.         then DFS-Visit(u)
Breadth-First Tree
For a graph G = (V, E) with source s, the predecessor subgraph of G is G_π = (V_π, E_π) where
V_π = {v ∈ V : π[v] ≠ NIL} ∪ {s}
E_π = {(π[v], v) ∈ E : v ∈ V_π - {s}}
The predecessor subgraph G_π is a breadth-first tree if:
V_π consists of the vertices reachable from s, and
for all v ∈ V_π, there is a unique simple path from s to v in G_π that is also a shortest path from s to v in G.
The edges in E_π are tree edges (|E_π| = |V_π| - 1).
There are potentially many BFS trees for a given graph.
Depth-First Trees
The coloring scheme is the same as in BFS.
The predecessor subgraph of DFS is G_π = (V, E_π) where E_π = {(π[v], v) : v ∈ V and π[v] ≠ NIL}.
The predecessor subgraph G_π forms a depth-first forest composed of several depth-first trees.
The edges in E_π are called tree edges.
Each vertex u has 2 timestamps:
d[u]: records when u is first discovered (grayed)
f[u]: records when the search finishes examining u (blackens)
For every vertex u, d[u] < f[u].
DFS-Visit(u)
1. color[u] ← GRAY              // white vertex u has just been discovered
2. d[u] ← ++time
3. for each vertex v ∈ Adj[u]   // explore edge (u,v)
4.     do if color[v] = WHITE
5.         then π[v] ← u
6.              DFS-Visit(v)
7. color[u] ← BLACK             // blacken u; it is finished
8. f[u] ← time++
Analysis of DFS
The loops on lines 1-3 and 5-7 take Θ(V) time, excluding the time to execute DFS-Visit.
DFS-Visit is called once for each white vertex v ∈ V, when it's painted gray the first time.
Lines 3-6 of DFS-Visit are executed |Adj[v]| times, so the total cost of executing DFS-Visit is Σ_{v∈V} |Adj[v]| = Θ(E).
Total running time of DFS: Θ(V+E).
Properties of DFS
The predecessor subgraph G_π forms a forest of trees (the structure of a depth-first tree mirrors the structure of DFS-Visit).
The discovery and finishing times have parenthesis structure, i.e. the parentheses are properly nested.
Parenthesis Theorem
In any DFS of a graph G = (V, E), for any two vertices u and v, exactly one of the following holds:
the intervals [d[u], f[u]] and [d[v], f[v]] are entirely disjoint, or
the interval [d[u], f[u]] is contained entirely within the interval [d[v], f[v]], and u is a descendant of v in the depth-first tree, or
the interval [d[v], f[v]] is contained entirely within the interval [d[u], f[u]], and v is a descendant of u in the depth-first tree.
White-Path Theorem
In a depth-first forest of a graph G, vertex v
is a descendant of vertex u if and only if at
the time d[u] that the search discovers u,
vertex v can be reached from u along a
path consisting entirely of white vertices.
Classification of Edges
Tree edges, back edges, forward edges, and cross edges.
Theorem: In a depth-first search of an undirected graph G, every edge of G is either a tree edge or a back edge.

4. Topological Sort
Topological-Sort(G)
1. Call DFS(G) to compute the finishing time f[v] of each vertex
2. As each vertex is finished, insert it onto the front of a linked list
3. Return the linked list of vertices
Another Example
(Figure: the getting-dressed DAG with discovery/finishing times, e.g. belt 4/5, shoes 7/8, shirt 11/14, tie 12/13, socks 15/16.)
Final ordering: socks, shirt, tie, u-shorts, pants, shoes, belt, jacket.
Analysis of Topological-Sort: Θ(V+E) (the DFS, plus O(1) work per vertex as it finishes).
TopVisit(u) {                          // start search at u
1.  color[u] = gray;                   // mark u visited
2.  for each v ∈ Adj(u)
3.      if (color[v] == white) TopVisit(v);
4.  append u to the front of L         // u is finished: add it to L
}
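A Python sketch of DFS-based topological sort in the style of TopVisit, applied to (part of) the dressing example:

def topological_sort(adj):             # adj: {vertex: [successors]}
    L, color = [], {u: "white" for u in adj}
    def visit(u):
        color[u] = "gray"
        for v in adj[u]:
            if color[v] == "white":
                visit(v)
        color[u] = "black"
        L.insert(0, u)                 # finished: add u to the front of L
    for u in adj:
        if color[u] == "white":
            visit(u)
    return L

clothes = {"socks": ["shoes"], "u-shorts": ["pants", "shoes"],
           "pants": ["shoes", "belt"], "shoes": [], "belt": ["jacket"],
           "shirt": ["tie", "belt"], "tie": ["jacket"], "jacket": []}
print(topological_sort(clothes))  # one valid dressing order (may differ from the slide's)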
Component DAG
If we merge the vertices in each strong component into a single super-vertex, and join two super-vertices (A, B) if and only if there are vertices u ∈ A and v ∈ B such that (u, v) ∈ E, then the resulting digraph, called the component digraph, is necessarily acyclic.
(Figure: a digraph with strong components {a, b, c}, {d, e}, {f, g, h, i}, and its component DAG.)
Ordering DFS
Once the DFS starts within a given strong component, it must visit every vertex within the
component (and possibly some others) before
finishing.
If we don't start with reverse topological order,
then the search may leak out into other strong
components.
However, by visiting components in reverse
topological order of the component tree, each
search cannot leak out into other components,
since they would have already been visited earlier
in the search.
StrongComp(G)
1. Run DFS(G) to compute the finishing time f[u] of each vertex u
2. Compute the transpose Gᵀ (reverse all edges)
3. Run DFS(Gᵀ), considering vertices in order of decreasing f[u]
4. Output the vertices of each tree in the depth-first forest of step 3 as a separate strongly connected component
COMP3111/3811 Algorithms 2
Lecture 8-1
Generic Approaches
Two greedy algorithms for computing MSTs:
Kruskal's Algorithm
Prim's Algorithm
Generic-MST(G, w)
1. A ← ∅                                     // A trivially satisfies the invariant
2. while A does not form a spanning tree
3.     do find an edge (u,v) that is safe for A
4.        A ← A ∪ {(u,v)}
5. return A                                  // A is now an MST
Definitions
A cut (S, V-S) is just a partition of the vertices
into 2 disjoint subsets.
An edge (u, v) crosses the cut if one endpoint is
in S and the other is in V-S.
Given a subset of edges A, we say that a cut
respects A if no edge in A crosses the cut.
An edge of E is a light edge crossing a cut, if
among all edges crossing the cut, it has the
minimum weight (the light edge may not be
unique if there are duplicate edge weights).
Theorem:
Let G = (V, E) be a connected, undirected graph with real-valued weights on the edges.
Let A be a viable subset of E (i.e. a subset of some MST), let (S, V-S) be any cut that respects A, and let (u,v) be a light edge crossing this cut.
Then the edge (u,v) is safe for A.
Proof:
Show that A ∪ {(u,v)} is a subset of some MST:
1. Find an arbitrary MST T containing A.
2. Use a cut-and-paste technique to find another MST T′ that contains A ∪ {(u,v)}.
Now we show:
T′ is a minimum spanning tree, and
A ∪ {(u,v)} is a subset of T′.
T′ is an MST: We have
w(T′) = w(T) - w(x,y) + w(u,v)
Since (u,v) is a light edge crossing the cut, we have w(u,v) ≤ w(x,y). Thus w(T′) ≤ w(T).
So T′ is also a minimum spanning tree.
A ∪ {(u,v)} ⊆ T′: Remember that (x,y) is not in A.
Thus A ⊆ T - {(x,y)}, and thus
A ∪ {(u,v)} ⊆ T - {(x,y)} ∪ {(u,v)} = T′
2. Kruskal's Algorithm
Attempts to add edges to A in increasing order
of weight (lightest edge first)
If the next edge does not induce a cycle
among the current set of edges, then it is
added to A.
If it does, then this edge is passed over, and
we consider the next edge in order.
As this algorithm runs, the edges of A will
induce a forest on the vertices and the trees
of this forest are merged together until we
have a single tree containing all vertices.
Detecting a Cycle
We can perform a DFS on subgraph induced by
the edges of A, but this takes too much time.
Use disjoint set UNION-FIND data structure.
This data structure supports 3 operations:
Create-Set(u): create a set containing u.
Find-Set(u): Find the set that contains u.
Union(u, v): Merge the sets containing u and v.
Each can be performed in O(lg n) time.
The vertices of the graph will be elements to be
stored in the sets; the sets will be vertices in
each tree of A (stored as a simple list of edges).
MST-Kruskal(G, w)
1. A ← ∅                                              // initially A is empty
2. for each vertex v ∈ V[G]                           // O(V) time
3.     do Create-Set(v)                               // create a set for each vertex
4. sort the edges of E by nondecreasing weight w      // O(E lg E)
5. for each edge (u,v) ∈ E, in order by nondecreasing weight   // O(E lg E)
6.     do if Find-Set(u) ≠ Find-Set(v)                // u & v are in different trees
7.         then A ← A ∪ {(u,v)}
8.              Union(u,v)
9. return A
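A Python sketch of MST-Kruskal with a minimal union-find (path compression only; the rank heuristic is omitted here for brevity):

def kruskal(n, edges):                     # edges: (weight, u, v) triples
    parent = list(range(n))
    def find(x):                           # Find-Set with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    A = []
    for w, u, v in sorted(edges):          # nondecreasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                       # u & v are in different trees
            A.append((u, v, w))
            parent[ru] = rv                # Union
    return A

edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
print(kruskal(4, edges))  # -> [(0, 1, 1), (1, 3, 2), (1, 2, 3)]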
3. Prim's Algorithm
Consider the set of vertices S currently part of the tree, and its complement (V-S).
We have a cut of the graph, and the current set of tree edges A is respected by this cut.
Which edge should we add next? A light edge!
Implementation issues:
How to update the cut efficiently?
How to determine the light edge quickly?
All the vertices that are not yet in S (i.e. not yet endpoints of edges in A) reside in a priority queue Q based on a key field. When the algorithm terminates, Q is empty and A = {(v, π[v]) : v ∈ V - {r}}.
MST-Prim(G, w, r)
1.  Q ← V[G]
2.  for each vertex u ∈ Q                  // initialization: O(V) time
3.      do key[u] ← ∞
4.  key[r] ← 0                             // start at the root
5.  π[r] ← NIL                             // set parent of r to be NIL
6.  while Q ≠ ∅                            // until all vertices are in the MST
7.      do u ← Extract-Min(Q)              // vertex with the lightest edge
8.         for each v ∈ Adj[u]
9.             do if v ∈ Q and w(u,v) < key[v]
10.                then π[v] ← u
11.                     key[v] ← w(u,v)    // new lighter edge out of v
12.                     Decrease-Key(Q, v, key[v])
Analysis of Prim
With a binary heap: building the queue takes O(V); there are |V| Extract-Min operations at O(lg V) each and at most |E| Decrease-Key operations at O(lg V) each, for a total of O(E lg V).
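A Python sketch of Prim's algorithm using a "lazy" heap (heapq has no Decrease-Key, so stale entries are re-inserted and skipped on extraction; the asymptotic bound is still O(E log V)):

import heapq

def prim(adj, r):                      # adj: {u: [(v, w), ...]}, undirected
    in_S, A = {r}, []
    heap = [(w, r, v) for v, w in adj[r]]
    heapq.heapify(heap)
    while heap and len(in_S) < len(adj):
        w, u, v = heapq.heappop(heap)  # lightest edge leaving S
        if v in in_S:
            continue                   # stale entry: v already in the tree
        in_S.add(v)
        A.append((u, v, w))
        for x, wx in adj[v]:
            if x not in in_S:
                heapq.heappush(heap, (wx, v, x))
    return A

adj = {0: [(1, 1), (2, 4)], 1: [(0, 1), (2, 3), (3, 2)],
       2: [(0, 4), (1, 3), (3, 5)], 3: [(1, 2), (2, 5)]}
print(prim(adj, 0))  # -> [(0, 1, 1), (1, 3, 2), (1, 2, 3)]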
COMP3111/3811 Algorithms 2
Lecture 8-2
Shortest Path
(Chapter 24)
Variants
Single-destination shortest-paths problem:
Find a shortest path to a given destination
vertex t from every vertex v V.
Single-pair shortest-path problem: Find a
shortest path from u to v for given vertices u
and v.
Optimal substructure
A shortest path contains other shortest paths within it.
Greedy approach: Dijkstra's algorithm
Dynamic programming: Floyd-Warshall algorithm
[Lemma] Subpaths of shortest paths are shortest paths.
(Figure: a shortest path v₁ … vᵢ … vⱼ … v_k; the subpath from vᵢ to vⱼ is itself a shortest path.)
Triangle inequality
δ(s,v) ≤ δ(s,u) + w(u,v) for every edge (u,v) ∈ E.
Well-definedness
If G contains a negative-weight cycle reachable from s, shortest-path weights are not well defined.
(Figure: a cycle of total weight < 0 between u and v.)
Cycles
Can a shortest path contain a cycle?
Negative-weight cycle: shortest-path weights are undefined.
Positive-weight cycle: removing the cycle yields a shorter path.
0-weight cycle: can be removed without changing the weight.
WLOG, we can assume that we find a cycle-free shortest path:
at most |V| - 1 edges.
Relaxation
For each v, maintain d[v], the shortest-path estimate: an upper bound on the weight of the shortest path from s to v.
This value is always greater than or equal to the true shortest-path distance from s to v.
Initially, d[v] = ∞ for all v and d[s] = 0.
As the algorithm goes on and sees more vertices, it tries to update d[v] for each vertex in the graph, until all d[v] values converge to the true shortest distances.
Relaxing an edge (u,v): testing whether we can improve the shortest path to v found so far by going through u (if yes, then update d[v]).
Initialize-Single-Source: O(V) time.
Relax(u, v, w)
1. if d[v] > d[u] + w(u,v)          // is the path through u shorter?
2.     then d[v] ← d[u] + w(u,v)    // yes, then take it
3.          π[v] ← u                // the shortest way back to the source is through u
                                    // (update the predecessor pointer)
NOTE: if we perform Relax(u, v, w) repeatedly over all edges of the graph, all the d[v] values will eventually converge to the true final distance values from s.
How to do this most efficiently? (How many times? In what order?)
2. Bellman-Ford Algorithm
Simple.
Allows negative edge weights.
Can test whether a negative-weight cycle is reachable from the source.
Relaxes each edge many times: progressively decreases the estimate d[v] until it achieves the actual shortest-path weight δ(s,v).
O(VE) time.
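A Python sketch of Bellman-Ford: |V| - 1 rounds of relaxing every edge, then one more pass to detect a reachable negative-weight cycle (the example graph is illustrative):

def bellman_ford(vertices, edges, s):   # edges: [(u, v, w), ...]
    INF = float("inf")
    d = {v: INF for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):
        for u, v, w in edges:            # Relax(u, v, w)
            if d[u] + w < d[v]:
                d[v] = d[u] + w
    for u, v, w in edges:                # any further improvement => cycle
        if d[u] + w < d[v]:
            return None                  # negative-weight cycle reachable from s
    return d

edges = [("s", "a", 4), ("s", "b", 2), ("b", "a", -1), ("a", "t", 1), ("b", "t", 5)]
print(sorted(bellman_ford({"s", "a", "b", "t"}, edges, "s").items()))
# -> [('a', 1), ('b', 2), ('s', 0), ('t', 2)]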
Application
Determine critical paths in PERT (program evaluation and review technique) chart analysis.
Edge: a job to be performed.
Edge weight: the time required to perform the job.
A path: a sequence of jobs that must be performed in a particular order.
Critical path: a longest path through the DAG.
COMP3111/3811 Algorithms 2
We can find a critical path by either:
negating the edge weights and running DAG-SHORTEST-PATHS, or
running DAG-SHORTEST-PATHS with two modifications:
in Initialize-Single-Source, replace ∞ by -∞;
in Relax, replace > by <.
Lecture 9- 1/9-2
4. Dijkstra's Algorithm
Non-negative edge weights.
Maintain a subset of vertices S ⊆ V for which we know the true distance d[u] = δ(s,u).
Initially S = ∅; set d[s] = 0 and all other estimates to ∞.
Greedy approach: repeatedly extract the vertex u ∉ S with the smallest estimate d[u], add it to S, and relax all edges leaving u.
With a binary heap: Extract-Min O(log V), Decrease-Key O(log V); total O(E log V).
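A Python sketch of Dijkstra's algorithm with a lazy binary heap (heapq has no Decrease-Key, so stale entries are skipped on extraction):

import heapq

def dijkstra(adj, s):                   # adj: {u: [(v, w), ...]}, weights >= 0
    d = {s: 0}
    heap = [(0, s)]
    S = set()
    while heap:
        du, u = heapq.heappop(heap)
        if u in S:
            continue                    # stale entry
        S.add(u)                        # d[u] is now the true distance
        for v, w in adj[u]:
            if du + w < d.get(v, float("inf")):
                d[v] = du + w           # relax edge (u, v)
                heapq.heappush(heap, (d[v], v))
    return d

adj = {"a": [("b", 3), ("c", 7)], "b": [("c", 2)], "c": []}
print(dijkstra(adj, "a"))  # -> {'a': 0, 'b': 3, 'c': 5}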
(Figure: step-by-step execution of Dijkstra's algorithm on a small example graph with edge weights 7, 3, 2, 10, 2, …; black vertices are those already in S.)
Correctness
Proof (I)
Just prior to the insertion of u, consider the true shortest path from s to u.
Because s ∈ S and u ∈ V - S, at some point this path must first jump out of S.
Let (x, y) be the edge where it jumps out, so that x ∈ S and y ∈ V - S.
(It might happen that x = s and/or y = u.)
Proof (II): y ≠ u
We argue that y ≠ u after all.
Since x ∈ S, we have d[x] = δ(s,x). (Remember that u was the first vertex added to S that violated this criterion.)
Since we applied relaxation to the edges leaving x when x was added, we would have set d[y] = d[x] + w(x,y) = δ(s,y).
Since (x,y) is on a shortest path from s to u, it is on a shortest path from s to y. Thus d[y] is now correct.
By hypothesis, d[u] is not correct, so u and y cannot be the same.
Conclusion of Proof
By the argument above, d[u] = δ(s,u) whenever u is added to S, so Dijkstra's algorithm is correct.

All-Pairs Shortest Paths
1. Definitions
L(1) = W (the weight matrix).
L(m): shortest-path weights using at most m edges; with no negative-weight cycles, L(n-1) holds the actual shortest-path weights.
Extending L(m) to L(m+1) is structured like matrix multiplication: O(n³) per extension, so the straightforward method takes O(n⁴).
Repeated squaring: L(2^⌈lg(n-1)⌉) = L(n-1), since 2^⌈lg(n-1)⌉ ≥ n-1; this gives O(n³ lg n) overall.
COMP3111/3811 Algorithms 2
Lecture 10- 1
All-pairs shortest paths, continued: the Floyd-Warshall algorithm, a dynamic-programming approach running in O(n³) time.
COMP3111/3811 Algorithms 2
Lecture 10- 2
Maximum Flow
(Chapter 26)
0. Motivation
Flow network: a directed graph in which material flows from a source to a sink, subject to edge capacities.
(Figure: example flow networks.)
1. Flow Networks
Flow network G = (V, E): a directed graph in which
each edge (u,v) ∈ E has a nonnegative capacity c(u,v) ≥ 0;
if (u,v) ∉ E, we assume c(u,v) = 0;
two vertices are distinguished: a source s and a sink t.
Assume G is connected: every vertex v lies on a path from s to t.
Flow: given a flow network G, a flow f is a real-valued function f: V×V → R that satisfies the following properties:
Capacity constraint: for all u, v ∈ V, f(u,v) ≤ c(u,v)
Skew symmetry: for all u, v ∈ V, f(u,v) = -f(v,u)
Flow conservation: for all u ∈ V - {s,t}, Σ_{v∈V} f(u,v) = 0
Cancellation
(Figure: flows on antiparallel edges cancel; e.g. flows 2/3 and 1/2 are equivalent to 1/3 and 0/2.)
2. Ford-Fulkerson method
Three key ideas: residual networks, augmenting paths, and cuts.
Value of a flow (using parts 1-3 of the summation lemma):
|f| = f(s, V)
    = f(V, V) - f(V-s, V)   (part 3)
    = -f(V-s, V)            (part 1)
    = f(V, V-s)             (part 2)
Residual networks
Residual capacity: cf(u,v) = c(u,v) - f(u,v)
Residual network: Gf = (V, Ef) with Ef = {(u,v) ∈ V×V : cf(u,v) > 0}
Gf is itself a flow network, with capacities cf.
[Lemma] G = (V,E): a flow network, f: a flow in G, Gf: the residual network of G induced by f, and f′: a flow in Gf.
Then the flow sum f + f′ is a flow in G with value |f + f′| = |f| + |f′|.
* Here f + f′ is the function V×V → R defined by (f + f′)(u,v) = f(u,v) + f′(u,v).
Augmenting paths
Augmenting path p: a simple path from s to t in Gf.
Residual capacity of p: cf(p) = min{cf(u,v) : (u,v) on p}
[Lemma] G = (V,E): a flow network, f: a flow in G, p: an augmenting path in Gf. Define fp: V×V → R by
fp(u,v) = cf(p)     if (u,v) is on p
        = -cf(p)    if (v,u) is on p
        = 0         otherwise
Then fp is a flow in Gf with |fp| = cf(p) > 0.
[Corollary] Define f′: V×V → R by f′ = f + fp. Then f′ is a flow in G with |f′| = |f| + |fp| > |f|.
[Lemma]
f: a flow of a flow network G; (S,T): a cut of G.
Then the net flow across (S,T) is f(S,T) = |f|.
<pf>
f(S,T) = f(S,V) - f(S,S)        (part 3)
       = f(S,V)                 (part 1)
       = f(s,V) + f(S-s,V)      (part 3)
       = f(s,V)                 (flow conservation)
       = |f|
Example: capacity c(S,T) = 12 + 14 = 26; net flow f(S,T) = 12 + 11 - 4 = 19.
[Corollary] The value of any flow is bounded by the capacity of any cut:
<pf> |f| = f(S,T) = Σ_{u∈S} Σ_{v∈T} f(u,v) ≤ Σ_{u∈S} Σ_{v∈T} c(u,v) = c(S,T)
Ford-Fulkerson algorithm
[Max-flow min-cut theorem] The following are equivalent:
(1) f is a maximum flow in G;
(2) the residual network Gf contains no augmenting path;
(3) |f| = c(S,T) for some cut (S,T) of G.
<pf> (2) ⇒ (3):
Suppose that Gf has no augmenting path.
Define S = {v ∈ V : there is a path from s to v in Gf} and T = V - S.
(S,T) is a cut.
For each pair of vertices u ∈ S and v ∈ T, f(u,v) = c(u,v)
(otherwise (u,v) would be in Gf, and then v would be in S).
By the lemma, |f| = f(S,T) = c(S,T).
[Lemma]
G = (V, E): a bipartite graph with vertex partition V = L ∪ R.
G′ = (V′, E′): its corresponding flow network (source s, sink t, unit capacities).
If M is a matching in G, then there is an integer-valued flow f in G′ with value |f| = |M|.
Conversely, if f is an integer-valued flow in G′, then there is a matching M in G with cardinality |M| = |f|.
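A Python sketch of the Ford-Fulkerson method with BFS augmenting paths (the Edmonds-Karp variant), applied to the bipartite-matching construction above. It assumes unit capacities, so each augmentation carries exactly one unit; the tiny bipartite instance is illustrative:

from collections import deque

def max_flow(cap, s, t):                       # cap: {u: {v: capacity}}
    flow = 0
    while True:
        parent, Q = {s: None}, deque([s])
        while Q and t not in parent:           # BFS in the residual network
            u = Q.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    Q.append(v)
        if t not in parent:
            return flow                        # no augmenting path left
        v = t
        while parent[v] is not None:           # augment by 1 unit
            u = parent[v]
            cap[u][v] -= 1
            cap[v].setdefault(u, 0)
            cap[v][u] += 1                     # residual (reverse) edge
            v = u
        flow += 1

# L = {l1, l2}, R = {r1, r2}; edges l1-r1, l1-r2, l2-r1
cap = {"s": {"l1": 1, "l2": 1}, "l1": {"r1": 1, "r2": 1}, "l2": {"r1": 1},
       "r1": {"t": 1}, "r2": {"t": 1}, "t": {}}
print(max_flow(cap, "s", "t"))  # -> 2 (a maximum matching of size 2)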
COMP3111/3811 Algorithms 2
Lecture 11- 1
Amortized analysis
Time required to perform a sequence of data structure
operations is averaged over all the operations
performed
Show that the average cost of an operation is small
No probability involved
Amortized analysis
1. Aggregate analysis
n operations take T(n) time in total.
Average cost of an operation: T(n)/n.
Imprecise: doesn't give a separate cost for each type of operation.
2. Accounting method
Charge each operation an (invented) amortized cost.
The amount not used is stored in a bank.
Later operations can use the stored work.
The balance must not go negative.
3. Potential method
Stored work (cf. the accounting method) viewed as potential energy.
The most flexible & powerful view.
1. Aggregate analysis (stack example)
A sequence of n operations takes T(n) worst-case time in total.
Stack operations: a sequence of n operations costs O(n), since the number of elements popped is at most the number of PUSH operations, which is O(n).
Average cost per operation: T(n)/n = O(1).
2. Accounting method (stack example): charge PUSH an amortized cost of 2 (1 pays for the push, 1 is banked to pay for the element's eventual pop) and charge POP 0; the balance never goes negative.
3. Potential method (stack example)
Take the potential Φ(D) = number of objects on the stack.
Amortized cost of MULTIPOP(k): ĉᵢ = cᵢ + Φ(Dᵢ) - Φ(Dᵢ₋₁) = k - k = 0.
Similarly, the amortized cost of POP is 0 (and of PUSH is 2), so any sequence of n operations costs O(n).
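A small Python experiment with this potential function (the operation sequence is illustrative); for each operation, the amortized cost cᵢ + ΔΦ is 2 for PUSH and 0 for POP, so the total amortized cost bounds the total actual cost:

stack, actual, amortized = [], 0, 0
ops = ["push"] * 5 + ["pop"] * 3
for op in ops:
    phi_before = len(stack)            # potential = stack size
    if op == "push":
        stack.append(0); cost = 1
    else:
        stack.pop(); cost = 1
    actual += cost
    amortized += cost + len(stack) - phi_before   # 2 for push, 0 for pop
print(actual, amortized)  # -> 8 10: total actual cost <= total amortized cost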
COMP3111/3811 Algorithms 2
Lecture 11- 2
1. Disjoint set
Disjoint-set data structure: maintain a collection S = {S₁, S₂, …, S_k} of disjoint dynamic sets.
Each set is identified by a representative.
Operations:
MAKE-SET(x): create a new set whose only member is x (x must not already be in another set)
UNION(x, y): unite the dynamic sets that
contain x and y, Sx and Sy, into a new set that
is the union of these two sets (destroy Sx, Sy)
FIND-SET(x): return a pointer to the
representative of the set containing x
Applications
MST: Kruskal's algorithm
Finding the connected components of an undirected graph
Representation
1. Linked list representation
2. Rooted tree representation: better time
complexity
Linked-list representation, worst case:
n MAKE-SET operations: O(n)
n-1 UNION operations: Σ_{i=1}^{n-1} i = O(n²)
m = 2n - 1 operations in total
Each operation: O(n) amortized time complexity
2. Weighted-union heuristic
Always append the shorter list to the longer one.
3. Disjoint-set forests
Faster implementation: represent sets by rooted trees.
Without heuristics, a sequence of n-1 UNIONs can create a linear chain of n nodes.
Two heuristics give almost-linear running time in the total number m of operations:
1. Union by rank
Similar to the weighted-union heuristic: make the root of the tree with fewer nodes point to the root of the tree with more nodes.
For each node we maintain a rank: an upper bound on the height of the node.
The root with smaller rank is made to point to the root with larger rank.
Heuristics
2. Path compression
Simple & effective; used during FIND-SET operations.
Make each node on the find path point directly to the root.
(Figure: FIND-SET(a) compressing the path from a to the root.)
Pseudocode
1. UNION by rank
For each node x, rank[x]: an upper bound on the height of x (the number of edges in the longest path between x and a descendant leaf).
MAKE-SET: initial rank = 0.
FIND-SET: ranks unchanged.
UNION:
roots with unequal rank: the root of higher rank becomes the parent of the root of lower rank;
roots with equal rank: arbitrarily choose one of the roots as the parent and increment its rank.
2. Path compression: see the sketch below.
Time complexity: with both heuristics, a sequence of m operations on n elements takes O(m·α(n)) time, where α(n) is the extremely slowly growing inverse of Ackermann's function.
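A Python sketch of the disjoint-set forest with union by rank and path compression, following the rules above:

class DisjointSets:
    def __init__(self):
        self.parent, self.rank = {}, {}
    def make_set(self, x):
        self.parent[x], self.rank[x] = x, 0       # initial rank = 0
    def find_set(self, x):
        if self.parent[x] != x:                   # path compression:
            self.parent[x] = self.find_set(self.parent[x])
        return self.parent[x]                     # x now points at the root
    def union(self, x, y):
        rx, ry = self.find_set(x), self.find_set(y)
        if rx == ry:
            return
        if self.rank[rx] < self.rank[ry]:         # smaller rank points to larger
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:        # equal ranks: bump the parent's
            self.rank[rx] += 1

ds = DisjointSets()
for v in "abcd":
    ds.make_set(v)
ds.union("a", "b"); ds.union("c", "d"); ds.union("a", "d")
print(ds.find_set("b") == ds.find_set("c"))  # -> True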
COMP3111/3811 Algorithms 2
Lecture 12- 1
Randomized Algorithms
COMP3111/3811 Algorithms 2
Lecture 12-2/13-1
NP-Completeness
Polynomial-Time Algorithms
Are some problems solvable in polynomial time?
Of course: every algorithm we've studied provides a polynomial-time solution to some problem.
We define P to be the class of problems solvable in polynomial time.
Are all problems solvable in polynomial time?
No: Turing's Halting Problem is not solvable by any computer, no matter how much time is given.
Such problems are clearly intractable, and not in P.
NP-Complete Problems
The NP-Complete problems are an interesting class of problems whose status is unknown:
computable, but …
no polynomial-time algorithm has been discovered for any NP-Complete problem,
but no superpolynomial lower bound has been proved for any NP-Complete problem either.
We call this the P = NP question: the biggest open problem in CS.
An NP-Complete Problem: Hamiltonian Cycles
A hamiltonian cycle of an undirected graph is a simple cycle that contains every vertex.
The hamiltonian-cycle problem: given a graph G, does it have a hamiltonian cycle?
Does the cube? The dodecahedron? A grid graph?
A naïve algorithm for solving the hamiltonian-cycle problem: try every ordering of the vertices. Running time? There are n! orderings to check.
Nondeterminism
Think of a nondeterministic computer as one that magically guesses a solution, then verifies it.
If a solution exists, the computer always guesses it.
One way to imagine it: a parallel computer that can freely spawn an infinite number of processes:
have one processor work on each possible solution;
all processors attempt to verify that their solution works;
if a processor succeeds, then the whole machine succeeds.
P and NP
P = the set of problems that can be solved
in polynomial time
NP = the set of problems that can be solved
in polynomial time by a nondeterministic
computer
Notes:
1. both P and NP are sets of problems,
not sets of algorithms
2. NP stands for nondeterministic
polynomial time
P and NP
Summary so far:
P = problems that can be solved in polynomial time
NP = problems for which a solution can be verified
in polynomial time
Unknown whether P = NP (most suspect not)
The hamiltonian-cycle problem is in NP:
no polynomial-time algorithm is known for solving it,
but a proposed solution is easy to verify in polynomial time.
NP-Complete Problems
NP-Complete problems are the hardest problems
in NP:
If any one NP-Complete problem can be solved
in polynomial time
then
every NP-Complete problem can be solved in
polynomial time
and in fact every problem in NP can be solved
in polynomial time (which would show P = NP)
The Scene
(Figure: the universe of all problems. Outside NP: the halting problem. Inside NP: the NP-complete problems, such as hamiltonian cycle and travelling salesperson, and, inside P, easy problems such as sorting and minimum spanning tree.)
Reduction
The crux of NP-Completeness is reducibility.
Informally, a problem A can be reduced to another problem B if any instance of A can be "easily rephrased" as an instance of B, the solution to which provides a solution to the instance of A.
What do you suppose "easily" means?
This rephrasing is called a transformation.
Intuitively: if A reduces to B, then A is no harder to solve than B.
Using Reductions
If A is polynomial-time reducible to B, we denote this by A ≤p B.
Definition of NP-Complete: A is NP-Complete if
A ∈ NP, and
all problems B in NP are reducible to A.
Formally: A is NP-Complete if A ∈ NP and, for all B ∈ NP, B ≤p A.
If A ≤p B and A is NP-Complete, then B is also NP-Complete.
This is the key idea that you should take away.
Reducibility
An example:
Problem A: Given a set of Booleans, is at least one TRUE?
Problem B: Given a set of integers, is their sum positive?
Transformation: (x₁, x₂, …, xₙ) → (y₁, y₂, …, yₙ) where yᵢ = 1 if xᵢ = TRUE, and yᵢ = 0 if xᵢ = FALSE.
Another example:
Solving linear equations is reducible to solving quadratic equations:
how can we easily use a quadratic-equation solver to solve linear equations? (Set the quadratic coefficient to 0.)
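The Boolean-to-integer transformation above, written out in Python (solve_B stands in for a hypothetical solver for problem B); the answers always agree, which is exactly what the reduction requires:

def transform(xs):                       # instance of A -> instance of B
    return [1 if x else 0 for x in xs]

def solve_B(ys):                         # hypothetical solver for problem B
    return sum(ys) > 0

for xs in ([False, False], [False, True], [True, True]):
    assert solve_B(transform(xs)) == any(xs)
print("A reduces to B: the answers always agree")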
Review: Tractability
Some problems are undecidable: no computer
can solve them
Turing's Halting Problem
Other problems are decidable, but intractable
as the input grows large, it seems that we are
unable to solve them in reasonable time
Traveling salesperson
Hamilton cycle
Other problems are easy
Sorting
Minimum spanning tree
An Aside: Terminology
What is the difference between a problem and an instance of that problem?
To formalize things, we will express instances of problems as strings.
How can we express an instance of the hamiltonian-cycle problem as a string?
To simplify things, we will worry only about decision problems with a yes/no answer.
Many problems are optimization problems, but we can often re-cast those as decision problems.
Proving NP-Completeness
What steps do we have to take to prove a problem P is NP-Complete?
Pick a known NP-Complete problem Q.
Reduce Q to P:
describe a transformation that maps instances of Q to instances of P, such that the answer for P is "yes" exactly when the answer for Q is "yes";
prove the transformation works;
prove it runs in polynomial time.
Prove P ∈ NP.