Jörg Endrullis
Vrije Universiteit Amsterdam
Department of Computer Science
Section Theoretical Computer Science
2009/2010
http://www.few.vu.nl/~tcs/ds/
The Lecture
Lectures:
lecture:
Jörg Endrullis
joerg@few.vu.nl
exercise classes:
Dennis Andriessen
dae400@few.vu.nl
Michel Degenhardt
ldt200@few.vu.nl
correctness
termination
efficiency
Evaluation of Algorithms
Important measures:
time complexity
compiler
easier to analyse
requires implementation
logarithmic (log₂ n), linear (n), . . . , exponential (aⁿ), . . .
Random Access Machine (RAM)
A Random Access Machine is a CPU connected to a memory:
primitive operations:
calling a method
array access
log n — logarithmic growth
n — linear growth
n·log n — (n log n)-growth
n² — quadratic growth
n³ — cubic growth
aⁿ — exponential growth (a > 1)
We recall the definition of logarithm:
  log_a b = c such that a^c = b
Speed of Growth, Pictorial
(plot of f(x) = log x, f(x) = x, f(x) = x·log x, f(x) = x², f(x) = eˣ against x)
Speed of Growth

  n        10       100      1000     10⁴       10⁵       10⁶
  log₂ n   3        7        10       13        17        20
  n        10       100      1000     10⁴       10⁵       10⁶
  n·log n  30       700      13000    10⁵       10⁶       10⁷
  n²       100      10000    10⁶      10⁸       10¹⁰      10¹²
  n³       1000     10⁶      10⁹      10¹²      10¹⁵      10¹⁸
  2ⁿ       1024     10³⁰     10³⁰⁰    10³⁰⁰⁰    10³⁰⁰⁰⁰   10³⁰⁰⁰⁰⁰
Note that log₂ n grows very slowly, whereas 2ⁿ grows explosively!
Time depending on Problem Size
Computation time assuming that 1 step takes 1 µs (0.000001 s).

  n        10        100     1000     10⁴      10⁵     10⁶
  log n    <         <       <        <        <       0.00002 s
  n        <         <       0.001 s  0.01 s   0.1 s   1 s
  n·log n  <         <       0.013 s  0.1 s    1 s     10 s
  n²       <         0.01 s  1 s      100 s    3 h     1000 h
  n³       0.001 s   1 s     1000 s   1000 h   100 y   10⁵ y
  2ⁿ       0.001 s   10²³ y  >        >        >       >

Here < means fast (< 0.001 s), and > more than 10³⁰⁰ years.
A problem for which there exist only exponential algorithms is
usually considered intractable.
Search in Arrays
We analyse the worst case complexity of search.
Algorithm search(A, n, x):
Input: An array storing n ≥ 1 integers.
Output: Returns true if A contains x and false, otherwise.
  for i = 0 to n − 1 do          1 + (n + 1) + n
    if A[i] == x then            1 + 1 (per iteration)
      return true                (not taken in the worst case)
  done
  return false                   1
Hence we have a worst case time complexity:
  T(n) = 1 + (n + 1) + n·4 + 1
       = 5·n + 3
Search in Arrays
We analyse the best case complexity of search.
Algorithm search(A, n, x):
Input: An array storing n ≥ 1 integers.
Output: Returns true if A contains x and false, otherwise.
  for i = 0 to n − 1 do          1 + 1
    if A[i] == x then            1 + 1
      return true                1
  done
  return false
Hence we have a best case time complexity:
  T(n) = 5
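The search pseudocode above can be rendered as a small Python sketch (the function name and signature follow the slides; everything else is an assumed rendering):

```python
def search(A, n, x):
    """Return True if the array A of n integers contains x, else False."""
    for i in range(n):        # worst case: all n elements are inspected
        if A[i] == x:
            return True       # best case: x sits at position 0
    return False
```

The worst case (x absent) touches every element, matching T(n) = 5·n + 3 up to constants; the best case stops after one comparison, matching T(n) = 5.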
Search in Sorted Arrays: Binary Search
Algorithm binSearch(A, n, x):
Input: An array storing n integers in ascending order.
Output: Returns true if A contains x and false, otherwise.
  low = 0
  high = n − 1
  while low ≤ high do
    mid = ⌊(low + high)/2⌋
    y = A[mid]
    if x < y then high = mid − 1
    if x == y then return true
    if x > y then low = mid + 1
  done
  return false
Example: A = 1 2 2 4 6 7 8 11 13 15 16, x = 8
(step 1: low = 0, high = 10, mid = 5, y = 7)
Search in Sorted Arrays: Binary Search
Algorithm binSearch(A, n, x):
Input: An array storing n integers in ascending order.
Output: Returns true if A contains x and false, otherwise.
  low = 0
  high = n − 1
  while low ≤ high do
    mid = ⌊(low + high)/2⌋
    y = A[mid]
    if x < y then high = mid − 1
    if x == y then return true
    if x > y then low = mid + 1
  done
  return false
Example: A = 1 2 2 4 6 7 8 11 13 15 16, x = 8
(step 2: low = 6, high = 10, mid = 8, y = 13)
Search in Sorted Arrays: Binary Search
Algorithm binSearch(A, n, x):
Input: An array storing n integers in ascending order.
Output: Returns true if A contains x and false, otherwise.
  low = 0
  high = n − 1
  while low ≤ high do
    mid = ⌊(low + high)/2⌋
    y = A[mid]
    if x < y then high = mid − 1
    if x == y then return true
    if x > y then low = mid + 1
  done
  return false
Example: A = 1 2 2 4 6 7 8 11 13 15 16, x = 8
(step 3: low = 6, high = 7, mid = 6, y = 8: found)
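A runnable version of binSearch (Python rendering assumed; the slides' ⌊(low + high)/2⌋ becomes integer division):

```python
def bin_search(A, n, x):
    """Return True if the sorted array A (length n) contains x."""
    low, high = 0, n - 1
    while low <= high:
        mid = (low + high) // 2    # floor of (low + high)/2
        y = A[mid]
        if x < y:
            high = mid - 1         # continue in the left half
        elif x > y:
            low = mid + 1          # continue in the right half
        else:
            return True
    return False

A = [1, 2, 2, 4, 6, 7, 8, 11, 13, 15, 16]
# searching x = 8 probes y = 7, 13, 8, exactly the three steps shown above
```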
Search in Sorted Arrays: Binary Search
We analyse the worst case complexity of binSearch.
  low = 0                  1
  high = n − 1             2
  while low ≤ high do      number of loops · (1 + . . .)
    . . .
  done
  return false             1
We analyse the maximal number of loop iterations L for minimal arrays A:
  L   A                    x
  1   1                    x = 2
  2   1 2                  x = 3
  3   1 2 3 4              x = 5
  4   1 2 3 4 5 6 7 8      x = 9
For an array A of length n we need (1 + log₂ n) loops in the worst case.
Search in Sorted Arrays: Binary Search
We analyse the worst case complexity of binSearch.
  low = 0                            1
  high = n − 1                       2
  while low ≤ high do                1 + (1 + log₂ n) · (1 +
    mid = ⌊(low + high)/2⌋             4
    y = A[mid]                         2
    if x < y then high = mid − 1       1 (false)
    if x == y then return true         1 (false)
    if x > y then low = mid + 1        3 (true) )
  done
  return false                       1
Hence we have a worst case time complexity:
  T(n) = 5 + (1 + log₂ n) · 12
       = 17 + 12 · log₂ n
Search in Sorted Arrays: Binary Search
We analyse the best case complexity of binSearch.
  low = 0                            1
  high = n − 1                       2
  while low ≤ high do                1 + (
    mid = ⌊(low + high)/2⌋             4
    y = A[mid]                         2
    if x < y then high = mid − 1       1 (false)
    if x == y then return true         2 (true)
  done
  return false
Hence we have a best case time complexity:
  T(n) = 13
Summary: Counting Primitive Operations
Assumption (Random Access Machine): every primitive operation
takes one unit of time, e.g. an algorithm performing
n² additions and n multiplications takes n² + n steps.
Let A be an algorithm, T(n) its number of primitive operations,
and T_C(n) the time complexity on a (real) computer C.
For every computer C there exist factors a, b > 0 such that:
  a·T(n) ≤ T_C(n) ≤ b·T(n)
Hence we have a (worst case) time complexity:
  T(n) ∈ O(1 + (n − 1)·1 + 1)
       = O(n)
Asymptotic estimation: prefixAverage
We analyse the complexity of prefixAverage.
Algorithm prefixAverage(A, n):
Input: An array storing n ≥ 1 integers.
Output: An array B of length n such that for all i < n:
  B[i] = (A[0] + A[1] + . . . + A[i]) / (i + 1)
  B = new array of length n      n
  for i = 0 to n − 1 do          n
    sum = 0                      n − 1
    for j = 0 to i do            (1 + 2 + . . . + n)
      sum = sum + A[j]
    done
    B[i] = sum/(i + 1)           n − 1
  done
  return B                       1
The (worst case) time complexity is: T(n) ∈ O(n²)
Asymptotic estimation: prefixAverage2
We analyse the complexity of prefixAverage2.
Algorithm prefixAverage2(A, n):
Input: An array storing n ≥ 1 integers.
Output: An array B of length n such that for all i < n:
  B[i] = (A[0] + A[1] + . . . + A[i]) / (i + 1)
  B = new array of length n      n
  sum = 0                        1
  for i = 0 to n − 1 do          n
    sum = sum + A[i]
    B[i] = sum/(i + 1)
  done
  return B                       1
The (worst case) time complexity is:
  T(n) ∈ O(n)
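Both versions side by side as a Python sketch (assumed rendering of the pseudocode above):

```python
def prefix_average(A, n):
    """Quadratic version: recompute each prefix sum from scratch."""
    B = [0.0] * n
    for i in range(n):
        s = 0
        for j in range(i + 1):     # 1 + 2 + ... + n inner steps in total
            s += A[j]
        B[i] = s / (i + 1)
    return B

def prefix_average2(A, n):
    """Linear version: carry the running sum along, one addition per element."""
    B = [0.0] * n
    s = 0
    for i in range(n):
        s += A[i]
        B[i] = s / (i + 1)
    return B
```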
Efficiency for Small Problem Size
Asymptotic complexity (big-O notation) speaks about big n:
The Stack ADT
data stored
main operations:
pop(): removes & returns the object on the top of the stack.
An error occurs (exception is thrown) if the stack is empty.
(example: stack 2 1 5; push(3) gives 2 1 5 3; pop() returns 5, leaving 2 1)
auxiliary operations:
main operations:
auxiliary operations:
file systems, databases
organization charts
Tree Terminology
Ancestors:
A is parent of C
B is grandparent of H
Descendants:
C is child of A
H is grandchild of B
Subtree:
Accessor operations:
Generic operations:
Query methods:
arithmetic expressions
decision processes
searching
(example tree: root A with children B and C; B has children D and E,
D has children H and I; C has children F and G)
Binary Trees and Arithmetic Expressions
Binary trees can represent arithmetic expressions:
(expression tree with operators at the internal nodes and
the operands 3, a, 2, b, 4 at the leaves)
(postorder traversal can be used to evaluate an expression)
Binary Decision Trees
Binary tree can represent a decision process:
n: number of nodes
e: number of leaves
i: number of internal nodes
h: height
Properties:
e = i + 1
n = 2·e − 1
h ≤ i
h ≤ (n − 1)/2
e ≤ 2^h
h ≥ log₂ e
h ≥ log₂(n + 1) − 1
Binary Tree ADT
Additional methods:
a node is visited after its left and before its right subtree.
Application: drawing binary trees
y(v) = depth of v
inOrder(v):
  if isInternal(v) then
    inOrder(leftChild(v))
  visit(v)
  if isInternal(v) then
    inOrder(rightChild(v))
(example tree with nodes A–I and inorder visit numbers 1–9)
Inorder Traversal: Printing Arithmetic Expressions
A specialization of the inorder traversal for printing expressions:
printExpression(v):
  if isInternal(v) then
    print("(")
    printExpression(leftChild(v))
  print(v.element())
  if isInternal(v) then
    printExpression(rightChild(v))
    print(")")
(expression tree with root + and operands 3, a, 2, b, 4)
Example: printExpression(v) of the above tree yields
((3 (a 2)) + (b/4))
Postorder Traversal: Evaluating Expressions
Postorder traversal for evaluating arithmetic expressions:
eval(v):
  if isExternal(v) then
    return v.element()
  else
    x = eval(leftChild(v))
    y = eval(rightChild(v))
    ◦ = operator stored at v
    return x ◦ y
(expression tree with root + and leaves 3, 4, 2, 8, 4)
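A Python sketch of eval. The tuple representation and the OPS table are assumptions (the slides use a tree ADT with isExternal, leftChild, rightChild), and the operators in the example tree are guessed, since only the leaves 3, 4, 2, 8, 4 survive in the extracted slides:

```python
import operator

# internal node: (op, left, right); external node: a plain number
OPS = {'+': operator.add, '-': operator.sub,
       '*': operator.mul, '/': operator.truediv}

def eval_tree(v):
    """Postorder evaluation: evaluate both subtrees, then apply the operator."""
    if not isinstance(v, tuple):       # external node: return its value
        return v
    op, left, right = v
    x = eval_tree(left)                # left subtree first
    y = eval_tree(right)               # then right subtree
    return OPS[op](x, y)               # operator stored at v last

t = ('+', ('*', 3, ('-', 4, 2)), ('/', 8, 4))   # (3 * (4 - 2)) + (8 / 4)
```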
Euler Tour Traversal
(Euler tour of an expression tree: each node is visited three times,
from the left (L), from below (B), and from the right (R))
Vector ADT
A Vector stores a list of elements:
Additional methods:
rank/index, or
element (data)
(diagram: nodes element 1 . . . element 4 linked from first to last)
Singly Linked List: size
If we store the size in a variable size, then:
size():
  return size
Running time: O(1).
If we do not store the size in a variable, then:
size():
  s = 0
  m = first
  while m != null do
    s = s + 1
    m = m.next
  done
  return s
Running time: O(n).
Singly Linked List: insertAfter
insertAfter(n, o):
  x = new node with element o
  x.next = n.next
  n.next = x
  if n == last then last = x
  size = size + 1
(diagram: list A, B, C; the new node x with element o is linked in after n)
Running time: O(1).
Singly Linked List: insertFirst
insertFirst(o):
  x = new node with element o
  x.next = first
  first = x
  if last == null then last = x
  size = size + 1
(diagram: the new node x with element o becomes the new first node before A, B, C)
Running time: O(1).
Singly Linked List: insertBefore
insertBefore(n, o):
  if n == first then
    insertFirst(o)
  else
    m = first
    while m.next != n do
      m = m.next
    done
    insertAfter(m, o)
(the error check m != null has been left out for simplicity)
We need to find the node m before n first; hence the running time is O(n).
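The three insertion operations can be combined into one Python sketch (class and method names are mine; the slides write first, last, size):

```python
class Node:
    def __init__(self, element):
        self.element = element
        self.next = None

class SinglyLinkedList:
    def __init__(self):
        self.first = None
        self.last = None
        self.size = 0

    def insert_first(self, o):            # O(1)
        x = Node(o)
        x.next = self.first
        self.first = x
        if self.last is None:             # list was empty
            self.last = x
        self.size += 1

    def insert_after(self, n, o):         # O(1)
        x = Node(o)
        x.next = n.next
        n.next = x
        if n is self.last:
            self.last = x
        self.size += 1

    def insert_before(self, n, o):        # O(n): must find the predecessor
        if n is self.first:
            self.insert_first(o)
        else:
            m = self.first
            while m.next is not n:        # error check m != null left out
                m = m.next
            self.insert_after(m, o)
```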
Stack with a Singly Linked List
We can implement a stack with a singly linked list:
push(o):
  insertFirst(o)
pop():
  o = first().element
  remove(first())
  return o
(the first node is the top of the stack)
Queue with a Singly Linked List
We can implement a queue with a singly linked list:
enqueue(o):
insertLast(o)
dequeue():
  o = first().element
  remove(first())
  return o
element (data)
N operations of cost 1, and 1 operation of cost N.
Amortized running time:
  (N·1 + N)/(N + 1) ≈ 2
That is: O(1) per operation.
(plot: time per operation over the sequence, with one spike of height N)
Amortization Techniques: Accounting Method
The accounting method uses a scheme of credits and debits
to keep track of the running time of operations in a sequence:
add(o) is O(1)
Total cost of the sequence: Σ_{i=1}^{k} i ∈ O(n²)
Average costs for each operation in the sequence: O(n).
Array-based Implementation: Doubling the Size
Each time the array is full we double the size:
size(), isEmpty()
Applications of Priority Queues
Direct applications:
reflexivity: x ≤ x
Comparator ADT
isEqual(x, y)
isComparable(x)
Sorting with Priority Queues
We can use a priority queue for sorting as follows:
Performance:
Performance:
insertItem is O(n)
for singly/doubly linked lists: O(n) for finding the insert position
Simple implementation.
Efficient for:
small lists
while the current node has a left child, go to the left child
Time complexity is O(log₂ n) since the heap height is O(log₂ n).
(we walk at most once completely up and down again)
Heaps: Removal of the Root
The removal of the root consists of 3 steps:
Replace the root key with the key of the last node w.
start from the old last node (which has been removed)
while the current node has a right child which is not a leaf,
go to the right child
Time complexity is O(log₂ n) since the heap height is O(log₂ n).
(we walk at most once completely up and down again)
removed
Heap-Sort
We implement a priority queue by means of a heap:
Leafs and links between the nodes are not stored explicitly.
Vector-based Heap Implementation, continued
We can represent a heap with n keys by a vector of size n + 1:
(heap with root 2, children 5 and 6, and leaves 9 and 7 below 5)
  index:  0  1  2  3  4  5
  key:    –  2  5  6  9  7
(the cell at index 0 is not used; the root is stored at index 1)
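Insertion into this vector representation can be sketched in Python (up_heap and insert are my names; index 0 stays unused as above):

```python
def up_heap(H, i):
    """Swap the key at index i upward while it is smaller than its parent."""
    while i > 1 and H[i] < H[i // 2]:      # parent of index i is i // 2
        H[i], H[i // 2] = H[i // 2], H[i]
        i //= 2

def insert(H, key):
    """Insert a key into the min-heap stored in H[1..n]."""
    H.append(key)                          # new last node
    up_heap(H, len(H) - 1)                 # walk up at most O(log n) levels

H = [None]                                 # cell 0 is not used
for k in [6, 9, 2, 7, 5]:
    insert(H, k)
# H is now [None, 2, 5, 6, 9, 7], the vector from the slides
```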
take 2^(h−1) elements and turn them into heaps of size 1
size(), isEmpty()
keys(), elements()
Log File
A log file is a dictionary implemented based on an unsorted list:
Performance:
New operations:
closestKeyBefore(k)
closestElementBefore(k)
closestKeyAfter(k)
closestElementAfter(k)
Binary Search
Binary search performs findElement(k) on sorted arrays:
Performance:
an element
B C
F
Data Structure for Binary Trees
A node is represented by an object storing:
an element
left child
right child
A
B C
D E
A
B C
D
E
Binary trees can also be represented by a Vector, see heaps.
Binary Search Tree
A binary search tree is a binary tree such that:
Searching compares the query key k with the key k′ at the current node:
if k == k′: the item is found
if k < k′: continue in the left subtree
if k > k′: continue in the right subtree
For the minimal number of nodes n(h) of an AVL tree of height h we know
  n(h) = 1 + n(h − 1) + n(h − 2)
       > 2·n(h − 2)   since n(h − 1) > n(h − 2)
       > 4·n(h − 4)
       > 8·n(h − 6)
  n(h) > 2^i · n(h − 2·i)   by induction
       ≥ 2^(h/2)
Thus h ≤ 2·log₂ n(h).
AVL Trees: Insertion
Insertion of a key k in AVL trees works in the following steps:
initial find is O(log n)
h(x) = x mod N
is a hash function for integer keys
hash function h
Problem: collisions
Keys k₁, k₂ with h(k₁) = h(k₂) are said to collide.
(hash table with cells 0, 1, 2, 3, . . . : (025-611-001, Mr. X) occupies one cell;
(987-067-002, Brad Pit) and (123-456-002, Dipsy) collide in another)
Different possibilities of handling collisions:
chaining,
linear probing,
double hashing, . . .
Collisions continued
Usual setting:
hash function
lots of collisions
Every hash value (cell in the hash table) has equal probability.
h(a) = 535, h(b) = 80, h(c) = 618, h(d) = 163,
h(ab) = 354, h(ba) = 979, . . .
At least at first glance they look random.
Hash Code Map and Compression Map
Hash function is usually specified as composition of:
hash code map: h₁: keys → integers
compression map: h₂: integers → [0, . . . , N − 1]
The hash code map is applied before the compression map:
  h(x) = h₂(h₁(x)) is the composed hash function
The compression map usually is of the form h₂(x) = x mod N:
Component sum:
Polynomial accumulation:
Mid-square method:
Random method:
reduced performance
Possible improvements/modifications:
Fixed increment c: hᵢ(k) = h(k) + c·i.
Changing directions: hᵢ(k) = h(k) + c·i·(−1)ⁱ.
Double hashing: hᵢ(k) = h(k) + i·h′(k).
Double Hashing
Double hashing uses a secondary hash function d(k):
N = 13
h(k) = k mod 13
d(k) = 7 − (k mod 7)
  k  | h(k) | d(k) | Probes
  18 |  5   |  3   | 5
  41 |  2   |  1   | 2
  22 |  9   |  6   | 9
  44 |  5   |  5   | 5, 10
  59 |  7   |  4   | 7
  32 |  6   |  3   | 6
  31 |  5   |  4   | 5, 9, 0
  73 |  8   |  4   | 8
Resulting table:
  cell: 0   1  2   3  4  5   6   7   8   9   10  11  12
  key:  31  –  41  –  –  18  32  59  73  22  44  –   –
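The probe sequences above can be reproduced with a short Python sketch (function names are mine; h and d are the slides' hash functions):

```python
N = 13

def probes(k):
    """Probe sequence for key k: h(k), h(k)+d(k), h(k)+2*d(k), ... (mod N)."""
    h = k % 13              # primary hash function
    d = 7 - (k % 7)         # secondary hash function, never 0
    return [(h + i * d) % N for i in range(N)]

def insert_all(keys):
    table = [None] * N
    for k in keys:
        for cell in probes(k):
            if table[cell] is None:    # take the first empty cell
                table[cell] = k
                break
    return table

table = insert_all([18, 41, 22, 44, 59, 32, 31, 73])
# key 31 probes cells 5, 9, 0, as in the table above
```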
Performance of Hashing
In worst case insertion, lookup and removal take O(n) time:
(plot: expected number of probes f(α) = 1/(1 − α) as a function of
the load factor α, rising sharply as α approaches 1)
Performance of Hashing
In worst case insertion, lookup and removal take O(n) time:
small databases
compilers
browser caches
Universal Hashing
No hash function is good in general:
there always exist keys that are mapped to the same value
Hence no single hash function h can be proven to be good.
However, we can consider a set of hash functions H.
(assume that keys are from the interval [0, M − 1])
We say that H is universal (good) if for all keys 0 ≤ i ≠ j < M:
  probability(h(i) = h(j)) ≤ 1/N
for h randomly selected from H.
Universal Hashing: Example
The following set of hash functions H is universal:
  H = { h(x) = ((a·x + b) mod p) mod N }
with p prime, p ≥ M, 0 < a < p, 0 ≤ b < p.
Proof idea: for i ≠ j let i′ = (a·i + b) mod p and j′ = (a·j + b) mod p.
Thus every pair (i′, j′) with i′ ≠ j′ is equally likely, and the
probability that i′ mod N = j′ mod N is at most 1/N.
Comparison AVL Trees vs. Hash Tables
Dictionary methods:
               search      insert      remove
  AVL Tree     O(log₂ n)   O(log₂ n)   O(log₂ n)
  Hash Table   O(1)¹       O(1)¹       O(1)¹
¹ expected running time of hash tables, worst-case is O(n).
Ordered dictionary methods:
               closestAfter   closestBefore
  AVL Tree     O(log₂ n)      O(log₂ n)
  Hash Table   O(n + N)       O(n + N)
Examples, when to use AVL trees instead of hash tables:
1. if you need to be sure about worst-case performance
2. if keys are imprecise (e.g. measurements),
e.g. find the closest key to 3.24: closestTo(3.72)
Sorting Algorithms
We have already seen:
Selection-sort
Insertion-sort
Heap-sort
We will see:
Bubble-sort
Merge-sort
Quick-sort
We will show that:
worst-case:
  (n − 1) + . . . + 1 = (n − 1)·n / 2 ∈ O(n²)
(caused by sorting an inversely sorted list)
best-case: O(n)
Bubble-sort is:
slow
in-place
Divide-and-Conquer
Divide-and-Conquer is a general algorithm design paradigm:
Conquer: merge S₁ and S₂ into a sorting of S
Algorithm mergeSort(S, C):
Input: a list S of n elements and a comparator C
Output: the list S sorted according to C
  if size(S) > 1 then
    (S₁, S₂) = partition S into sizes ⌊n/2⌋ and ⌈n/2⌉
    mergeSort(S₁, C)
    mergeSort(S₂, C)
    S = merge(S₁, S₂, C)
Merging two Sorted Sequences
Algorithm merge(A, B, C):
Input: sorted lists A, B
Output: sorted list containing the elements of A and B
  S = empty list
  while not A.isEmpty() and not B.isEmpty() do
    if A.first().element < B.first().element then
      S.insertLast(A.remove(A.first()))
    else
      S.insertLast(B.remove(B.first()))
  done
  while not A.isEmpty() do S.insertLast(A.remove(A.first()))
  while not B.isEmpty() do S.insertLast(B.remove(B.first()))
  return S
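The two algorithms as a Python sketch (list slicing replaces the destructive remove(first()) of the pseudocode):

```python
def merge(A, B):
    """Merge two sorted lists into one sorted list."""
    S = []
    i = j = 0
    while i < len(A) and j < len(B):
        if A[i] < B[j]:
            S.append(A[i]); i += 1
        else:
            S.append(B[j]); j += 1
    S.extend(A[i:])         # at most one of these two is non-empty
    S.extend(B[j:])
    return S

def merge_sort(S):
    if len(S) <= 1:
        return S
    mid = len(S) // 2       # partition into halves of size n//2 and n - n//2
    return merge(merge_sort(S[:mid]), merge_sort(S[mid:]))
```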
Performance:
  depth   nodes   size
  0       1       n
  1       2       n/2
  i       2^i     n/2^i
  . . .   . . .   . . .
Thus the worst-case running time is O(n·log₂ n).
Quick-Sort
Quick-sort of a list S with n elements works as follows:
In-place partitioning around a pivot p:
  l′ = l and r′ = r
  while l′ ≤ r′ do
    while l′ ≤ r′ and A[l′] ≤ p do l′ = l′ + 1   (find > p)
    while l′ ≤ r′ and A[r′] ≥ p do r′ = r′ − 1   (find < p)
    if l′ < r′ then swap(A[l′], A[r′])
  done
in-place, stable
Phase 1:
Phase 2:
no external comparator
Bucket-sort is a stable sorting algorithm.
Extensions:
kᵢ is called the i-th dimension of the tuple
Example: (2, 5, 1) as point in 3-dimensional space
The lexicographic order of d-tuples is recursively defined:
  (x₁, x₂, . . . , x_d) < (y₁, y₂, . . . , y_d)
  ⟺ x₁ < y₁ or (x₁ = y₁ and (x₂, . . . , x_d) < (y₂, . . . , y_d))
That is, the tuples are first compared by dimension 1, then 2, . . .
Lexicographic-Sort
Lexicographic-sort sorts a list of d-tuples in lexicographic order:
Let Cᵢ be the comparator comparing tuples by the i-th dimension.
43₁₀, that is, 43 in decimal system (base 10)
101011₂, that is, 43 in binary system (base 2)
1121₃, that is, 43 represented base 3
For every base b ≥ 2 and every number m there exist unique
digits 0 ≤ d₀, . . . , d_l < b such that:
  m = d_l·b^l + d_{l−1}·b^{l−1} + . . . + d₁·b¹ + d₀·b⁰
and if l > 0 then d_l ≠ 0.
Example
43 = 43₁₀    = 4·10¹ + 3·10⁰
   = 101011₂ = 1·2⁵ + 0·2⁴ + 1·2³ + 0·2² + 1·2¹ + 1·2⁰
   = 1121₃   = 1·3³ + 1·3² + 2·3¹ + 1·3⁰
Radix-Sort
Radix-sort is a specialization of lexicographic-sort:
Then use radix-sort to sort in O(2·n), that is, O(n) time.
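A sketch of radix-sort for two-digit decimal numbers, built on a stable bucket pass per digit (function names and the digit-extraction lambda are mine):

```python
def bucket_pass(items, digit_of, base):
    """One stable bucket-sort pass: distribute, then concatenate."""
    buckets = [[] for _ in range(base)]
    for x in items:
        buckets[digit_of(x)].append(x)    # append keeps equal keys in order
    return [x for b in buckets for x in b]

def radix_sort(nums, base=10, digits=2):
    """Least-significant digit first; one stable pass per digit."""
    items = list(nums)
    for d in range(digits):
        items = bucket_pass(items, lambda x: (x // base ** d) % base, base)
    return items
```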
The Selection Problem
The selection problem:
Search:
  T(n) ≤ 2·a·n + T((3/4)·n)
Expanding:
  T(n) ≤ 2·a·n + 2·a·(3/4)·n + 2·a·(3/4)²·n + . . .
       = 8·a·n ∈ O(n)   (geometric series)
Quick-Select: Median of Medians
We can do selection in O(n) worst-case time.
Idea: recursively use select itself to find a good pivot:
  T(n) = b   if n ≤ 1
  T(n) = T(0.7·n) + T(0.2·n) + 2·b·n   if n > 1
We will see how to solve recurrence equations. . .
Fundamental Techniques
We have considered algorithms for solving special problems.
Now we consider a few fundamental techniques:
Divide-and-Conquer
Dynamic Programming
Divide-and-Conquer
Divide-and-Conquer is a general algorithm design paradigm:
merge sort
quick sort
Master method.
Iterative Substitution
Iterative substitution technique works as follows:
hope to find a pattern
Example
T(n) = 1             if n ≤ 1
T(n) = T(n − 1) + 2  if n > 1
We start with T(n) and apply the recursive case:
T(n) = T(n − 1) + 2
     = T(n − 2) + 4
     = T(n − k) + 2·k
For k = n − 1 we reach the base case T(n − k) = 1, thus:
T(n) = 1 + 2·(n − 1) = 2·n − 1
Merge-Sort Review
Merge-sort of a list S with n elements works as follows:
Conquer: merge S
1
and S
2
into a sorting of S
Let b ∈ N such that:
T(n) = b              if n ≤ 1
T(n) = 2·T(n/2) + b·n if n > 1
We search a closed solution for the equation.
We assume that n is a power of 2: n = 2^k (that is, k = log₂ n)
T(n) = 2·T(n/2) + b·n
     = 2·(2·T(n/2²) + b·(n/2)) + b·n
     = 2²·T(n/2²) + 2·b·n
     = 2³·T(n/2³) + 3·b·n
     = 2^k·T(n/2^k) + k·b·n
     = n·b + (log₂ n)·b·n
Thus T(n) = b·(n + n·log₂ n) ∈ O(n·log₂ n).
The Recursion Tree Method
The recursion tree method is a visual approach:
T(n) = b              if n ≤ 2
T(n) = 3·T(n/3) + b·n if n > 2
For a node with input size k: work at this node is b·k.
  depth   nodes   size
  0       1       n
  1       3       n/3
  i       3^i     n/3^i
Thus the work at depth i is 3^i · b·n/3^i = b·n.
The height of the tree is log₃ n. Thus T(n) is O(n·log₃ n).
Guess-and-Test Method
The guess-and-test method works as follows:
T(n) = 1             if n ≤ 1
T(n) = T(n/2) + 2·n  if n > 1
Guess: T(n) ≤ 2·n. Test: T(n) ≤ 2·(n/2) + 2·n = 3·n, which is not ≤ 2·n,
so the guess fails.
New guess: T(n) ≤ 4·n. Test: T(n) ≤ 4·(n/2) + 2·n = 4·n, so the guess holds.
Another example:
T(n) = b                            if n ≤ 1
T(n) = T(0.7·n) + T(0.2·n) + 2·b·n  if n > 1
Guess: T(n) ≤ 20·b·n. Test: T(n) ≤ 14·b·n + 4·b·n + 2·b·n = 20·b·n,
so the guess holds.
The Master Method
T(n) = c               if n < d
T(n) = a·T(n/b) + f(n) if n ≥ d
Theorem (The Master Theorem)
1. if f(n) is O(n^(log_b a − ε)), then T(n) is Θ(n^(log_b a))
2. if f(n) is Θ(n^(log_b a)·log^k n), then T(n) is Θ(n^(log_b a)·log^(k+1) n)
3. if f(n) is Ω(n^(log_b a + ε)), then T(n) is Θ(f(n)), provided
   a·f(n/b) ≤ δ·f(n) for some δ < 1.
Master Method: Example 1
Example
T(n) = 4·T(n/2) + n
Solution: log_b a = 2, thus case 1 says T(n) is Θ(n²).
Master Method: Example 2
Example
T(n) = 2·T(n/2) + n·log n
Solution: log_b a = 1, thus case 2 says T(n) is Θ(n·log² n).
Master Method: Example 3
Example
T(n) = T(n/3) + n·log n
Solution: log_b a = 0, thus case 3 says T(n) is Θ(n·log n).
Master Method: Example 4
Example
T(n) = 8·T(n/2) + n²
Solution: log_b a = 3, thus case 1 says T(n) is Θ(n³).
Master Method: Example 5
Example
T(n) = 9·T(n/3) + n³
Solution: log_b a = 2, thus case 3 says T(n) is Θ(n³).
Master Method: Example 6
Example
T(n) = T(n/2) + 1   (binary search)
Solution: log_b a = 0, thus case 2 says T(n) is Θ(log n).
Master Method: Example 7
Example
T(n) = 2·T(n/2) + log n   (heap construction)
Solution: log_b a = 1, thus case 1 says T(n) is Θ(n).
Integer Multiplication
Algorithm to multiply two n-bit integers A and B:
Split A and B into high and low halves of n/2 bits:
  A = A_h·2^(n/2) + A_l        B = B_h·2^(n/2) + B_l
Then:
  A·B = (A_h·2^(n/2) + A_l)·(B_h·2^(n/2) + B_l)
      = A_h·B_h·2^n + A_h·B_l·2^(n/2) + A_l·B_h·2^(n/2) + A_l·B_l
Karatsuba's trick replaces the two middle products by one:
  A·B = A_h·B_h·2^n
      + [(A_h − A_l)·(B_l − B_h) + A_h·B_h + A_l·B_l]·2^(n/2)
      + A_l·B_l
(compute A_h·B_h and A_l·B_l only once)
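A Python sketch of the three-multiplication scheme. The base case threshold and the sign bookkeeping for (A_h − A_l)·(B_l − B_h) are my additions; the identity is the one derived above:

```python
def karatsuba(A, B):
    """Multiply nonnegative integers with three recursive multiplications."""
    if A < 16 or B < 16:                       # small inputs: multiply directly
        return A * B
    half = max(A.bit_length(), B.bit_length()) // 2
    Ah, Al = A >> half, A & ((1 << half) - 1)  # A = Ah * 2^half + Al
    Bh, Bl = B >> half, B & ((1 << half) - 1)  # B = Bh * 2^half + Bl
    hh = karatsuba(Ah, Bh)                     # computed only once
    ll = karatsuba(Al, Bl)                     # computed only once
    # (Ah - Al) * (Bl - Bh) may be negative: track the sign separately
    sign = (1 if Ah >= Al else -1) * (1 if Bl >= Bh else -1)
    m = karatsuba(abs(Ah - Al), abs(Bl - Bh))
    return (hh << (2 * half)) + ((sign * m + hh + ll) << half) + ll
```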
The Greedy Method
An optimization problem consists of
b_i = a positive benefit
w_i = a positive weight
Fractional Knapsack Problem
choose fractions x_i ≤ w_i of items with maximal total benefit:
  Σ_{i ∈ S} b_i·(x_i / w_i)   (objective, to maximize)
  Σ_{i ∈ S} x_i ≤ W           (constraint)
Example
You found a treasure with:
Algorithm fractionalKnapsack(S, b, w, W):
  for each i ∈ S do
    x_i = 0
    v_i = b_i / w_i
  sort S such that elements are descending w.r.t. v_i
  while W > 0 do
    i = S.remove(S.first())
    x_i = min(w_i, W)
    W = W − x_i
  done
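The greedy algorithm as a Python sketch; items are (benefit, weight) pairs and the value index v_i = b_i / w_i drives the order (this representation is assumed):

```python
def fractional_knapsack(items, W):
    """items: list of (benefit, weight). Return the maximal total benefit."""
    # sort descending by value index v_i = b_i / w_i
    items = sorted(items, key=lambda bw: bw[0] / bw[1], reverse=True)
    total = 0.0
    for b, w in items:
        if W <= 0:
            break
        take = min(w, W)         # whole item, or the fraction that still fits
        total += b * take / w
        W -= take
    return total
```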
The Fractional Knapsack Algorithm
Correctness: suppose there is a better solution A.
a start time s_i
a finish time f_i (where s_i < f_i)
Goal: perform all tasks using a minimal number of machines
(Gantt-style diagram: tasks placed on Machines 1–3 over the time axis 1–9)
Task Scheduling Algorithm
Greedy choice: process the tasks in order of their start times.
Algorithm taskSchedule(T, s, f):
  m = 0   (number of machines)
  sort T such that elements are ascending w.r.t. s_i
  while not T.isEmpty() do
    i = T.remove(T.first())
    if a machine j ≤ m has time for task i then
      schedule i on machine j
    else
      m = m + 1
      schedule i on machine m
  done
Example
Given: a set T of tasks:
[1, 4], [1, 3], [2, 5], [3, 7], [4, 7], [6, 9], [7, 8]
(ordered by starting time)
(diagram: the seven tasks placed on Machines 1–3 over the time axis 1–9)
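The greedy schedule can be checked with a Python sketch (machines are represented by the finish time of their last task, an assumption of this rendering):

```python
def schedule(tasks):
    """tasks: list of (start, finish). Return the number of machines used."""
    machines = []                      # finish time of the last task per machine
    for s, f in sorted(tasks):         # ascending w.r.t. start time
        for j, free_at in enumerate(machines):
            if free_at <= s:           # machine j has time for this task
                machines[j] = f
                break
        else:
            machines.append(f)         # no machine is free: open a new one
    return len(machines)

tasks = [(1, 4), (1, 3), (2, 5), (3, 7), (4, 7), (6, 9), (7, 8)]
# schedule(tasks) uses 3 machines, matching the example
```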
Task Scheduling Algorithm
Correctness: suppose there is a better schedule.
Assume:
. . .
Dynamic Programming
Dynamic programming is an algorithm design paradigm:
b_i = a positive benefit
w_i = a positive weight
0/1 Knapsack Problem
  Σ_{i ∈ T} b_i       (objective, to maximize)
  Σ_{i ∈ T} w_i ≤ W   (constraint)
Thus now each item is either accepted or rejected entirely.
Example
You found a treasure with:
let S_k be the set of elements 1, . . . , k
  B[k, w] = B[k − 1, w]                                  if w_k > w
  B[k, w] = max(B[k − 1, w], B[k − 1, w − w_k] + b_k)    otherwise
That is, the best subset of S_k with weight w is either:
the best subset of S_{k−1} with weight w, or
the best subset of S_{k−1} with weight w − w_k plus item k.
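The B[k, w] recurrence collapses to a one-dimensional table when capacities are scanned downward; a Python sketch (the (benefit, weight) representation is assumed):

```python
def knapsack01(items, W):
    """items: list of (benefit, weight). Return the best total benefit."""
    B = [0] * (W + 1)                      # B[cap]: best benefit with capacity cap
    for b, w in items:
        for cap in range(W, w - 1, -1):    # downward: each item used at most once
            B[cap] = max(B[cap], B[cap - w] + b)
    return B[W]
```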
  C[r, c] = Σ_{i=0}^{e−1} A[r, i]·B[i, c]
Computing A·B for a d × e matrix A and an e × f matrix B
takes O(d·e·f) time.
(we multiply every row of A with every column of B)
Matrix Chain-Products
Matrix chain-product:
Compute A₀·A₁ · · · A_{n−1}.
Let A_i have dimension d_i × d_{i+1}.
Problem: how to parenthesize?
Example:
C is 100 × 5
D is 5 × 5
Then
A is 2 × 1
B is 1 × 2
C is 2 × 3
Then:
A·B takes 4 operations
B·C takes 6 operations
Thus the greedy method picks (A·B)·C. However:
(A·B)·C takes 4 + 12 = 16 operations
A·(B·C) takes 6 + 6 = 12 operations
Thus the greedy method does not yield the optimal result.
Solution: Dynamic Programming
What are the subproblems?
best parenthesization of A_i·A_{i+1} · · · A_j.
Let N[i, j] be the number of operations for this subproblem.
The global optimum can be defined from optimal subproblems:
Recall that A_i has dimension d_i × d_{i+1}.
  N[i, j] = min_{i ≤ k < j} ( N[i, k] + N[k + 1, j] + d_i·d_{k+1}·d_{j+1} )
A₀ is 2 × 1, A₁ is 1 × 2, A₂ is 2 × 4, A₃ is 4 × 3.
We run the algorithm:
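Running the N[i, j] recurrence on the dimensions above, as a Python sketch (d packs the dimensions, so A_i is d[i] × d[i+1]):

```python
def matrix_chain(d):
    """Minimal number of scalar multiplications for A_0 * ... * A_{n-1}."""
    n = len(d) - 1                           # number of matrices
    N = [[0] * n for _ in range(n)]
    for length in range(1, n):               # chain length of the subproblem
        for i in range(n - length):
            j = i + length
            N[i][j] = min(N[i][k] + N[k + 1][j] + d[i] * d[k + 1] * d[j + 1]
                          for k in range(i, j))
    return N[0][n - 1]

# dimensions from the slides: A0 is 2x1, A1 is 1x2, A2 is 2x4, A3 is 4x3
```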
Nodes: U, X, Z
Electronic circuits.
Transportation networks:
flight network
Computer networks.
. . .
Graphs: Terminology
U is start-point of a
V is end-point of a
Edges:
a is outgoing edge of U
a is incoming edge of V
Degree of a vertex
(number of edges):
Self-loops:
j is a self-loop
(example graph with vertices U, V, W, X, Y, Z and edges a, b, c, d, e, f, g, h, i, j)
Graphs: Terminology (continued)
Path:
sequence of alternating
vertices and edges
e.g. U c W e X g Y f W d V
Cycle:
element
Edge stores:
element
element
key: distance
element: vertex
O((n + m) logn)
(if we use the adjacency list structure)
Dijkstra's algorithm is O(m·log n) (graph is connected).
Why Dijkstra's Algorithm Works
Dijkstra's algorithm is based on the greedy method:
T is a tree
Spanning tree is minimal if the total edge weight is minimal.
Application:
Communication networks.
Transportation networks.
(example weighted graph on vertices U, V, W, X, Y, Z, P with edge weights 1–10)
Cycle Property
Cycle property:
abc is a substring of S
acaa is a prefix of S
ca is a suffix of S
Pattern Matching
The pattern matching problem:
text editors
search engines
biological research
. . .
Brute-Force Algorithm
The brute-force algorithm:
Looking-glass heuristic:
Σ = {a, b, c, d}
P = abacab
  c:    a  b  c   d
  L(c): 4  5  3  −1
The last-occurrence function can be computed in O(m + s) time:
(m is the length of P, s the size of the alphabet)
(trace of Boyer-Moore matching P = abacab against the text,
with comparisons numbered 1–12)
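Boyer-Moore with the last-occurrence function, as a Python sketch of the two heuristics (the character-jump shift m − min(j, 1 + L(c)) is the standard rule; function names are mine):

```python
def last_occurrence(P, alphabet):
    """L(c): index of the last occurrence of c in P, or -1."""
    L = {c: -1 for c in alphabet}
    for i, c in enumerate(P):          # later positions overwrite earlier ones
        L[c] = i
    return L

def boyer_moore(T, P, alphabet):
    """Return the start index of a match of P in T, or -1."""
    L = last_occurrence(P, alphabet)
    m, n = len(P), len(T)
    i, j = m - 1, m - 1                # looking-glass: compare right to left
    while i < n:
        if T[i] == P[j]:
            if j == 0:
                return i               # full match found
            i, j = i - 1, j - 1
        else:
            i += m - min(j, 1 + L[T[i]])   # character-jump shift
            j = m - 1
    return -1
```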
Boyer-Moore: Analysis
The worst case time complexity of Boyer-Moore is O(n·m + s)
either i increases by 1, or
e.g. alphabet A, B, C
ACBA = 00110100
More efficient encodings:
ACBA = 011100
The coding should be unambiguous. Bad example:
Alphabet A₁, A₂, . . . , A_n.
Probabilities 0 ≤ p(A_i) ≤ 1 for every letter A_i.
Problem: find a code c such that the expected code length
  Σ_{1 ≤ i ≤ n} p(A_i)·|c(A_i)|
is minimal.
Huffman Algorithm
Huffman Algorithm:
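A Python sketch of the algorithm: repeatedly merge the two least probable subtrees, then read the codewords off the tree (the heap layout and the tie-breaking counter are my additions):

```python
import heapq
from itertools import count

def huffman(probs):
    """probs: dict letter -> probability. Return dict letter -> codeword."""
    tick = count()                          # tie-breaker: never compare trees
    heap = [(p, next(tick), c) for c, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, t1 = heapq.heappop(heap)     # two smallest probabilities
        p2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, next(tick), (t1, t2)))
    codes = {}
    def walk(t, code):
        if isinstance(t, tuple):
            walk(t[0], code + "0")          # left edge: 0
            walk(t[1], code + "1")          # right edge: 1
        else:
            codes[t] = code or "0"          # single-letter alphabet edge case
    walk(heap[0][2], "")
    return codes

codes = huffman({"a": 0.5, "b": 0.3, "c": 0.1, "d": 0.1})
# frequent letters get short codes: |c(a)| = 1, |c(b)| = 2, |c(c)| = |c(d)| = 3
```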