
CS6402 DESIGN AND ANALYSIS OF ALGORITHMS LTPC 3003

OBJECTIVES:

The student should be made to:
• Learn the algorithm analysis techniques.
• Become familiar with the different algorithm design techniques.
• Understand the limitations of algorithm power.
UNIT I INTRODUCTION 9
Notion of an Algorithm – Fundamentals of Algorithmic Problem Solving – Important
Problem Types – Fundamentals of the Analysis of Algorithm Efficiency – Analysis
Framework – Asymptotic Notations and their properties – Mathematical analysis for
Recursive and Non-recursive algorithms.

UNIT II BRUTE FORCE AND DIVIDE-AND-CONQUER 9


Brute Force – Closest-Pair and Convex-Hull Problems-Exhaustive Search – Traveling
Salesman Problem – Knapsack Problem – Assignment problem. Divide and conquer
methodology – Merge sort – Quick sort – Binary search – Multiplication of Large
Integers – Strassen’s Matrix Multiplication-Closest-Pair and Convex-Hull Problems.

UNIT III DYNAMIC PROGRAMMING AND GREEDY TECHNIQUE 9


Computing a Binomial Coefficient – Warshall's and Floyd's Algorithms – Optimal Binary
Search Trees – Knapsack Problem and Memory functions. Greedy Technique– Prim’s
algorithm- Kruskal’s Algorithm- Dijkstra’s Algorithm-Huffman Trees.

UNIT IV ITERATIVE IMPROVEMENT 9


The Simplex Method – The Maximum-Flow Problem – Maximum Matching in Bipartite
Graphs – The Stable Marriage Problem.

UNIT V COPING WITH THE LIMITATIONS OF ALGORITHM POWER 9


Limitations of Algorithm Power-Lower-Bound Arguments-Decision Trees-P, NP and
NP-Complete Problems–Coping with the Limitations – Backtracking – n-Queens problem
– Hamiltonian Circuit Problem – Subset Sum Problem-Branch and Bound – Assignment
problem – Knapsack Problem – Traveling Salesman Problem- Approximation Algorithms
for NP – Hard Problems – Traveling Salesman problem – Knapsack problem.
TOTAL: 45 PERIODS

OUTCOMES

At the end of the course, the student should be able to:
• Design algorithms for various computing problems.
• Analyze the time and space complexity of algorithms.
• Critically analyze the different algorithm design techniques for a given problem.
• Modify existing algorithms to improve efficiency.

TEXT BOOK:

1. Anany Levitin, "Introduction to the Design and Analysis of Algorithms", Third Edition, Pearson Education, 2012.

REFERENCES:

1. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein, "Introduction to Algorithms", Third Edition, PHI Learning Private Limited, 2012.
2. Alfred V. Aho, John E. Hopcroft and Jeffrey D. Ullman, "Data Structures and Algorithms", Pearson Education, Reprint 2006.
3. Donald E. Knuth, "The Art of Computer Programming", Volumes 1 & 3, Pearson Education, 2009.
4. Steven S. Skiena, "The Algorithm Design Manual", Second Edition, Springer, 2008.
5. http://nptel.ac.in/
UNIT I INTRODUCTION 9
Notion of an Algorithm – Fundamentals of Algorithmic Problem Solving – Important
Problem Types – Fundamentals of the Analysis of Algorithm Efficiency – Analysis
Framework – Asymptotic Notations and their properties – Mathematical analysis for
Recursive and Non-recursive algorithms.

Algorithm
An algorithm is a sequence of unambiguous instructions for solving a problem,
i.e., for obtaining a required output for any legitimate input in a finite amount of time.

Characteristics of an Algorithm

More precisely, an algorithm is a method or process to solve a problem satisfying the following properties:

Finiteness
• terminates after a finite number of steps
Definiteness
• Each step must be rigorously and unambiguously specified.
– e.g., "stir until lumpy" is unacceptable because it is ambiguous.
Input
• Valid inputs must be clearly specified.
Output
• can be proved to produce the correct output given a valid input.
Effectiveness
• Steps must be sufficiently simple and basic.

Fundamentals of Algorithmic Problem Solving

Understanding the problem


 Ask questions, do a few examples by hand, think about special cases, etc.
 The input to the algorithm is called an instance of the problem. It is very
important to decide the range of inputs so that the boundary values of the
algorithm get fixed.
 The algorithm should work correctly for all valid inputs.
Decision making
This step serves as a base for the actual design of algorithm.
 Capabilities of Computational devices
It is necessary to know the computational capabilities of the devices on which
the algorithm will be running. From the execution point of view, algorithms can
be broadly classified as sequential algorithms and parallel algorithms. For
solving a complex problem it is essential to make a proper choice of a
computational device which is space and time efficient.
 choice for either exact or approximate problem solving method
The next important decision is to decide whether the problem is to be
solved correctly or approximately. If the problem needs to be solved
correctly then we need exact algorithm. Otherwise if the problem is so
complex that we won’t get the exact solution then in that situation we need
to choose approximation algorithm.
 Data structures
Data structures and algorithms work together and are interdependent.
The implementation of an algorithm is possible only with a suitable choice of
data structures.
 Algorithmic Strategies
It is a general approach by which many problems can be solved
algorithmically.
Algorithm Design Techniques
 Brute Force : a straightforward technique with a naïve approach.
 Divide-and-Conquer : the problem is divided into smaller
instances.
 Dynamic Programming : the results of smaller, recurring
instances are stored and reused to solve the problem.
 Greedy Technique : locally optimal decisions are made to
solve the problem.
 Backtracking : this method is based on trial and error.
Specification of an algorithm
There are various ways by which we can specify an algorithm.
 Using Natural Language
 Pseudo Code
 Flowchart
Algorithmic Verification
Algorithmic verification means checking correctness of an algorithm. A common
method of proving the correctness of an algorithm is by using mathematical induction.
Analyzing an algorithm
– Time efficiency: how fast the algorithm runs
– Space efficiency: how much memory the algorithm needs.
Implementation of an algorithm

The implementation of an algorithm is done using a suitable programming language.

Important Problem Types


Sorting - The sorting problem is to rearrange the items of a given list in nondecreasing
order.
Searching - The searching problem deals with finding a given value, called a search key,
in a given set.
String processing - A string is a sequence of characters from an alphabet. Searching a
text for a given word is called string matching.
Graph problems - A graph can be thought of as a collection of points called vertices,
some of which are connected by line segments called edges. Basic graph algorithms
include graph-traversal algorithms, shortest-path algorithms, and topological sorting
for graphs with directed edges.
Combinatorial problems - These problems ask, explicitly or implicitly, to find a
combinatorial object such as a permutation, a combination, or a subset that satisfies
certain constraints.
Geometric Problems
Geometric algorithms deal with geometric objects such as points, lines, and
polygons. The ancient Greeks were very much interested in developing procedures (they
did not call them algorithms, of course) for solving a variety of geometric problems,
including problems of constructing simple geometric shapes—triangles, circles, and so
on—with an unmarked ruler and a compass.
We discuss algorithms for only two classic problems of computational geometry: the
closest-pair problem and the convex-hull problem. The closest-pair problem is self-
explanatory: given n points in the plane, find the closest pair among them. The convex-
hull problem asks to find the smallest convex polygon that would include all the points of
a given set.
Numerical Problems
Numerical problems, another large special area of applications, are problems that
involve mathematical objects of continuous nature: solving equations and systems of
equations, computing definite integrals, evaluating functions, and so on. The majority of
such mathematical problems can be solved only approximately. Another principal
difficulty stems from the fact that such problems typically require manipulating real
numbers, which can be represented in a computer only approximately.
Fundamentals of the Analysis of Algorithm Efficiency
The efficiency of an algorithm can be decided by measuring the performance of an
algorithm. We can measure the performance of an algorithm by following factors.
Space Complexity
The space complexity can be defined as the amount of memory required by an
algorithm to run. To compute space complexity we use two factors: a fixed part and a
variable part depending on instance characteristics:
S(P) = C + SP
where C is a constant, i.e. the fixed part, denoting the amount of space taken by
instructions, variables, and identifiers, and SP is the variable part whose space
requirement depends on the particular problem instance.
Time Complexity
The time complexity of an algorithm is the amount of time required by an
algorithm to run. Execution time depends on many factors such as:
 System load
 Number of other programs running
 Hardware capacity
The time complexity is given in terms of frequency count.
The frequency count is the number of times a statement is executed.

Measuring an input size


The input size of any instance of a problem is defined to be the number of
elements needed to describe that instance.
Measuring Running Time
 Count the number of times an algorithm’s basic operation is executed.
 Basic operation: the operation that contributes the most to the total running time.
 For example, the basic operation is usually the most time-consuming operation in
the algorithm’s innermost loop.

Problem statement                    Input size             Basic operation
--------------------------------------------------------------------------------
Searching a key element in a         List of n elements     Comparison of the key with
list of n elements                                          every element of the list
Performing matrix multiplication     Two matrices of        Multiplication of the
                                     order n x n            elements in the matrices
Computing GCD of two numbers         Two numbers            Division

Then we compute the total amount of time taken by this basic operation. We
can estimate the running time with the following formula:
T(n) ≈ cop * C(n)
where
T(n)  : running time of the algorithm
cop   : time taken by the basic operation to execute
C(n)  : number of times the operation needs to be executed
Order of Growth
For order of growth, consider only the leading term of a formula and ignore the
constant coefficient. The following table gives values of several functions important
for the analysis of algorithms:

n       log2 n   n       n log2 n    n^2     n^3     2^n         n!
10      3.3      10      3.3*10^1    10^2    10^3    10^3        3.6*10^6
10^2    6.6      10^2    6.6*10^2    10^4    10^6    1.3*10^30   9.3*10^157

Worst-case, Best-case, Average case efficiencies

Algorithm efficiency depends on the input size n, and for some algorithms it also
depends on the type of input. We have best, worst, and average case efficiencies.
Worst-case efficiency: Efficiency (number of times the basic operation will be executed)
for the worst case input of size n. i.e. The algorithm runs the longest among all possible
inputs of size n.
Best-case efficiency: Efficiency (number of times the basic operation will be executed)
for the best case input of size n. i.e. The algorithm runs the fastest among all possible
inputs of size n.
Average-case efficiency: Average time taken (number of times the basic operation will
be executed) to solve all the possible instances (random) of the input. NOTE: NOT the
average of worst and best case
Asymptotic Notation
To choose the best algorithm, we need to check efficiency of each algorithm. The
efficiency can be measured by computing time complexity of each algorithm. Asymptotic
notation is a shorthand way to represent the time complexity.
Using asymptotic notation we can characterize the running time as "fastest possible",
"slowest possible", or "average". The notations O, Ω, and Θ are called asymptotic
notations.
Big oh Notation
The Big oh notation is denoted by 'O'. It is a method of representing the upper
bound of an algorithm's running time. Using Big oh notation we can give the longest
amount of time taken by the algorithm to complete.
Definition
Let f(n) and g(n) be two non-negative functions. If there exist a constant c > 0 and a
value n0 such that
f(n) ≤ c * g(n) for all n ≥ n0
then f(n) is big oh of g(n), denoted f(n) ∈ O(g(n)). In other words, f(n) is bounded
above by some constant multiple of g(n).
Omega Notation
The Omega notation is denoted by 'Ω'. It is a method of representing the lower
bound of an algorithm's running time. Using Omega notation we can give the shortest
amount of time taken by the algorithm to complete.
Definition
Let f(n) and g(n) be two non-negative functions. If there exist a constant c > 0 and a
value n0 such that
f(n) ≥ c * g(n) for all n ≥ n0
then f(n) ∈ Ω(g(n)). In other words, f(n) is bounded below by some constant
multiple of g(n).

Θ Notation
The theta notation is denoted by Θ. By this method the running time is bounded
both above and below.
Definition
Let f(n) and g(n) be two non-negative functions. If there exist two positive constants
c1 and c2 and a value n0 such that
c1 * g(n) ≤ f(n) ≤ c2 * g(n) for all n ≥ n0
then f(n) ∈ Θ(g(n)).
Properties of Order of Growth

1. If f1(n) is of order g1(n) and f2(n) is of order g2(n), then
f1(n) + f2(n) ∈ O(max(g1(n), g2(n))).

2. Polynomials of degree k are in Θ(n^k), that is,
ak*n^k + ak-1*n^(k-1) + ... + a0 ∈ Θ(n^k).

3. Exponential functions a^n have different orders of growth for different values of a:

order log n < order n^k (k > 0) < order a^n (a > 1) < order n! < order n^n

Properties of Big Oh Notation


Following are some important properties of Big oh notation.
1. If there are two functions f1(n) and f2(n) such that f1(n) = O(g1(n)) and
f2(n) = O(g2(n)), then f1(n) + f2(n) = O(max(g1(n), g2(n))).
2. If f(n) = O(g(n)) and g(n) = O(h(n)), then f(n) = O(h(n)).
3. Any function is an order of itself: f(n) = O(f(n)).
4. Any constant value is equivalent to O(1): C = O(1), where C is a constant.
5. If lim (n→∞) f(n)/g(n) = c for some constant c > 0, then f(n) has the same order
of growth as g(n), i.e. f(n) ∈ Θ(g(n)).
6. If lim (n→∞) f(n)/g(n) = 0, then f(n) has a smaller order of growth than g(n),
i.e. f(n) ∈ O(g(n)).
7. If lim (n→∞) f(n)/g(n) = ∞, then f(n) has a larger order of growth than g(n),
i.e. f(n) ∈ Ω(g(n)).
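
For instance, property 5 can be checked on f(n) = n(n-1)/2 and g(n) = n^2:

lim (n→∞) [n(n-1)/2] / n^2 = lim (n→∞) (1/2)(1 - 1/n) = 1/2

Since the limit is a positive constant, n(n-1)/2 ∈ Θ(n^2).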
Mathematical Analysis of Nonrecursive Algorithms
Plan for Analyzing Nonrecursive Algorithms

 Decide on parameter n indicating input size.


 Identify algorithm’s basic operation(s).
 Determine worst, average, and best cases for input of size n.
 Set up a sum for the number of times the basic operation is executed.
 Simplify the sum using standard formulas and rules

Example 1:

Finding the largest element in a list of n numbers

Algorithm MaxElement(A[0..n − 1])
// Returns the maximum value in an array
// Input: A nonempty array A of real numbers
// Output: The maximum value in A
maxval ← A[0]
for i ← 1 to n − 1 do
    if A[i] > maxval then
        maxval ← A[i]
return maxval

Analysis

1. Input size = n
2. Basic operation:
   A[i] > maxval   =>  comparison
   maxval ← A[i]   =>  assignment operation
   The comparison is executed on every iteration, so it is taken as the basic operation.
3. C(n) – the number of times the basic operation is executed:
   C(n) = sum for i = 1 to n-1 of 1 = n - 1 ∈ Θ(n)
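
As a quick check, here is a C sketch of MaxElement with an added counter for the basic operation (the counter and the names are our additions):

#include <stdio.h>

/* A sketch of MaxElement in C, instrumented to count the basic
   operation (the comparison a[i] > maxval). */
int max_element(const int a[], int n, long *comparisons)
{
    int maxval = a[0];
    *comparisons = 0;
    for (int i = 1; i <= n - 1; i++) {
        (*comparisons)++;              /* one comparison per iteration */
        if (a[i] > maxval)
            maxval = a[i];
    }
    return maxval;
}

int main(void)
{
    int a[] = {3, 9, 2, 7, 5};
    long c;
    int m = max_element(a, 5, &c);
    printf("max = %d, comparisons = %ld\n", m, c);  /* max = 9, comparisons = 4 = n - 1 */
    return 0;
}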


Mathematical Analysis of Recursive Algorithms
Steps in mathematical analysis of recursive algorithms:
 Decide on parameter n indicating input size
 Identify algorithm’s basic operation
 Determine worst, average, and best case for input of size n
 Set up a recurrence relation and initial condition(s) for C(n)-the number of
times the basic operation will be executed for an input of size n
(alternatively count recursive calls)
 Solve the recurrence to obtain a closed form or estimate the order of
magnitude of the solution
Important recurrence types:
 Linear: One (constant) operation reduces problem size by one.
T(n)= T(n-1) + c
T(1) = d
Solution: T(n) = (n-1)c + d
 Quadratic: A pass through input reduces problem size by one.
T(n) = T(n-1) + cn
T(1) = d
Solution: T(n) = [n(n+1)/2 – 1] c + d
 Logarithmic: One (constant) operation reduces problem size by half.
T(n) = T(n/2) + c
T(1) = d
Solution: T(n) = c log n + d
 n log n: A pass through input reduces problem size by half.
T(n) = 2T(n/2) + cn
T(1) = d
Solution: T(n) = cn log n + d n
Example 1
n! = n*(n-1)!
0! = 1
T(n) = T(n-1) + 1
T(1) = 1
Telescoping:
T(n) = T(n-1) + 1
T(n-1) = T(n-2) + 1
T(n-2) = T(n-3) + 1
...
T(2) = T(1) + 1
Add the equations and cross equal terms on opposite sides:
T(n) = T(1) + (n-1) = n
Example 2: Binary search
T(n) = T(n/2) + 1
T(n/2) = T(n/4) + 1
...
T(2) = T(1) + 1
Add the equations and cross equal terms on opposite sides:
T(n) = T(1) + log(n) = O(log(n))
Master Theorem: A general divide-and-conquer recurrence
T(n) = aT(n/b) + f(n), where f(n) ∈ Θ(n^k):
1. a < b^k : T(n) ∈ Θ(n^k)
2. a = b^k : T(n) ∈ Θ(n^k log n)
3. a > b^k : T(n) ∈ Θ(n^(log_b a))
Note: the same results hold with O instead of Θ.
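
For example, the merge sort recurrence T(n) = 2T(n/2) + cn has a = 2, b = 2 and f(n) ∈ Θ(n^1), so k = 1; since a = b^k (2 = 2^1), case 2 applies and T(n) ∈ Θ(n log n). For binary search, T(n) = T(n/2) + 1 has a = 1, b = 2, k = 0; again a = b^k, so T(n) ∈ Θ(log n).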
UNIT II BRUTE FORCE AND DIVIDE-AND-CONQUER 9
Brute Force – Closest-Pair and Convex-Hull Problems-Exhaustive Search – Traveling
Salesman Problem – Knapsack Problem – Assignment problem. Divide and conquer
methodology – Merge sort – Quick sort – Binary search – Multiplication of Large
Integers – Strassen’s Matrix Multiplication-Closest-Pair and Convex-Hull Problems.

Brute Force
Brute force is a straightforward approach to solving a problem, usually directly based on
the problem statement and definitions of the concepts involved.
A brute-force algorithm to find the divisors of a natural number n would enumerate all
integers from 1 to the square root of n, and check whether each of them divides n without
remainder. A brute-force approach for the eight queens puzzle would examine all possible
arrangements of 8 pieces on the 64-square chessboard, and, for each arrangement, check
whether each (queen) piece can attack any other.
While a brute-force search is simple to implement, and will always find a solution if it
exists, its cost is proportional to the number of candidate solutions – which in many
practical problems tends to grow very quickly as the size of the problem increases.
Therefore, brute-force search is typically used when the problem size is limited, or when
there are problem-specific heuristics that can be used to reduce the set of candidate
solutions to a manageable size. The method is also used when the simplicity of
implementation is more important than speed.

Closest-Pair Problem
The closest-pair problem calls for finding the two closest points in a set of n points. It is
the simplest of a variety of problems in computational geometry that deals with proximity
of points in the plane or higher-dimensional spaces. Points in question can represent such
physical objects as airplanes or post offices as well as database records, statistical
samples, DNA sequences, and so on. An air-traffic controller might be interested in two
closest planes as the most probable collision candidates. A regional postal service
manager might need a solution to the closest-pair problem to find candidate post-office
locations to be closed. One of the important applications of the closest-pair problem is
cluster analysis
in statistics. Based on n data points, hierarchical cluster analysis seeks to organize them in
a hierarchy of clusters based on some similarity metric. For numerical data, this metric is
usually the Euclidean distance; for text and other nonnumerical data, metrics such as the
Hamming distance (see Problem 5 in this section’s exercises) are used.

ALGORITHM BruteForceClosestPair(P)
//Finds distance between two closest points in the plane by brute force
//Input: A list P of n (n ≥ 2) points p1(x1, y1), . . . , pn(xn, yn)
//Output: The distance between the closest pair of points
d ← ∞
for i ← 1 to n − 1 do
    for j ← i + 1 to n do
        d ← min(d, sqrt((xi − xj)^2 + (yi − yj)^2))   //sqrt is square root
return d
The basic operation of the algorithm is computing the square root. In the age of
electronic calculators with a square-root button, one might be led to believe that
computing the square root is as simple an operation as, say, addition or multiplication. Of
course, it is not. For starters, even for most integers, square roots are irrational numbers
that therefore can be found only approximately. Moreover, computing such
approximations is not a trivial matter. But, in fact, computing square roots in the loop can
be avoided! (Can you think how?) The trick is to realize that we can simply ignore the
square-root function and compare the values (xi − xj)^2 + (yi − yj)^2 themselves. We can do
this because the smaller a number of which we take the square root, the smaller its square
root, or, as mathematicians say, the square-root function is strictly increasing. Then the
basic operation of the algorithm will be squaring a number. The number of times it will
be executed can be computed as follows:
C(n) = sum for i = 1 to n−1 of [ sum for j = i+1 to n of 2 ] = 2 * sum for i = 1 to n−1 of (n − i)
     = 2[(n − 1) + (n − 2) + . . . + 1] = (n − 1)n ∈ Θ(n^2).
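
A direct C sketch of the brute-force algorithm, using the squared-distance trick described above (names are ours):

#include <stdio.h>
#include <math.h>
#include <float.h>

/* Sketch: brute-force closest pair. Squared distances are compared in
   the loop; a single square root is taken at the end. */
double closest_pair(const double x[], const double y[], int n)
{
    double best = DBL_MAX;                 /* smallest squared distance so far */
    for (int i = 0; i < n - 1; i++)
        for (int j = i + 1; j < n; j++) {
            double dx = x[i] - x[j], dy = y[i] - y[j];
            double d2 = dx * dx + dy * dy; /* basic operation: squaring */
            if (d2 < best)
                best = d2;
        }
    return sqrt(best);
}

int main(void)
{
    double x[] = {0, 3, 1}, y[] = {0, 4, 1};
    printf("%f\n", closest_pair(x, y, 3)); /* closest pair (0,0)-(1,1): sqrt(2) ~ 1.414214 */
    return 0;
}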

Convex-Hull Problems
A region (set of points) in the plane is convex if every line segment between
two points in the region is also in the region. The convex hull of a finite set of points P is
the smallest convex region containing P.
Theorem: The convex hull of a finite set of points P is a convex polygon
whose vertices form a subset of P.
The convex hull problem is finding the convex hull given P.
Idea for Solving Convex Hull
Consider the straight line that goes through two points Pi and Pj . Suppose there are
points in P on both sides of this line. – This implies that the line segment between Pi and
Pj is not on the boundary of the convex hull.
Suppose all the points in P are on one side of the line (or on the line). – This implies that
the line segment between Pi and Pj is on the boundary of the convex hull.
Development of Idea for Convex Hull
The straight line through Pi = (xi, yi) and Pj = (xj, yj) can be defined by a
nonzero solution for:
a*xi + b*yi = c
a*xj + b*yj = c
One solution is a = yj − yi, b = xi − xj, and c = xi*yj − yi*xj. The line segment from Pi
to Pj is on the convex hull if either a*x + b*y ≥ c or a*x + b*y ≤ c is true for all the points.
The brute force algorithm is Θ(n^3).
Exhaustive Search
Exhaustive search requires searching all the possible solutions (typically combinatorial
objects) for the best solution.
Exhaustive search is simply a brute-force approach to combinatorial problems.

Traveling Salesman Problem


The traveling salesman problem (TSP) has been intriguing researchers for the
last 150 years by its seemingly simple formulation, important applications, and interesting
connections to other combinatorial problems. In layman’s terms, the problem asks to find
the shortest tour through a given set of n cities that visits each city exactly once before
returning to the city where it started. The problem can be conveniently modeled by a
weighted graph, with the graph’s vertices representing the cities and the edge weights
specifying the distances. Then the problem can be stated as the problem of finding the
shortest Hamiltonian circuit of the graph.
Knapsack Problem
Here is another well-known problem in algorithmics. Given n items of known
weights w1, w2, . . . , wn and values v1, v2, . . . , vn and a knapsack of capacity W, find
the most valuable subset of the items that fit into the knapsack.
Assignment problem
The assignment problem is one of the fundamental combinatorial optimization problems
in the branch of optimization or operations research in mathematics. It consists of finding
a maximum weight matching in a weighted bipartite graph.
As a problem that can be solved by exhaustive search: there are n people who need to be
assigned to execute n jobs, one person per job. (That is, each person is assigned to exactly
one job and each job is assigned to exactly one person.) The cost that would accrue if the
ith person is assigned to the jth job is a known quantity C[i, j] for each pair i, j = 1, 2, . . .
, n. The problem is to find an assignment with the minimum total cost.

Divide and Conquer


 Many recursive algorithms take a problem with a given input and divide it
into one or more smaller problems.
 This reduction is repeatedly applied until the solutions of smaller problems
can be found quickly.
 This procedure is called the divide and conquer technique.

Basic Steps

 Divide and Conquer is the most well known algorithm design strategy.
 Divide the problem into two or more smaller sub problems.
 Conquer the sub problems by solving them recursively.
 Combine the solutions to the sub problems into the solutions
for the original problem.
 The best case for the recursion is sub problems of constant size. Analysis
can be done using recurrence equations.

A typical Divide and Conquer case

A problem of size n is divided into two sub problems of size n/2. Each sub problem is
solved (giving a solution of sub problem 1 and a solution of sub problem 2), and these
are combined into the solution to the original problem.

Algorithm

Algorithm DAndC(P)
Begin
    if small(P) then
        return S(P)
    else
        Divide P into smaller instances P1, P2, ........., Pk
        Apply DAndC to each of these sub problems
        return combine(DAndC(P1), DAndC(P2), ......, DAndC(Pk))
End
Efficiency Analysis of Divide and Conquer
 The computing time of Divide and Conquer(DAndC) on any input of size
n is described by the following recurrence relation.
T(n) = g(n)                                    if n is small
     = T(n1) + T(n2) + ...... + T(nk) + f(n)   otherwise
 T(n) is the time for Divide and Conquer on any input size n.
 g(n) is the time to compute the answer directly.
 f(n) is the time for dividing P and combining the solutions.
Divide and Conquer Recurrence Relation
 Suppose that a recursive algorithm divides a problem of size n into a number
of sub problems, each of size n/b.
 Let T(n) be the number of operations required to solve the problem of size n.
T(n) = T(1)               n = 1
     = aT(n/b) + f(n)     n > 1
where
T(n)  -  time for size n
a     -  number of sub instances
n/b   -  size of each sub instance
f(n)  -  time required for dividing the problem into sub problems and
         combining their solutions.
Merge Sort
 Merge sort is one of the external sorting techniques.
 Merge sort algorithm follows divide and conquer strategy.
 Given a sequence of ‘n’ elements A[1],A[2],………A[N].
 The basic idea behind the merge sort algorithm is to split the list into two
sub lists A[1],…….A[N/2] and A[(N/2)+1],…….A[N].
 If the list has even length, split the list into equal sub lists.
 If the list has odd length, divide the list in two by making the first sub list
one entry greater than the second sub list.
 Then split both the sub list is to two and go on until each of the sub lists
are of size one.
 Finally, start merging the individual sub list to obtain a sorted list.
 Time complexity of merge sort is Θ(n log n).
Algorithm
Algorithm mergesort(A[0..n-1], low, high)
Begin
    if low < high then
        mid <- (low + high) / 2
        mergesort(A, low, mid)
        mergesort(A, mid+1, high)
        combine(A, low, mid, high)
    end if
End

Algorithm combine(A[0..n-1], low, mid, high)
Begin
    k <- low
    i <- low
    j <- mid + 1
    while (i <= mid and j <= high) do
        if A[i] <= A[j] then
            temp[k] <- A[i]
            i <- i + 1
        else
            temp[k] <- A[j]
            j <- j + 1
        end if
        k <- k + 1
    end while
    while (i <= mid) do
        temp[k] <- A[i]
        i <- i + 1
        k <- k + 1
    end while
    while (j <= high) do
        temp[k] <- A[j]
        j <- j + 1
        k <- k + 1
    end while
    // copy the merged sublists back into A
    for m <- low to high do
        A[m] <- temp[m]
    end for
End
Analysis
The recurrence relation for the merge sort is
T(n) = T(n/2) + T(n/2) + cn
Where
T(n/2) - time taken by left sublist to get sorted
T(n/2) - time taken by right sublist to get sorted
cn - time taken for combining two sublists

Solving Recurrence Relation


The recurrence relation is
T(n) = 2T(n/2) + cn
Assume n = 2^k. Then
T(2^k) = 2T(2^(k-1)) + c*2^k
       = 2[2T(2^(k-2)) + c*2^(k-1)] + c*2^k
       = 2^2 T(2^(k-2)) + 2*c*2^k
       = 2^3 T(2^(k-3)) + 3*c*2^k
       .
       .
       = 2^k T(2^(k-k)) + k*c*2^k
       = 2^k T(1) + k*c*2^k
       = k*c*2^k                    [ T(1) = 0 ]
Since n = 2^k, taking log on both sides gives k = log2 n, so
T(n) = c*n*log2 n

Hence the average, best and worst case complexity of merge sort is O(n log n).
Quicksort
Quick sort, as the name suggests, sorts any list very quickly. Quick sort is not a stable
sort, but it is very fast and requires very little additional space. It is based on the rule
of Divide and Conquer (it is also called partition-exchange sort). This algorithm divides the
list into three main parts:
1. Elements less than the pivot element
2. Pivot element
3. Elements greater than the pivot element

Algorithm
int partition(int a[], int p, int r);   /* forward declaration */

void quicksort(int a[], int p, int r)
{
    if (p < r)
    {
        int q = partition(a, p, r);
        quicksort(a, p, q);        /* left part: a[p..q] */
        quicksort(a, q + 1, r);    /* right part: a[q+1..r] */
    }
}

int partition(int a[], int p, int r)
{
    int pivot = a[p];
    int i = p - 1;
    int j = r + 1;
    while (1)
    {
        do { i++; } while (a[i] < pivot);   /* scan right for element >= pivot */
        do { j--; } while (a[j] > pivot);   /* scan left for element <= pivot */
        if (i >= j)
            return j;        /* j is the final split position */
        /* swap a[i] and a[j] and continue scanning */
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }
}

An Example:
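
A minimal driver for the two routines above (our addition):

#include <stdio.h>

int partition(int a[], int p, int r);
void quicksort(int a[], int p, int r);

int main(void)
{
    int a[] = {5, 2, 9, 1, 5, 6};
    int n = (int)(sizeof a / sizeof a[0]);
    quicksort(a, 0, n - 1);          /* sort the whole array */
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);         /* prints: 1 2 5 5 6 9 */
    printf("\n");
    return 0;
}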

Worst Case Time Complexity : O(n^2)

Best Case Time Complexity : O(n log n)
Average Time Complexity : O(n log n)
Space Complexity : O(log n) for the recursion stack on average (O(n) in the worst case)

Binary Search
 Binary Search uses divide and conquer strategy. Divide and conquer
consists of following major phases.
 Breaking the problem into several sub problems that are similar
to the original problem but smaller in size.
 Solve the sub problem recursively.
 Combine these solutions to sub problems to create a solution to
the original problem.
 Binary search is an efficient searching method.
 An element which is to be searched from the list of element stored in
array A[0,1,….n-1] is called key element.
 Let A[m] be the mid element of array A.
 There are three conditions that needs to be tested while searching the array
using this method.
o If key = A[m] then the desired element is present in the list.
o If key < A[m] then search the left sub list.
o If key > A[m] then search the right sub list.
 This can be represented as
A[0], ........., A[m-1], A[m], A[m+1], .........., A[n-1]
(search the left sub list if key < A[m]; found if key = A[m]; search the right sub list
if key > A[m])


Algorithm
Algorithm Binarysearch(A[0..n-1], key)
Begin
    low <- 0
    high <- n - 1
    while low <= high do
        m <- (low + high) / 2
        if key = A[m] then
            return m
        else if key < A[m] then
            high <- m - 1
        else
            low <- m + 1
    end while
    return -1    // unsuccessful search
End
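
A C sketch of the same algorithm (names ours):

#include <stdio.h>

/* Sketch: iterative binary search over a sorted array.
   Returns the index of key, or -1 for an unsuccessful search. */
int binary_search(const int a[], int n, int key)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int m = low + (high - low) / 2;   /* avoids overflow of low + high */
        if (key == a[m])
            return m;
        else if (key < a[m])
            high = m - 1;
        else
            low = m + 1;
    }
    return -1;
}

int main(void)
{
    int a[] = {1, 3, 5, 7, 9, 11};
    printf("%d\n", binary_search(a, 6, 7));  /* prints 3 */
    printf("%d\n", binary_search(a, 6, 4));  /* prints -1 */
    return 0;
}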

Analysis
 The basic operation in binary search is comparison of search key with the
array elements.
 The efficiency of binary search is analyzed by counting the number of times the
search key gets compared with the array elements.
 The comparison is also called a three-way comparison because the algorithm
makes the comparison to determine whether the key is smaller than, equal to, or
greater than A[m].
 The worst case complexity of binary search is given by
Cworst (n) = Cworst (n/2) + 1 n>1
Cworst (1) = 1
Where
Cworst (n/2) - time required to compare left (or) right sub list
1 - one comparison made with middle element

Solving Recurrence Relation


The recurrence relation is
Cworst(n) = Cworst(n/2) + 1     .....(1)
Assume n = 2^k where k = 1, 2, .........
Substitute n = 2^k in equation (1):
Cworst(2^k) = Cworst(2^(k-1)) + 1
Using the backward substitution method:
Cworst(2^k) = Cworst(2^(k-2)) + 2
            = Cworst(2^(k-3)) + 3
            .
            .
            = Cworst(2^(k-k)) + k
            = Cworst(2^0) + k
            = 1 + k
Since n = 2^k, taking log on both sides gives k = log2 n, so
Cworst(n) = 1 + log2 n

Time Complexity of Binary Search is O(log n)

Multiplication of Large Integers


 a, b are both n-digit integers.
 If we use the brute-force approach to compute c = a * b, the time efficiency is
Θ(n^2) digit multiplications.
 Split each number into halves: a = a1 a0 and b = b1 b0. Then
c = a * b = (a1*10^(n/2) + a0) * (b1*10^(n/2) + b0)
          = (a1 * b1)*10^n + (a1 * b0 + a0 * b1)*10^(n/2) + (a0 * b0)
For instance: a = 123456, b = 117933:
Then c = a * b = (123*10^3 + 456) * (117*10^3 + 933)
              = (123 * 117)*10^6 + (123 * 933 + 456 * 117)*10^3 + (456 * 933)
The middle term needs only one extra multiplication, because
a1 * b0 + a0 * b1 = (a1 + a0) * (b1 + b0) - a1 * b1 - a0 * b0,
so three multiplications of n/2-digit numbers suffice. This gives the recurrence
M(n) = 3M(n/2), M(1) = 1, with solution M(n) = n^(log2 3) ≈ n^1.585.

The asymptotic advantage of this algorithm notwithstanding, how practical is it? The
answer depends, of course, on the computer system and program quality implementing
the algorithm, which might explain the rather wide disparity of reported results. On some
machines, the divide-and-conquer algorithm has been reported to outperform the
conventional method on numbers only 8 decimal digits long and to run more than twice
faster with numbers over 300 decimal digits long—the area of particular importance for
modern cryptography. Whatever this outperformance “crossover point” happens to be on
a particular machine, it is worth switching to the conventional algorithm after the
multiplicands become smaller than the crossover point. Finally, if you program in an
object-oriented language such as Java, C++, or Smalltalk, you should also be aware that
these languages have special classes for dealing with large integers.
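
A compact C sketch of this three-multiplication scheme on machine integers (function and variable names are ours; real large-integer code works on digit arrays, so this is only illustrative):

#include <stdio.h>

/* Sketch: Karatsuba multiplication on long long values, splitting by
   decimal digits. */
long long karatsuba(long long a, long long b)
{
    if (a < 10 || b < 10)
        return a * b;                 /* base case: a single-digit operand */
    /* number of digits of the larger operand, halved */
    long long big = a > b ? a : b;
    int n = 0;
    for (long long t = big; t > 0; t /= 10) n++;
    int half = n / 2;
    long long pow10 = 1;
    for (int i = 0; i < half; i++) pow10 *= 10;

    long long a1 = a / pow10, a0 = a % pow10;   /* split a = a1 a0 */
    long long b1 = b / pow10, b0 = b % pow10;   /* split b = b1 b0 */

    long long c2 = karatsuba(a1, b1);                        /* a1 * b1 */
    long long c0 = karatsuba(a0, b0);                        /* a0 * b0 */
    long long c1 = karatsuba(a1 + a0, b1 + b0) - c2 - c0;    /* cross terms */

    return c2 * pow10 * pow10 + c1 * pow10 + c0;
}

int main(void)
{
    printf("%lld\n", karatsuba(123456, 117933));  /* prints 14559536448 */
    return 0;
}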

Strassen’s Matrix Multiplication


The principal insight of the algorithm lies in the discovery that we can find the
product C of two 2 × 2 matrices A and B with just seven multiplications as opposed to the
eight required by the brute-force algorithm . This is accomplished by using the following
formulas:

[ C00  C01 ]   [ A00  A01 ]   [ B00  B01 ]
[ C10  C11 ] = [ A10  A11 ] * [ B10  B11 ]

               [ M1 + M4 - M5 + M7     M3 + M5           ]
             = [ M2 + M4               M1 + M3 - M2 + M6 ]

where
M1=(A00+A11)*(B00+B11)
M2=(A10+A11)*B00
M3=A00*(B01-B11)
M4=A11*(B10-B00)
M5=(A00+A01)*B11
M6=(A10-A00)*(B00+B01)
M7=(A01-A11)*(B10+B11)
Thus, to multiply two 2 × 2 matrices, Strassen's algorithm makes seven multiplications
and 18 additions/subtractions, whereas the brute-force algorithm requires eight
multiplications and four additions.
M(n) ∈ Θ(n^(log2 7)) ≈ Θ(n^2.807)
A(n) ∈ Θ(n^(log2 7)) ≈ Θ(n^2.807)
In other words, the number of additions has the same order of growth as the number of
multiplications. This puts Strassen's algorithm in Θ(n^(log2 7)), which is a better efficiency
class than Θ(n^3) of the brute-force method.

Closest Pair Problem


The brute force algorithm checks the distance between every pair of points and keeps
track of the minimum. The cost is Θ(n(n-1)/2), quadratic.
The general approach of a merge-sort-like algorithm is to sort the points along the
x-dimension, then recursively divide the array of points and find the minimum. The only
trick is that we must check distances between points from the two sets. This could have
quadratic cost if we checked each point against every other; but since only a constant
number of candidate points need to be checked for each point, the cost is less.
Algorithm
0. Initially sort the n points, Pi = (xi, yi), by their x dimensions.
1. Then recursively divide the n points, S1 = {P1,...,Pn/2} and S2 = {Pn/2+1,...,Pn},
so that S1 points are to the left of x = xn/2 and S2 points are to the right of
x = xn/2. Cost is O(1) for each recursive call.
2. Recursively find the closest pair in each set, d1 of S1 and d2 of S2;
let d = min(d1, d2). Cost is O(1) for each recursive call.
Note that d is not necessarily the solution, because the closest pair could be a
pair between the sets, meaning one point from each set. These points must lie in
the vertical strip described by x = xn/2 - d and x = xn/2 + d.
3. We must check all the S1 points lying in this strip against the S2 points in the strip,
and get the closest distance dbetween.
Note that for each S1 point there can be only 6 candidate S2 points, since the
points must also lie in [yi - d, yi + d]. So the time for this step is Θ(6n/2) = Θ(3n).
4. To accomplish this we also need the points sorted along the y dimension. We
do not want to sort from scratch for each recursive division, so we use a merge
sort approach; the cost of maintaining the sort along y is O(n).
5. The minimum distance is then min(d, dbetween).
The recurrence relation is
T(n) = 2T(n/2) + M(n), where M(n) is linear in n.
Using the Master Theorem (a = 2, b = 2, k = 1): a = b^k, so
T(n) ∈ Θ(n lg n)
Note that it has been shown that the best that can be done is Ω(n lg n), so this
is one of the best possible solutions.

Convex-Hull Problem
Recall that the convex hull is the smallest convex polygon containing all the points in a
set S of n points Pi = (xi, yi). The set of vertices defines the polygon, and the points of
the vertices are found in the original set of points.
Recall the brute force algorithm: make all possible lines from pairs of points and then
check if the rest of the points are all on the same side of the line. How much does this
cost? There are n(n-1)/2 such lines, and for each we check the n-2 remaining points, so
the cost is cubic.
Algorithm
1. Sort the set of points, S, by the x-dimension, with ties resolved by the
y-dimension.
2. Identify the first and last points of the sort, P1 and Pn.
Note P1 and Pn are vertices of the hull.
The ray P1Pn divides S into sets of points, by points left (S1) or right (S2)
of the line (the test is defined later).
We need to find the upper and lower hulls. We'll do this recursively.
Note also that S1 or S2 could be empty sets.
3. For S1 find Pmax, the point of maximum distance from the line P1Pn; ties can be
resolved by the point that maximizes the angle PmaxP1Pn.
Note that the ray P1Pmax divides the points of S1 into left and right sets. The
left points are S11.
Also PmaxPn identifies the left points S12 of S1.
Pmax is a vertex of the hull.
The points inside the triangle P1PmaxPn cannot be vertices of the hull.
There are no points to the left of both P1Pmax and PmaxPn.
4. Recursively find the upper hull of the union of P1, S11 and Pmax, and the union
of Pmax, S12, and Pn.
5. Do likewise to find the lower hull.

We need to identify if point (x3, y3) is left or right of the ray defined by points (x1, y1)
and (x2, y2). We use the sign of the determinant
| x1 y1 1 |
| x2 y2 1 |
| x3 y3 1 |
whose magnitude is twice the area of the triangle, with the sign determined by the order
of the three points. The sign has the properties we need. Sorting along the x-dimension
costs Θ(n lg n). Finding Pmax costs Θ(n). The costs of determining the sets S1, S2, S11,
and S12 are each Θ(n).
How many recursive calls are there in the worst case? O(n). The worst case cost is
therefore Θ(n^2), which beats the brute force Θ(n^3).
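
The side-of-line test can be coded directly from this determinant; a minimal C sketch (function name ours):

#include <stdio.h>

/* Sketch: sign of the 3x3 determinant | x1 y1 1; x2 y2 1; x3 y3 1 |,
   expanded. Result is > 0 if (x3, y3) is to the left of the ray from
   (x1, y1) to (x2, y2), < 0 if to the right, and 0 if collinear. */
double side_of_line(double x1, double y1, double x2, double y2,
                    double x3, double y3)
{
    return (x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1);
}

int main(void)
{
    /* (0,1) lies to the left of the ray from (0,0) to (1,0) */
    printf("%f\n", side_of_line(0, 0, 1, 0, 0, 1));  /* prints 1.000000 */
    return 0;
}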

We expect the average case to do much better because of the divide and conquer
approach, much like quick sort does. In addition, for any reasonable random
distribution of points, many points inside the triangle are eliminated at each step. In
fact, for randomly chosen points in a circle the average case cost is linear.
UNIT III DYNAMIC PROGRAMMING AND GREEDY TECHNIQUE 9
Computing a Binomial Coefficient – Warshall's and Floyd's Algorithms – Optimal Binary
Search Trees – Knapsack Problem and Memory functions. Greedy Technique– Prim’s
algorithm- Kruskal’s Algorithm- Dijkstra’s Algorithm-Huffman Trees.

Definition
Dynamic programming (DP) is a general algorithm design technique for solving problems
with overlapping sub-problems. This technique was invented by the American
mathematician Richard Bellman in the 1950s.
Key Idea
The key idea is to save answers of overlapping smaller sub-problems to avoid re-
computation.
Dynamic Programming Properties
• An instance is solved using the solutions for smaller instances.
• The solutions for a smaller instance might be needed multiple times, so store their
results in a table.
• Thus each smaller instance is solved only once.
• Additional space is used to save time.
Dynamic Programming vs. Divide & Conquer
LIKE divide & conquer, dynamic programming solves problems by combining solutions
to sub-problems. UNLIKE divide & conquer, sub-problems are NOT independent in
dynamic programming.
Divide & Conquer:
1. Partitions a problem into independent smaller sub-problems.
2. Doesn't store solutions of sub-problems. (Identical sub-problems may arise, so the
same computations are performed repeatedly.)
3. Top-down algorithms: logically progress from the initial instance down to the
smallest sub-instances via intermediate sub-instances.

Dynamic Programming:
1. Partitions a problem into overlapping sub-problems.
2. Stores solutions of sub-problems, thus avoiding calculation of the same quantity
twice.
3. Bottom-up algorithms: the smallest sub-problems are explicitly solved first, and the
results of these are used to construct solutions to progressively larger sub-instances.
Dynamic Programming vs. Divide & Conquer: EXAMPLE
Computing Fibonacci Numbers
1. Using standard recursive formula:

F(n) = 0                    if n = 0
     = 1                    if n = 1
     = F(n-1) + F(n-2)      if n > 1
Algorithm F(n)
// Computes the nth Fibonacci number recursively by using its definitions
// Input: A non-negative integer n
// Output: The nth Fibonacci number
if n==0 || n==1 then
return n
else
return F(n-1) + F(n-2)
Algorithm F(n): Analysis
• Is too expensive as it has repeated calculation of smaller Fibonacci numbers.
• Exponential order of growth.

F(n) = F(n-1) + F(n-2)
  F(n-1) = F(n-2) + F(n-3)
  F(n-2) = F(n-3) + F(n-4)
(the same sub-instances, such as F(n-2) and F(n-3), are recomputed repeatedly)

Using Dynamic Programming:


Algorithm F(n)
// Computes the nth Fibonacci number by using dynamic programming method
// Input: A non-negative integer n
// Output: The nth Fibonacci number
A[0] ← 0
A[1] ← 1
for i ← 2 to n do
    A[i] ← A[i-1] + A[i-2]
return A[n]
Algorithm F(n): Analysis
• Since it caches previously computed values, saves time from repeated
computations of same sub-instance
• Linear order of growth
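
The same bottom-up idea in C (a sketch; only the last two table entries are kept, since A[i] depends only on A[i-1] and A[i-2]):

#include <stdio.h>

/* Sketch: bottom-up (dynamic programming) Fibonacci. */
long long fib(int n)
{
    if (n <= 1)
        return n;
    long long prev = 0, curr = 1;        /* F(0), F(1) */
    for (int i = 2; i <= n; i++) {
        long long next = prev + curr;    /* F(i) = F(i-1) + F(i-2) */
        prev = curr;
        curr = next;
    }
    return curr;
}

int main(void)
{
    printf("%lld\n", fib(10));   /* prints 55 */
    return 0;
}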
Rules of Dynamic Programming
1. Optimal Sub-Structure: An optimal solution to a problem contains optimal
solutions to sub-problems
2. Overlapping Sub-Problems: A recursive solution contains a “small” number of
distinct sub-problems repeated many times
3. Bottom Up Fashion: Computes the solution in a bottom-up fashion in the final
step
Three basic components of Dynamic Programming solution
The development of a dynamic programming algorithm must have the following three
basic components
1. A recurrence relation
2. A tabular computation
3. A backtracking procedure
Example Problems that can be solved using Dynamic Programming method
1. Computing binomial co-efficient
2. Compute the longest common subsequence
3. Warshall’s algorithm for transitive closure
4. Floyd’s algorithm for all-pairs shortest paths
5. Some instances of difficult discrete optimization problems like
Knapsack Problem
Traveling Salesperson Problem
Binomial Co-efficient
Definition:
The binomial co-efficient C(n,k) is the number of ways of choosing a subset of k
elements from a set of n elements.
1. Factorial definition
For non-negative integers n and k, we have
C(n,k) = n! / (k! (n-k)!) = (n(n-1) ... (n-k+1)) / (k(k-1) ... 1)    if k ∈ {0,1,...,n}
and
C(n,k) = 0    if k > n

2. Recursive definition
C(n,k) = 1                           if k = 0
       = 1                           if n = k
       = C(n-1,k-1) + C(n-1,k)       if n > k > 0

Solution
Using crude Divide & Conquer method we can have the algorithm as follows:
Algorithm binomial(n, k)
if k==0 || k==n
return 1
else
return binomial(n-1, k) + binomial(n-1, k-1)
Algorithm binomial (n, k): Analysis
• Re-computes values a large number of times
• In the worst case (when k = n/2), we have O(2^n/n) efficiency
C(4,2) = C(3,2) + C(3,1)
  C(3,2) = C(2,2) + C(2,1)
  C(3,1) = C(2,1) + C(2,0)
  C(2,1) = C(1,1) + C(1,0)     (computed twice)

Using Dynamic Programming method: This approach stores the value of C(n,k) as they
are computed i.e. Record the values of the binomial co-efficient in a table of n+1 rows
and k+1 columns, numbered from 0 to n and 0 to k respectively.

Table for computing binomials (Pascal's triangle):

        0    1    2    ...    k-1           k
  0     1
  1     1    1
  2     ...
  ...
  n-1                         C(n-1,k-1)    C(n-1,k)
  n                                         C(n,k)

Algorithm binomial(n, k)
// Computes C(n, k) using dynamic programming
// Input: integers n ≥ k ≥ 0
// Output: The value of C(n, k)
for i ← 0 to n do
    for j ← 0 to min(i, k) do
        if j == 0 or j == i then
            A[i, j] ← 1
        else
            A[i, j] ← A[i-1, j-1] + A[i-1, j]
return A[n, k]
Algorithm binomial (n, k): Analysis
• Input size: n, k
• Basic operation: Addition
• Let A(n, k) be the total number of additions made by the algorithm in computing
C(n,k).
• The first k+1 rows of the table form a triangle while the remaining n-k rows form
a rectangle. Therefore we have two parts in A(n,k):
A(n, k) = sum for i = 1 to k of (i - 1) + sum for i = k+1 to n of k
        = (k-1)k/2 + k(n-k) ∈ Θ(nk)
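
A C sketch of this table-based computation (names and the MAXN bound are ours):

#include <stdio.h>

#define MAXN 64

/* Sketch: C(n, k) by dynamic programming, filling Pascal's triangle
   row by row using C(i, j) = C(i-1, j-1) + C(i-1, j). Assumes n < MAXN. */
long long binomial(int n, int k)
{
    long long A[MAXN][MAXN];
    for (int i = 0; i <= n; i++) {
        int jmax = i < k ? i : k;          /* min(i, k) */
        for (int j = 0; j <= jmax; j++) {
            if (j == 0 || j == i)
                A[i][j] = 1;               /* boundary of the triangle */
            else
                A[i][j] = A[i - 1][j - 1] + A[i - 1][j];
        }
    }
    return A[n][k];
}

int main(void)
{
    printf("%lld\n", binomial(4, 2));   /* prints 6 */
    return 0;
}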

Warshall’s Algorithm
• Directed Graph: A graph whose every edge is directed is called directed graph
OR digraph
• Adjacency matrix: The adjacency matrix A = {aij} of a directed graph is the
boolean matrix that has
1 - if there is a directed edge from ith vertex to the jth vertex
0 - Otherwise
• Transitive Closure: Transitive closure of a directed graph with n vertices can be
defined as the n-by-n matrix T={tij}, in which the elements in the ith row (1≤ i ≤
n) and the jth column(1≤ j ≤ n) is 1 if there exists a nontrivial directed path (i.e., a
directed path of a positive length) from the ith vertex to the jth vertex, otherwise
tij is 0.
The transitive closure provides reachability information about a digraph.
Computing Transitive Closure:
• We can perform DFS/BFS starting at each vertex
• Performs traversal starting at the ith vertex.
• Gives information about the vertices reachable from the ith vertex
• Drawback: This method traverses the same graph several times.
• Efficiency: O(n(n+m))
• Alternatively, we can use dynamic programming: Warshall's Algorithm
• Alternatively, we can use dynamic programming: the Warshall’s Algorithm
Underlying idea of Warshall’s algorithm:
• Let A denote the initial boolean matrix.
• The element r(k) [ i, j] in ith row and jth column of matrix Rk (k = 0, 1, …, n) is
equal to 1 if and only if there exists a directed path from ith vertex to jth vertex
with intermediate vertex if any, numbered not higher than k
• Recursive Definition:
• Case 1:
A path from vi to vj restricted to using only vertices from {v1,v2,…,vk} as
intermediate vertices does not use vk, Then
R(k) [ i, j ] = R(k-1) [ i, j ].
• Case 2:
A path from vi to vj restricted to using only vertices from {v1,v2,…,vk} as
intermediate vertices do use vk. Then
R(k) [ i, j ] = R(k-1) [ i, k ] AND R(k-1) [ k, j ].
We conclude:
R(k)[ i, j ] = R(k-1) [ i, j ] OR (R(k-1) [ i, k ] AND R(k-1) [ k, j ] )
NOTE:
• If an element rij is 1 in R(k-1), it remains 1 in R(k)
• If an element rij is 0 in R(k-1), it has to be changed to 1 in R(k) if and only if the
element in its row i and column k and the element in its column j and row k are
both 1's in R(k-1)

Algorithm Warshall(A[1..n, 1..n])
// Computes transitive closure matrix
// Input: Adjacency matrix A
// Output: Transitive closure matrix R
R(0) ← A
for k ← 1 to n do
    for i ← 1 to n do
        for j ← 1 to n do
            R(k)[i, j] ← R(k-1)[i, j] OR (R(k-1)[i, k] AND R(k-1)[k, j])
return R(n)
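
A C sketch of the same computation, done in place on a boolean adjacency matrix (names ours; updating R in place is safe because entries only ever change from 0 to 1):

#include <stdio.h>

#define N 4

/* Sketch: Warshall's algorithm; r[i][j] becomes 1 iff there is a
   nontrivial directed path from vertex i to vertex j. */
void warshall(int r[N][N])
{
    for (int k = 0; k < N; k++)
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                r[i][j] = r[i][j] || (r[i][k] && r[k][j]);
}

int main(void)
{
    /* adjacency matrix of the example digraph below (A,B,C,D = 0..3) */
    int r[N][N] = {{0,0,1,0}, {1,0,0,1}, {0,0,0,0}, {0,1,0,0}};
    warshall(r);
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++)
            printf("%d ", r[i][j]);
        printf("\n");
    }
    return 0;
}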
Example:

Find the transitive closure for the given digraph using Warshall's algorithm.

(Digraph on four vertices A, B, C, D with edges A→C, B→A, B→D, and D→B.)

Solution:

R(0):
      A B C D
   A  0 0 1 0
   B  1 0 0 1
   C  0 0 0 0
   D  0 1 0 0

k = 1 (vertex 1, i.e. A, can be an intermediate node):

R(1):
      A B C D
   A  0 0 1 0
   B  1 0 1 1
   C  0 0 0 0
   D  0 1 0 0

R1[2,3] = R0[2,3] OR (R0[2,1] AND R0[1,3]) = 0 OR (1 AND 1) = 1

k = 2 (vertices {1,2} can be intermediate nodes):

R(2):
      A B C D
   A  0 0 1 0
   B  1 0 1 1
   C  0 0 0 0
   D  1 1 1 1

R2[4,1] = R1[4,1] OR (R1[4,2] AND R1[2,1]) = 0 OR (1 AND 1) = 1
R2[4,3] = R1[4,3] OR (R1[4,2] AND R1[2,3]) = 0 OR (1 AND 1) = 1
R2[4,4] = R1[4,4] OR (R1[4,2] AND R1[2,4]) = 0 OR (1 AND 1) = 1

k = 3 (vertices {1,2,3} can be intermediate nodes):

R(3) = R(2)   (NO CHANGE, since row C is all zeros)

k = 4 (vertices {1,2,3,4} can be intermediate nodes):

R(4):
      A B C D
   A  0 0 1 0
   B  1 1 1 1
   C  0 0 0 0
   D  1 1 1 1

R4[2,2] = R3[2,2] OR (R3[2,4] AND R3[4,2]) = 0 OR (1 AND 1) = 1

R(4) is the TRANSITIVE CLOSURE for the given graph.

Efficiency:
• Time efficiency is Θ(n3)
• Space efficiency: Requires extra space for separate matrices for recording
intermediate results of the algorithm.

All Pair Shortest Path Algorithm (Floyd's Algorithm)

 The all-pairs shortest path algorithm was invented by Robert Floyd, so it is
called Floyd's algorithm.
 This algorithm is useful for finding the shortest path between every pair of
vertices of a graph.
 It works for both undirected and directed graphs.

Weighted Graph

 A weighted graph is a graph in which a weight (or distance) is given along
each edge.
 The weighted graph can be represented by a weight matrix:
W[i][j] = 0                       if i = j
W[i][j] = ∞                       if there is no edge between vertices i and j
W[i][j] = weight of the edge      otherwise

Concepts of Floyd’s Algorithm

 Floyd’s algorithm is for computing shortest path between every pair of


vertices of a graph.
 The graph may contain negative edges but it should not contain negative
cycles.
 This algorithm requires a weighted graph.
 The floyd’s algorithm computes the distance matrix of a weighted graph with
‘n’ vertices through a series of n by n matrices.

D(0), D(1), D(2), ……………., D(k-1),………. D(n)

 The series starts with D(0), with no intermediate vertex: D(0) is the weight
matrix, in which D[i][j] is the weight of the direct edge between Vi and Vj.
 The D(1) matrix gives the shortest distances using at most one intermediate
vertex (paths of length at most two edges). Continuing in this fashion,
we compute D(n), containing the lengths of the shortest paths among all paths
that can use all n vertices as intermediates.
 Finally, we get all-pairs shortest paths from the matrix D(n).

Example

Find the all-pairs shortest paths for the following graph.

(Digraph on vertices 1, 2, 3 with edges 1→2 of weight 8, 1→3 of weight 5,
2→1 of weight 2, and 3→2 of weight 1.)

Step 1:

 First compute the weighted matrix with no intermediate vertex.

1 2 3
(0)
D 1 0 8 5
2 2 0 ∞
3 ∞ 1 0

 D(0) is the weighted matrix (or) adjacency matrix for the given graph.

Step 2:

 Now the node 1 will be considered as the intermediate node. D(1) can be
calculated by using the intermediate vertex as 1.

1 2 3
D(1) 1 0 8 5
2 2 0 7
3 ∞ 1 0
 In this matrix there is no edge between vertices 2 and 3. But using the
intermediate vertex 1 we can travel from 2 to 3; the shortest path is
2 → 1 → 3 = 2 + 5 = 7.
 So we replace the entry in the 2nd row, 3rd column by 7.

Step 3:

 Now the node 2 will be considered as the intermediate node. D(2) can be
calculated by using the intermediate vertices 1 and 2.

1 2 3
D(2) 1 0 8 5
2 2 0 7
3 3 1 0

 The shortest path between vertices 3 and 1 is 3 → 2 → 1 = 1 + 2 = 3, using the
intermediate vertex 2.

Step 4:

 Now the node 3 will be considered as the intermediate node. D(3) can be
calculated by using the intermediate vertices 1, 2 and 3.

1 2 3
D(3) 1 0 6 5
2 2 0 7
3 3 1 0

 The shortest path between 1 and 2 is:
1 → 2 = 8
1 → 3 → 2 = 5 + 1 = 6
 The matrix D(3) is representing shortest path for all pair of vertices.

Algorithm

Algorithm AllPair(cost, A, n)
Begin
    for i = 1 to n do
        for j = 1 to n do
            A[i,j] = cost[i,j]
        end for
    end for
    for k = 1 to n do
        for i = 1 to n do
            for j = 1 to n do
                A[i,j] = min(A[i,j], A[i,k] + A[k,j])
            end for
        end for
    end for
End
Analysis
 The basic operation is
D [ i , j ] = min { D [ i , j ] , D [ i , k ] + D [ k , j ] }
 It has three nested for loops
C(n) = sum for k = 1 to n, sum for j = 1 to n, sum for i = 1 to n of 1 = n^3
 The time complexity of finding all pairs shortest paths is Θ(n^3).
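
A C sketch of the algorithm, run on the 3-vertex example above (INF is our stand-in for ∞; names ours):

#include <stdio.h>

#define N 3
#define INF 1000000   /* stands for "no edge" */

/* Sketch: Floyd's all-pairs shortest paths, updating D in place. */
void floyd(int d[N][N])
{
    for (int k = 0; k < N; k++)
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                if (d[i][k] + d[k][j] < d[i][j])
                    d[i][j] = d[i][k] + d[k][j];
}

int main(void)
{
    /* weight matrix D(0) of the example graph */
    int d[N][N] = {{0, 8, 5}, {2, 0, INF}, {INF, 1, 0}};
    floyd(d);
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++)
            printf("%d ", d[i][j]);
        printf("\n");
    }
    /* prints the matrix D(3):  0 6 5 / 2 0 7 / 3 1 0 */
    return 0;
}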

0/1 Knapsack Problem

Definition
Given a set of n items of known weights w1,…,wn and values v1,…,vn and a knapsack of
capacity W, the problem is to find the most valuable subset of the items that fit into the
knapsack.
Knapsack problem is an OPTIMIZATION PROBLEM

Dynamic programming approach to solve knapsack problem


Step 1:
Identify the smaller sub-problems. If items are labeled 1..n, then a sub-problem would be to
find an optimal solution for Sk = {items labeled 1, 2, .. k}

Step 2:
Recursively define the value of an optimal solution in terms of solutions to smaller
problems.
Initial conditions:
V[ 0, j ] = 0 for j ≥ 0

V[ i, 0 ] = 0 for i ≥ 0

Recursive step:
max { V[ i-1, j ], vi +V[ i-1, j - wi ] }
V[ i, j ] = if j - wi ≥ 0
V[ i-1, j ] if j - wi < 0

Step 3:
Bottom up computation using iteration
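
A C sketch of this bottom-up computation (names and array bounds are ours); it reproduces the value found in the example that follows:

#include <stdio.h>

#define MAX_N 16
#define MAX_W 16

/* Sketch: bottom-up 0/1 knapsack. w[], v[] are 1-indexed to match the
   recurrence; V[i][j] = best value using items 1..i with capacity j. */
int knapsack(int n, int W, const int w[], const int v[])
{
    int V[MAX_N][MAX_W] = {0};           /* row i=0 and column j=0 stay 0 */
    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= W; j++) {
            if (j - w[i] >= 0) {
                int take = v[i] + V[i - 1][j - w[i]];
                V[i][j] = take > V[i - 1][j] ? take : V[i - 1][j];
            } else {
                V[i][j] = V[i - 1][j];
            }
        }
    }
    return V[n][W];
}

int main(void)
{
    int w[] = {0, 2, 3, 4, 5};   /* index 0 unused */
    int v[] = {0, 3, 4, 5, 6};
    printf("%d\n", knapsack(4, 5, w, v));   /* prints 7 */
    return 0;
}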

Example

Apply bottom-up dynamic programming algorithm to the following instance of the


knapsack problem Capacity W= 5

Item # Weight (Kg) Value (Rs.)

1 2 3
2 3 4
3 4 5
4 5 6

Solution:
Using dynamic programming approach, we have:
Step Calculation Table
1 Initial conditions:
V[ 0, j ] = 0 for j ≥ 0 V[i,j] j=0 1 2 3 4 5
V[ i, 0 ] = 0 for i ≥ 0 i=0 0 0 0 0 0 0
1 0
2 0
3 0
4 0
2 W1 = 2,
Available knapsack capacity = 1 V[i,j] j=0 1 2 3 4 5
W1 > WA, CASE 1 holds: i=0 0 0 0 0 0 0
V[ i, j ] = V[ i-1, j ] 1 0 0
V[ 1,1] = V[ 0, 1 ] = 0 2 0
3 0
4 0
3 W1 = 2,
Available knapsack capacity = 2 V[i,j] j=0 1 2 3 4 5
W1 = WA, CASE 2 holds: i=0 0 0 0 0 0 0
V[ i, j ] = max { V[ i-1, j ], 1 0 0 3
vi +V[ i-1, j - wi ] } 2 0
V[ 1,2] = max { V[ 0, 2 ], 3 0
3 +V[ 0, 0 ] }
4 0
= max { 0, 3 + 0 } = 3
4 W1 = 2,
Available knapsack capacity = V[i,j] j=0 1 2 3 4 5
3,4,5 i=0 0 0 0 0 0 0
W1 < WA, CASE 2 holds: 1 0 0 3 3 3 3
V[ i, j ] = max { V[ i-1, j ], 2 0
vi +V[ i-1, j - wi ] } 3 0
V[ 1,3] = max { V[ 0, 3 ],
4 0
3 +V[ 0, 1 ] }
= max { 0, 3 + 0 } = 3
5 W2 = 3,
Available knapsack capacity = 1 V[i,j] j=0 1 2 3 4 5
W2 >WA, CASE 1 holds: i=0 0 0 0 0 0 0
V[ i, j ] = V[ i-1, j ] 1 0 0 3 3 3 3
V[ 2,1] = V[ 1, 1 ] = 0 2 0 0
3 0
4 0
6 W2 = 3,
Available knapsack capacity = 2 V[i,j] j=0 1 2 3 4 5
W2 >WA, CASE 1 holds: i=0 0 0 0 0 0 0
V[ i, j ] = V[ i-1, j ] 1 0 0 3 3 3 3
V[ 2,2] = V[ 1, 2 ] = 3 2 0 0 3
3 0
4 0
7 W2 = 3,
Available knapsack capacity = 3 V[i,j] j=0 1 2 3 4 5
W2 = WA, CASE 2 holds: i=0 0 0 0 0 0 0
V[ i, j ] = max { V[ i-1, j ], 1 0 0 3 3 3 3
vi +V[ i-1, j - wi ] } 2 0 0 3 4
V[ 2,3] = max { V[ 1, 3 ], 3 0
4 +V[ 1, 0 ] } 4 0
= max { 3, 4 + 0 } = 4
8 W2 = 3,
Available knapsack capacity = 4 V[i,j] j=0 1 2 3 4 5
W2 < WA, CASE 2 holds: i=0 0 0 0 0 0 0
V[ i, j ] = max { V[ i-1, j ], 1 0 0 3 3 3 3
vi +V[ i-1, j - wi ] } 2 0 0 3 4 4
V[ 2,4] = max { V[ 1, 4 ], 3 0
4 +V[ 1, 1 ] } 4 0
= max { 3, 4 + 0 } = 4

9 W2 = 3,
Available knapsack capacity = 5 V[i,j] j=0 1 2 3 4 5
W2 < WA, CASE 2 holds: i=0 0 0 0 0 0 0
V[ i, j ] = max { V[ i-1, j ], 1 0 0 3 3 3 3
vi +V[ i-1, j - wi ] } 2 0 0 3 4 4 7
V[ 2,5] = max { V[ 1, 5 ], 3 0
4 +V[ 1, 2 ] }
4 0
= max { 3, 4 + 3 } = 7

10 W3 = 4,
Available knapsack capacity = V[i,j] j=0 1 2 3 4 5
1,2,3 i=0 0 0 0 0 0 0
W3 > WA, CASE 1 holds: 1 0 0 3 3 3 3
V[ i, j ] = V[ i-1, j ] 2 0 0 3 4 4 7
3 0 0 3 4
4 0
11 W3 = 4,
Available knapsack capacity = 4 V[i,j] j=0 1 2 3 4 5
W3 = WA, CASE 2 holds: i=0 0 0 0 0 0 0
V[ i, j ] = max { V[ i-1, j ], 1 0 0 3 3 3 3
vi +V[ i-1, j - wi ] } 2 0 0 3 4 4 7
V[ 3,4] = max { V[ 2, 4 ], 3 0 0 3 4 5
5 +V[ 2, 0 ] }
4 0
= max { 4, 5 + 0 } = 5

12 W3 = 4,
Available knapsack capacity = 5 V[i,j] j=0 1 2 3 4 5
W3 < WA, CASE 2 holds: i=0 0 0 0 0 0 0
V[ i, j ] = max { V[ i-1, j ], 1 0 0 3 3 3 3
vi +V[ i-1, j - wi ] } 2 0 0 3 4 4 7
V[ 3,5] = max { V[ 2, 5 ], 3 0 0 3 4 5 7
5 +V[ 2, 1 ] }
4 0
= max { 7, 5 + 0 } = 7

13 W4 = 5,
Available knapsack capacity = V[i,j] j=0 1 2 3 4 5
1,2,3,4 i=0 0 0 0 0 0 0
W4 > WA, CASE 1 holds: 1 0 0 3 3 3 3
V[ i, j ] = V[ i-1, j ] 2 0 0 3 4 4 7
3 0 0 3 4 5 7
4 0 0 3 4 5
14 W4 = 5,
Available knapsack capacity = 5 V[i,j] j=0 1 2 3 4 5
W4 = WA, CASE 2 holds: i=0 0 0 0 0 0 0
V[ i, j ] = max { V[ i-1, j ], 1 0 0 3 3 3 3
vi +V[ i-1, j - wi ] } 2 0 0 3 4 4 7
V[ 4,5] = max { V[ 3, 5 ], 3 0 0 3 4 5 7
6 +V[ 3, 0 ] }
4 0 0 3 4 5 7
= max { 7, 6 + 0 } = 7

Maximal value is V [ 4, 5 ] = 7/-

What is the composition of the optimal subset?


The composition of the optimal subset is found by tracing back the computations
for the entries in the table.
Step Table Remarks
1
V[i,j] j=0 1 2 3 4 5 V[ 4, 5 ] = V[ 3, 5 ]
i=0 0 0 0 0 0 0
1 0 0 3 3 3 3 ITEM 4 NOT included in the
2 0 0 3 4 4 7 subset
3 0 0 3 4 5 7
4 0 0 3 4 5 7
2
V[i,j] j=0 1 2 3 4 5 V[ 3, 5 ] = V[ 2, 5 ]
i=0 0 0 0 0 0 0
1 0 0 3 3 3 3 ITEM 3 NOT included in the
2 0 0 3 4 4 7 subset
3 0 0 3 4 5 7
4 0 0 3 4 5 7
3
V[i,j] j=0 1 2 3 4 5 V[ 2, 5 ] ≠ V[ 1, 5 ]
i=0 0 0 0 0 0 0
1 0 0 3 3 3 3 ITEM 2 included in the subset
2 0 0 3 4 4 7
3 0 0 3 4 5 7
4 0 0 3 4 5 7
4 Since item 2 is included in the knapsack:
Weight of item 2 is 3kg, therefore,
remaining capacity of the knapsack is
(5 - 3 =) 2kg V[ 1, 2 ] ≠ V[ 0, 2 ]

V[i,j] j=0 1 2 3 4 5 ITEM 1 included in the subset


i=0 0 0 0 0 0 0
1 0 0 3 3 3 3
2 0 0 3 4 4 7
3 0 0 3 4 5 7
4 0 0 3 4 5 7
5 Since item 1 is included in the knapsack: Optimal subset: { item 1, item 2 }
Weight of item 1 is 2kg, therefore,
remaining capacity of the knapsack is Total weight is: 5kg (2kg + 3kg)
(2 - 2 =) 0 kg. Total profit is: 7/- (3/- + 4/-)

Efficiency:
• Running time of Knapsack problem using dynamic programming algorithm is: O(
n*W)
• Time needed to find the composition of an optimal solution is: O( n + W )
Memory Function
• Memory function combines the strength of top-down and bottom-up approaches
• It solves ONLY sub-problems that are necessary and does it ONLY ONCE.
The method:
• Uses top-down manner.
• Maintains table as in bottom-up approach.
• Initially, all the table entries are initialized with special “null” symbol to indicate
that they have not yet been calculated.
• Whenever a new value needs to be calculated, the method checks the
corresponding entry in the table first:
• If entry is NOT “null”, it is simply retrieved from the table.
• Otherwise, it is computed by the recursive call whose result is then recorded in the
table.
Algorithm:
Algorithm MFKnap(i, j)
if V[i, j] < 0
    if j < Weights[i]
        value ← MFKnap(i-1, j)
    else
        value ← max { MFKnap(i-1, j), Values[i] + MFKnap(i-1, j - Weights[i]) }
    V[i, j] ← value
return V[i, j]
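
The same scheme in C (a sketch; we use -1 as the "null" marker and global tables for brevity):

#include <stdio.h>

#define MAX_N 16
#define MAX_W 16

int Weights[] = {0, 2, 3, 4, 5};   /* index 0 unused */
int Values[]  = {0, 3, 4, 5, 6};
int V[MAX_N][MAX_W];               /* -1 means "not yet computed" */

/* Sketch: memory function for the knapsack problem. Solves only the
   sub-problems that are actually needed, each at most once. */
int MFKnap(int i, int j)
{
    if (V[i][j] < 0) {                       /* entry is still "null" */
        int value;
        if (j < Weights[i])
            value = MFKnap(i - 1, j);
        else {
            int skip = MFKnap(i - 1, j);
            int take = Values[i] + MFKnap(i - 1, j - Weights[i]);
            value = take > skip ? take : skip;
        }
        V[i][j] = value;
    }
    return V[i][j];
}

int main(void)
{
    for (int i = 0; i < MAX_N; i++)
        for (int j = 0; j < MAX_W; j++)
            V[i][j] = (i == 0 || j == 0) ? 0 : -1;   /* initial conditions */
    printf("%d\n", MFKnap(4, 5));   /* prints 7 */
    return 0;
}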
Example:
Apply memory function method to the following instance of the knapsack problem
Capacity W= 5

Item # Weight (Kg) Value (Rs.)

1 2 3
2 3 4
3 4 5
4 5 6
Computation and Remarks

Step 1: Initially, all the table entries are initialized with the special “null”
symbol to indicate that they have not yet been calculated; here null is indicated
with the value -1 (row i=0 and column j=0 are 0).

V[i,j]   j=0   1   2   3   4   5
i=0        0   0   0   0   0   0
1          0  -1  -1  -1  -1  -1
2          0  -1  -1  -1  -1  -1
3          0  -1  -1  -1  -1  -1
4          0  -1  -1  -1  -1  -1

Step 2: MFKnap( 4, 5 ) expands into max { MFKnap( 3, 5 ), 6 + MFKnap( 3, 0 ) };
MFKnap( 3, 5 ) into max { MFKnap( 2, 5 ), 5 + MFKnap( 2, 1 ) }; MFKnap( 2, 5 )
into max { MFKnap( 1, 5 ), 4 + MFKnap( 1, 2 ) }; and MFKnap( 1, 5 ) into
max { MFKnap( 0, 5 ), 3 + MFKnap( 0, 3 ) } = max { 0, 3 + 0 } = 3.
Recorded: V[ 1, 5 ] = 3

V[i,j]   j=0   1   2   3   4   5
i=0        0   0   0   0   0   0
1          0  -1  -1  -1  -1   3
2          0  -1  -1  -1  -1  -1
3          0  -1  -1  -1  -1  -1
4          0  -1  -1  -1  -1  -1

Step 3: MFKnap( 1, 2 ) = max { MFKnap( 0, 2 ), 3 + MFKnap( 0, 0 ) } = max { 0, 3 + 0 } = 3.
Recorded: V[ 1, 2 ] = 3

V[i,j]   j=0   1   2   3   4   5
i=0        0   0   0   0   0   0
1          0  -1   3  -1  -1   3
2          0  -1  -1  -1  -1  -1
3          0  -1  -1  -1  -1  -1
4          0  -1  -1  -1  -1  -1

Step 4: MFKnap( 2, 5 ) = max { MFKnap( 1, 5 ), 4 + MFKnap( 1, 2 ) } = max { 3, 4 + 3 } = 7.
Recorded: V[ 2, 5 ] = 7

V[i,j]   j=0   1   2   3   4   5
i=0        0   0   0   0   0   0
1          0  -1   3  -1  -1   3
2          0  -1  -1  -1  -1   7
3          0  -1  -1  -1  -1  -1
4          0  -1  -1  -1  -1  -1

Step 5: MFKnap( 2, 1 ) = MFKnap( 1, 1 ) = MFKnap( 0, 1 ) = 0, so V[ 1, 1 ] = 0 and
V[ 2, 1 ] = 0. Then MFKnap( 3, 5 ) = max { 7, 5 + 0 } = 7.
Recorded: V[ 3, 5 ] = 7

V[i,j]   j=0   1   2   3   4   5
i=0        0   0   0   0   0   0
1          0   0   3  -1  -1   3
2          0   0  -1  -1  -1   7
3          0  -1  -1  -1  -1   7
4          0  -1  -1  -1  -1  -1

Step 6: MFKnap( 4, 5 ) = max { MFKnap( 3, 5 ), 6 + MFKnap( 3, 0 ) } = max { 7, 6 + 0 } = 7.
Recorded: V[ 4, 5 ] = 7

V[i,j]   j=0   1   2   3   4   5
i=0        0   0   0   0   0   0
1          0   0   3  -1  -1   3
2          0   0  -1  -1  -1   7
3          0  -1  -1  -1  -1   7
4          0  -1  -1  -1  -1   7

The composition of the optimal subset is found by tracing back the computations
for the entries in the table, as done with the earlier knapsack problem.

Conclusion:
Optimal subset: { item 1, item 2 }
Total weight: 5 kg ( 2 kg + 3 kg )
Total profit: Rs. 7 ( Rs. 3 + Rs. 4 )

Efficiency:
• Time efficiency is the same as that of the bottom-up algorithm: O( n * W ) + O( n + W )
• Only a constant-factor gain is obtained by using the memory function
• Less space efficient than a space-efficient version of the bottom-up algorithm

Optimal Binary Search Tree
 Suppose we are searching for words in a dictionary. If for every required word we
scan the whole dictionary, the lookup becomes a time-consuming process.
 To perform this lookup more efficiently, we can build a binary search tree of
common words as key elements.
 We can make the binary search tree efficient by arranging frequently used words
nearer to the root and less frequently used words away from the root.
 Searching such an arrangement of the BST is both simpler and more efficient.
 The optimal binary search tree technique was invented for this purpose.
 An element having a higher probability of being searched should be placed nearer to
the root of the BST.
 An element with a lower probability should be placed away from the root.
 The BST created with such an arrangement is called an Optimal Binary
Search Tree (OBST).
 Let [ a1, a2, …, an ] be the set of identifiers such that a1 ≤ a2 ≤ … ≤ an.
 Let P(i) be the probability with which we search for ai ( successful search ).
 Let q(i) be the probability of searching for an element x such that ai < x < ai+1,
where 0 ≤ i ≤ n ( unsuccessful search ).
 The tree built with optimum cost from Σ (i=1 to n) P(i) and Σ (i=0 to n) q(i) is
called the OBST.
 To obtain the OBST for the key values using dynamic programming, we compute
the cost of the tree C[ i, j ] and the root of the tree R[ i, j ].
 Formula for calculating C[ i, j ]:

      C[ i, j ] = min { C[ i, k-1 ] + C[ k+1, j ] : i ≤ k ≤ j } + Σ (s=i to j) P(s),   where 1 ≤ i ≤ j ≤ n

 Assume that
      C[ i, i-1 ] = 0 for all i ranging from 1 to n + 1
      C[ i, i ] = P(i) where 1 ≤ i ≤ n
 The Optimal Binary Search Tree computation uses two tables:
o Cost table C
o Root table R
 The cost table should be constructed in this fashion (assume n = 3):

        j=0    1     2     3      ( j ranging from 0 to n )
 i=1     0    P1
   2           0    P2
   3                 0    P3
   4                       0      ( i ranging from 1 to n+1 )

 The root table should be constructed in this fashion:

        j=1   2    3
 i=1     1
   2          2
   3               3

 Fill up R[ i, i ] with i.
 Fill up R[ i, j ] with the k value that yields the minimum cost.
 The tables should be filled up diagonally.
Example
Find the OBST for the following nodes
do if int while
Probabilities : 0.1 0.2 0.4 0.3
Solution
First, number the nodes ( do, if, int, while ) as 1, 2, 3, 4 respectively.
There are 4 nodes. So n=4
1 2 3 4
Probabilities : 0.1 0.2 0.4 0.3

Step 1
Initial Cost Table

        j=0    1     2     3     4
 i=1     0    0.1
   2           0    0.2
   3                 0    0.4
   4                       0    0.3
   5                             0

C[ 1, 0 ] = 0
C[ 2, 1 ] = 0
C[ 3, 2 ] = 0        C[ i, i-1 ] = 0  &  C[ n+1, n ] = 0
C[ 4, 3 ] = 0
C[ 5, 4 ] = 0

C[ 1, 1 ] = 0.1
C[ 2, 2 ] = 0.2      C[ i, i ] = P(i)
C[ 3, 3 ] = 0.4
C[ 4, 4 ] = 0.3

Initial Root Table

        j=1   2    3    4
 i=1     1
   2          2
   3               3
   4                    4

R[ 1, 1 ] = 1
R[ 2, 2 ] = 2        R[ i, i ] = i
R[ 3, 3 ] = 3
R[ 4, 4 ] = 4
Compute C[ i, j ]:

      C[ i, j ] = min { C[ i, k-1 ] + C[ k+1, j ] : i ≤ k ≤ j } + Σ (s=i to j) P(s),   where 1 ≤ i ≤ j ≤ n
Step 2

Compute C[ 1, 2 ]:  k can be either 1 (or) 2;  here i = 1, j = 2

when k = 1:  C[ 1, 0 ] + C[ 2, 2 ] + P(1) + P(2) = 0 + 0.2 + 0.1 + 0.2 = 0.5
when k = 2:  C[ 1, 1 ] + C[ 3, 2 ] + P(1) + P(2) = 0.1 + 0 + 0.1 + 0.2 = 0.4

C[ 1, 2 ] = 0.4 with k = 2, so the tables get C[ 1, 2 ] = 0.4 and R[ 1, 2 ] = 2

Compute C[ 2, 3 ]:  k can be either 2 (or) 3;  here i = 2, j = 3

when k = 2:  C[ 2, 1 ] + C[ 3, 3 ] + P(2) + P(3) = 0 + 0.4 + 0.2 + 0.4 = 1.0
when k = 3:  C[ 2, 2 ] + C[ 4, 3 ] + P(2) + P(3) = 0.2 + 0 + 0.2 + 0.4 = 0.8

C[ 2, 3 ] = 0.8 with k = 3, so the tables get C[ 2, 3 ] = 0.8 and R[ 2, 3 ] = 3

Compute C[ 3, 4 ]:  k can be either 3 (or) 4;  here i = 3, j = 4

when k = 3:  C[ 3, 2 ] + C[ 4, 4 ] + P(3) + P(4) = 0 + 0.3 + 0.4 + 0.3 = 1.0
when k = 4:  C[ 3, 3 ] + C[ 5, 4 ] + P(3) + P(4) = 0.4 + 0 + 0.4 + 0.3 = 1.1

C[ 3, 4 ] = 1.0 with k = 3, so the tables get C[ 3, 4 ] = 1.0 and R[ 3, 4 ] = 3

Now the cost table and root table are updated as:

Cost table                                  Root table
        j=0    1     2     3     4                 j=1   2    3    4
 i=1     0    0.1   0.4                     i=1     1    2
   2           0    0.2   0.8                 2          2    3
   3                 0    0.4   1.0           3               3    3
   4                       0    0.3           4                    4
   5                             0
Step 3

Compute C[ 1, 3 ]:  k can be either 1, 2 (or) 3;  here i = 1, j = 3

when k = 1:  C[ 1, 0 ] + C[ 2, 3 ] + P(1) + P(2) + P(3) = 0 + 0.8 + 0.1 + 0.2 + 0.4 = 1.5
when k = 2:  C[ 1, 1 ] + C[ 3, 3 ] + P(1) + P(2) + P(3) = 0.1 + 0.4 + 0.1 + 0.2 + 0.4 = 1.2
when k = 3:  C[ 1, 2 ] + C[ 4, 3 ] + P(1) + P(2) + P(3) = 0.4 + 0 + 0.1 + 0.2 + 0.4 = 1.1

C[ 1, 3 ] = 1.1 with k = 3, so the tables get C[ 1, 3 ] = 1.1 and R[ 1, 3 ] = 3

Compute C[ 2, 4 ]:  k can be either 2, 3 (or) 4;  here i = 2, j = 4

when k = 2:  C[ 2, 1 ] + C[ 3, 4 ] + P(2) + P(3) + P(4) = 0 + 1.0 + 0.2 + 0.4 + 0.3 = 1.9
when k = 3:  C[ 2, 2 ] + C[ 4, 4 ] + P(2) + P(3) + P(4) = 0.2 + 0.3 + 0.2 + 0.4 + 0.3 = 1.4
when k = 4:  C[ 2, 3 ] + C[ 5, 4 ] + P(2) + P(3) + P(4) = 0.8 + 0 + 0.2 + 0.4 + 0.3 = 1.7

C[ 2, 4 ] = 1.4 with k = 3, so the tables get C[ 2, 4 ] = 1.4 and R[ 2, 4 ] = 3

Now the cost table and root table are updated as:

Cost table                                  Root table
        j=0    1     2     3     4                 j=1   2    3    4
 i=1     0    0.1   0.4   1.1               i=1     1    2    3
   2           0    0.2   0.8   1.4           2          2    3    3
   3                 0    0.4   1.0           3               3    3
   4                       0    0.3           4                    4
   5                             0
Step 4

Compute C[ 1, 4 ]:  k can be either 1, 2, 3 (or) 4;  here i = 1, j = 4
( note that P(1) + P(2) + P(3) + P(4) = 0.1 + 0.2 + 0.4 + 0.3 = 1.0 )

when k = 1:  C[ 1, 0 ] + C[ 2, 4 ] + P(1) + P(2) + P(3) + P(4) = 0 + 1.4 + 1.0 = 2.4
when k = 2:  C[ 1, 1 ] + C[ 3, 4 ] + P(1) + P(2) + P(3) + P(4) = 0.1 + 1.0 + 1.0 = 2.1
when k = 3:  C[ 1, 2 ] + C[ 4, 4 ] + P(1) + P(2) + P(3) + P(4) = 0.4 + 0.3 + 1.0 = 1.7
when k = 4:  C[ 1, 3 ] + C[ 5, 4 ] + P(1) + P(2) + P(3) + P(4) = 1.1 + 0 + 1.0 = 2.1

C[ 1, 4 ] = 1.7 with k = 3, so the tables get C[ 1, 4 ] = 1.7 and R[ 1, 4 ] = 3

Now the cost table and root table are updated as:

Cost table                                  Root table
        j=0    1     2     3     4                 j=1   2    3    4
 i=1     0    0.1   0.4   1.1   1.7         i=1     1    2    3    3
   2           0    0.2   0.8   1.4           2          2    3    3
   3                 0    0.4   1.0           3               3    3
   4                       0    0.3           4                    4
   5                             0
Step 5
To build the tree, R[ 1, n ] = R[ 1, 4 ] = 3 becomes the root.

  1    2    3    4
  do   if   int  while

Since k = 3, "int" becomes the root of the OBST. In general, key k becomes the root
of subtree T( i, j ), with T( i, k-1 ) as its left subtree and T( k+1, j ) as its
right subtree.

Here i = 1, j = 4 and k = 3:
  R( 1, 4 ) = 3, so the root is key 3 ( int )
  left subtree: R( 1, 2 ) = 2, so key 2 ( if ); its left child is R( 1, 1 ) = 1, key 1 ( do )
  right subtree: R( 4, 4 ) = 4, so key 4 ( while )

The resulting OBST:

            int
           /    \
         if      while
        /
      do

The tree is built with optimum cost C[ 1, 4 ] = 1.7.

Algorithm
Algorithm OBST( P[1..n] )
Begin
    for i <- 1 to n do
        C[ i, i-1 ] <- 0
        C[ i, i ]   <- P[i]
        R[ i, i ]   <- i
    end for
    C[ n+1, n ] <- 0
    for d <- 1 to n-1 do                  // d = j - i: fill the tables diagonally
        for i <- 1 to n-d do
            j <- i + d
            min <- ∞
            for k <- i to j do
                if ( C[ i, k-1 ] + C[ k+1, j ] ) < min then
                    min  <- C[ i, k-1 ] + C[ k+1, j ]
                    kmin <- k
                end if
            end for
            R[ i, j ] <- kmin
            sum <- P[i]
            for s <- i+1 to j do
                sum <- sum + P[s]
            end for
            C[ i, j ] <- min + sum
        end for
    end for
    write the cost table C[ 1..n+1, 0..n ] and the root table R[ 1..n, 1..n ]
End
Analysis
 The basic operation of the OBST algorithm is the computation of C[ i, j ] by finding
the minimum-valued k. This basic operation is located within three nested for loops;
hence the time complexity is O(n³).
Time Complexity – O(n³)
Space Complexity – O(n²)
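
A runnable Python sketch of this dynamic program (function name and the 1-based
padding convention are ours); it reproduces the cost 1.7 and root 3 computed above:

import math

def optimal_bst(p):
    # p[1..n] are the success probabilities; p[0] is unused padding
    n = len(p) - 1
    C = [[0.0] * (n + 2) for _ in range(n + 2)]
    R = [[0] * (n + 2) for _ in range(n + 2)]
    for i in range(1, n + 1):
        C[i][i] = p[i]                   # C[i, i] = P(i); C[i, i-1] stays 0
        R[i][i] = i
    for d in range(1, n):                # fill diagonal d = j - i
        for i in range(1, n - d + 1):
            j = i + d
            best, kmin = math.inf, i
            for k in range(i, j + 1):    # try every key k as the root of T(i, j)
                cost = C[i][k - 1] + C[k + 1][j]
                if cost < best:
                    best, kmin = cost, k
            C[i][j] = best + sum(p[i:j + 1])
            R[i][j] = kmin
    return C, R

# Example from above: do, if, int, while with probabilities 0.1, 0.2, 0.4, 0.3
C, R = optimal_bst([0, 0.1, 0.2, 0.4, 0.3])
print(round(C[1][4], 1), R[1][4])   # 1.7 3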
Greedy Technique
A greedy algorithm is an algorithm that follows the problem-solving heuristic of making the
locally optimal choice at each stage with the hope of finding a global optimum. In many
problems, a greedy strategy does not in general produce an optimal solution, but nonetheless a
greedy heuristic may yield locally optimal solutions that approximate a global optimal solution
in a reasonable time.
For example, a greedy strategy for the traveling salesman problem (which is of a high
computational complexity) is the following heuristic: "At each stage visit an unvisited city
nearest to the current city". This heuristic need not find a best solution but terminates in a
reasonable number of steps; finding an optimal solution typically requires unreasonably many
steps. In mathematical optimization, greedy algorithms solve combinatorial problems having the
properties of matroids.
Prim’s algorithm
Start with tree T1 consisting of one (any) vertex and “grow” tree one vertex at a time to
produce MST through a series of expanding subtrees T1, T2, …, Tn
On each iteration, construct Ti+1 from Ti by adding vertex not in Ti that is closest to those
already in Ti (this is a “greedy” step!)
Stop when all vertices are included
Prim's Algorithm constructs a minimal spanning tree (MST) in a connected graph or component.

Minimal Spanning Tree
A minimal spanning tree of a weighted graph is a spanning tree that has minimal of sum of edge
weights.
Prim's Algorithm solves the MST problem using the greedy technique: it builds the spanning
tree by repeatedly adding the minimal-weight edge to a vertex not already in the tree.
Algorithm Prim(G)
    VT ← { v0 }                 // start the tree from an arbitrary vertex
    ET ← ∅                      // empty set of tree edges
    for i ← 1 to |V| − 1 do
        find the minimum-weight edge e* = (v*, u*) such that v* is in VT and u* is in V − VT
        VT ← VT ∪ { u* }
        ET ← ET ∪ { e* }
    return ET
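
A runnable Python sketch of the same procedure, keeping the fringe edges in a
min-heap (the adjacency-list format and the example graph are ours):

import heapq

def prim_mst(graph, start):
    # graph: {vertex: [(weight, neighbor), ...]}
    in_tree = {start}
    mst_edges = []
    heap = [(w, start, u) for (w, u) in graph[start]]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(graph):
        w, v, u = heapq.heappop(heap)     # cheapest edge leaving the tree
        if u in in_tree:
            continue                      # both endpoints already in the tree
        in_tree.add(u)
        mst_edges.append((v, u, w))
        for (w2, x) in graph[u]:
            if x not in in_tree:
                heapq.heappush(heap, (w2, u, x))
    return mst_edges

g = {'a': [(3, 'b'), (1, 'c')], 'b': [(3, 'a'), (2, 'c')], 'c': [(1, 'a'), (2, 'b')]}
print(prim_mst(g, 'a'))   # [('a', 'c', 1), ('c', 'b', 2)]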

Kruskal's Algorithm
 Start with T = EMPTY SET
 Keep track of connected components of graph with edges T
 Initially components are single nodes
 At each stage, add the cheapest edge that connects two nodes not already connected
The algorithm begins by sorting the graph’s edges in nondecreasing order of their
weights. Then, starting with the empty subgraph, it scans this sorted list, adding the next edge on
the list to the current subgraph if such an inclusion does not create a cycle and simply skipping
the edge otherwise.
Algorithm Kruskal(G)
//Kruskal’s algorithm for constructing a minimum spanning tree
//Input: A weighted connected graph G = (V, E)
//Output: ET, the set of edges composing a minimum spanning tree of G
    sort E in nondecreasing order of the edge weights w(e_i1) ≤ . . . ≤ w(e_i|E|)
    ET ← ∅ ; ecounter ← 0      //initialize the set of tree edges and its size
    k ← 0                      //initialize the number of processed edges
    while ecounter < |V| − 1 do
        k ← k + 1
        if ET ∪ { e_ik } is acyclic
            ET ← ET ∪ { e_ik } ; ecounter ← ecounter + 1
    return ET
Applying Prim’s and Kruskal’s algorithms to the same small graph by hand may create the
impression that the latter is simpler than the former.
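
A runnable sketch in Python; the "is acyclic" test is implemented with a union-find
structure, which is the usual choice (the edge-list representation is ours):

def kruskal_mst(n, edges):
    # n: number of vertices (0..n-1); edges: [(weight, u, v), ...]
    parent = list(range(n))

    def find(x):                          # union-find root with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):         # scan edges in nondecreasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                      # adding (u, v) creates no cycle
            parent[ru] = rv
            mst.append((u, v, w))
            if len(mst) == n - 1:
                break
    return mst

print(kruskal_mst(3, [(3, 0, 1), (1, 0, 2), (2, 1, 2)]))   # [(0, 2, 1), (1, 2, 2)]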

Dijkstra's Algorithm
One of the main reasons for the popularity of Dijkstra's Algorithm is that it is one of the most
important and useful algorithms available for generating (exact) optimal solutions to a large class
of shortest path problems. The point being that this class of problems is extremely important
theoretically, practically, as well as educationally.
Indeed, it is safe to say that the shortest path problem is one of the most important generic
problems in such fields as OR/MS, CS and artificial intelligence (AI). One of the reasons for this
is that essentially any combinatorial optimization problem can be formulated as a shortest path
problem. Thus, this class of problems is extremely large and includes numerous practical
problems that have nothing to do with actual ("genuine") shortest path problems.
New classes of genuine shortest path problem are becoming very important these days in
connection with practical applications of Geographic Information Systems (GIS) such as on line
computing of driving directions. It is not surprising therefore that, for example, Microsoft has a
research project on algorithms for shortest path problems.
Consider the best-known algorithm for the single-source shortest-paths problem, called
Dijkstra’s algorithm. This algorithm is applicable to undirected and directed graphs with
nonnegative weights only. Since in most applications this condition is satisfied, the limitation has
not impaired the popularity of Dijkstra’s algorithm. Dijkstra’s algorithm finds the shortest paths
to a graph’s vertices in order of their distance from a given source. First, it finds the shortest path
from the source to a vertex nearest to it, then to a second nearest, and so on.
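
A standard min-heap sketch of Dijkstra's algorithm (the adjacency-list format and
the sample graph are ours):

import heapq

def dijkstra(graph, source):
    # graph: {vertex: [(weight, neighbor), ...]} with nonnegative weights
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, v = heapq.heappop(heap)        # closest not-yet-finalized vertex
        if d > dist[v]:
            continue                      # stale entry; a shorter path was found
        for w, u in graph[v]:
            if d + w < dist[u]:           # relax edge (v, u)
                dist[u] = d + w
                heapq.heappush(heap, (dist[u], u))
    return dist

g = {'a': [(3, 'b'), (7, 'c')], 'b': [(2, 'c')], 'c': []}
print(dijkstra(g, 'a'))   # {'a': 0, 'b': 3, 'c': 5}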

Huffman Trees

Huffman coding is a lossless data compression algorithm. The idea is to assign variable-
length codes to input characters, where the lengths of the assigned codes are based on the
frequencies of the corresponding characters. The most frequent character gets the smallest
code and the least frequent character gets the largest code.
The variable-length codes assigned to input characters are prefix codes, meaning the codes
(bit sequences) are assigned in such a way that the code assigned to one character is not a
prefix of the code assigned to any other character. This is how Huffman coding makes sure
that there is no ambiguity when decoding the generated bit stream.
Let us understand prefix codes with a counterexample. Let there be four characters a, b, c
and d, and let their corresponding variable-length codes be 00, 01, 0 and 1. This coding
leads to ambiguity because the code assigned to c is a prefix of the codes assigned to a and
b. If the compressed bit stream is 0001, the decompressed output may be “cccd” or “ccb” or
“acd” or “ab”.

Huffman’s algorithm
Step 1 Initialize n one-node trees and label them with the symbols of the alphabet given. Record
the frequency of each symbol in its tree’s root to indicate the tree’s weight. (More generally, the
weight of a tree will be equal to the sum of the frequencies in the tree’s leaves.)

Step 2 Repeat the following operation until a single tree is obtained. Find two trees with the
smallest weight (ties can be broken arbitrarily, but see Problem 2 in this section’s exercises).
Make them the left and right subtree of a new tree and record the sum of their weights in the root
of the new tree as its weight.
A tree constructed by the above algorithm is called a Huffman tree. It defines—in the manner
described above—a Huffman code.

EXAMPLE Consider the five-symbol alphabet {A, B, C, D, _} with the following occurrence
frequencies in a text made up of these symbols:

symbol      A      B      C      D      _
frequency   0.35   0.1    0.2    0.2    0.15
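
A compact Python sketch of Steps 1-2 on this alphabet (representing each tree as a
symbol-to-code map is our choice; the exact codes may differ with tie-breaking, only
the code lengths are forced):

import heapq
from itertools import count

def huffman_codes(freq):
    # freq: {symbol: frequency}; returns {symbol: bit string}
    tiebreak = count()                    # avoids comparing dicts on equal weights
    heap = [(f, next(tiebreak), {s: ''}) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)     # two trees of smallest weight
        f2, _, right = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in left.items()}    # left subtree gets 0
        merged.update({s: '1' + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

codes = huffman_codes({'A': 0.35, 'B': 0.1, 'C': 0.2, 'D': 0.2, '_': 0.15})
print(codes)   # e.g. A and C and D get 2-bit codes, B and _ get 3-bit codes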

UNIT IV ITERATIVE IMPROVEMENT 9
The Simplex Method – The Maximum-Flow Problem – Maximum Matching in Bipartite Graphs –
The Stable marriage Problem.

INTRODUCTION:
Algorithm design technique for solving optimization problems
 Start with a feasible solution
 Repeat the following step until no improvement can be found:
 change the current feasible solution to a feasible solution with a better value of the
objective function
 Return the last feasible solution as optimal
 Note: Typically, a change in a current solution is “small” (local search)
 Major difficulty: Local optimum vs. global optimum
Important Examples
 Simplex method
 Ford-Fulkerson algorithm for maximum flow problem
 Maximum matching of graph vertices
 Gale-Shapley algorithm for the stable marriage problem

Linear Programming
 Linear programming (LP) problem is to optimize a linear function of several variables
subject to linear constraints:
maximize (or minimize) c1 x1 + ... + cn xn
subject to ai1 x1 + ... + ain xn ≤ (or ≥ or =) bi , i = 1, ..., m
x1 ≥ 0, ... , xn ≥ 0
The function z = c1 x1 + ...+ cn xn is called the objective function;
constraints x1 ≥ 0, ... , xn ≥ 0 are called non-negativity constraints

Example

maximize 3x + 5y
subject to x+ y ≤4
x + 3y ≤ 6
x ≥ 0, y ≥ 0

[Figure: the feasible region of this problem in the (x, y)-plane, bounded by the
lines x + y = 4 and x + 3y = 6 and the coordinate axes, with extreme points
( 0, 0 ), ( 4, 0 ), ( 3, 1 ) and ( 0, 2 ).]
Feasible region is the set of points defined by the constraints
Geometric solution
Extreme Point Theorem Any LP problem with a nonempty bounded feasible region has an
optimal solution; moreover, an optimal solution can always be found at an extreme point of the
problem's feasible region.

maximize 3x + 5y
subject to x+ y ≤4
x + 3y ≤ 6
x ≥ 0, y ≥ 0

[Figure: level lines of the objective function, 3x + 5y = 10, 14 and 20, drawn over
the feasible region; the maximum 3x + 5y = 14 is attained at the extreme point ( 3, 1 ).]
Possible outcomes in solving an LP problem
 has a finite optimal solution, which may not be unique
 unbounded: the objective function of maximization (minimization) LP problem is
unbounded from above (below) on its feasible region
 infeasible: there are no points satisfying all the constraints, i.e. the constraints are
contradictory
The Simplex Method
 Simplex method is the classic method for solving LP problems, one of the most
important algorithms ever invented
 Invented by George Dantzig in 1947 (Stanford University)
 Based on the iterative improvement idea:
Generates a sequence of adjacent points of the problem’s feasible region with improving values
of the objective function until no further improvement is possible
Outline of the Simplex Method
Step 0 [Initialization] Present a given LP problem in standard form and set up initial tableau.

Step 1 [Optimality test] If all entries in the objective row are nonnegative — stop: the
tableau represents an optimal solution.
Step 2 [Find entering variable] Select (the most) negative entry in the objective row.
Mark its column to indicate the entering variable and the pivot column.
Step 3 [Find departing variable]
• For each positive entry in the pivot column, calculate the θ-ratio by dividing that row's
entry in the rightmost column by its entry in the pivot column.
(If there are no positive entries in the
pivot column — stop: the problem is unbounded.)
• Find the row with the smallest θ-ratio, mark this row to indicate the departing variable
and the pivot row.
Step 4 [Form the next tableau]
• Divide all the entries in the pivot row by its entry in the pivot column.
• Subtract from each of the other rows, including the objective row, the new pivot row
multiplied by the entry in the pivot column of the row in question.
• Replace the label of the pivot row by the variable's name of the pivot column and go
back to Step 1.
Example of Simplex Method
maximize
z = 3x + 5y + 0u + 0v
subject to
x+ y+ u =4
x + 3y + v =6
x≥0, y≥0, u≥0, v≥0

Initial tableau (basic feasible solution (0, 0, 4, 6), z = 0):

        x     y     u     v
  u     1     1     1     0   |  4
  v     1     3     0     1   |  6
       -3    -5     0     0   |  0

After the first pivot (y enters, v departs; basic feasible solution (0, 2, 2, 0), z = 10):

        x     y     u     v
  u    2/3    0     1   -1/3  |  2
  y    1/3    1     0    1/3  |  2
      -4/3    0     0    5/3  | 10

After the second pivot (x enters, u departs; basic feasible solution (3, 1, 0, 0), z = 14):

        x     y     u     v
  x     1     0    3/2  -1/2  |  3
  y     0     1   -1/2   1/2  |  1
        0     0     2     1   | 14

Standard form of LP problem


• must be a maximization problem
• all constraints (except the non-negativity constraints) must be in the form of linear
equations
• all the variables must be required to be nonnegative
Thus, the general linear programming problem in standard form with m constraints and n
unknowns (n ≥ m) is
maximize c1 x1 + ... + cn xn
subject to ai1 x1 + ... + ain xn = bi , i = 1, ..., m, x1 ≥ 0, ... , xn ≥ 0
Every LP problem can be represented in such form
Example
maximize 3x + 5y                     maximize 3x + 5y + 0u + 0v
subject to x + y ≤ 4                 subject to x + y + u     = 4
           x + 3y ≤ 6                           x + 3y     + v = 6
           x ≥ 0, y ≥ 0                         x ≥ 0, y ≥ 0, u ≥ 0, v ≥ 0
Variables u and v, transforming inequality constraints into equality constraints, are called
slack variables.

Basic feasible solutions
A basic solution to a system of m linear equations in n unknowns (n ≥ m) is obtained by setting n
– m variables to 0 and solving the resulting system to get the values of the other m variables.
The variables set to 0 are called nonbasic;
the variables obtained by solving the system are called basic.
A basic solution is called feasible if all its (basic) variables are nonnegative.
Example x+ y+u =4
x + 3y + v = 6
(0, 0, 4, 6) is basic feasible solution
(x, y are nonbasic; u, v are basic)
Simplex Table
maximize
z = 3x + 5y + 0u + 0v
subject to
x+ y+ u =4
x + 3y + v =6
x≥0, y≥0, u≥0, v≥0
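
As a cross-check of the tableau computation shown above, the same LP can be solved
with SciPy's linprog, assuming SciPy is available; linprog minimizes, so the
objective is negated:

from scipy.optimize import linprog

# maximize 3x + 5y  is the same as  minimize -3x - 5y
res = linprog(c=[-3, -5],
              A_ub=[[1, 1], [1, 3]],    # x + y <= 4,  x + 3y <= 6
              b_ub=[4, 6],
              bounds=[(0, None), (0, None)])
print(res.x, -res.fun)   # [3. 1.] 14.0, matching the final tableau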

The Maximum-Flow Problem
Problem of maximizing the flow of a material through a transportation network (e.g., pipeline
system, communications or transportation networks)
Formally represented by a connected weighted digraph with n vertices numbered from 1 to n
with the following properties:
• contains exactly one vertex with no entering edges called the source (numbered 1)
• contains exactly one vertex with no leaving edges, called the sink (numbered n)
• has positive integer weight uij on each directed edge (i, j), called the edge
capacity, indicating the upper bound on the amount of the material that can be
sent from i to j through this edge
The total amount of the material entering an intermediate vertex must be equal to
the total amount of the material leaving the vertex. This condition is called the
flow-conservation requirement.
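
The introduction listed the Ford-Fulkerson method for this problem; below is a
minimal augmenting-path sketch (the BFS shortest-path variant, with an illustrative
network of our own making):

from collections import deque

def max_flow(capacity, s, t):
    # capacity: dict-of-dicts, capacity[u][v] = remaining capacity of edge (u, v)
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:          # BFS for an augmenting path
            u = queue.popleft()
            for v, cap in capacity[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow                           # no augmenting path: flow is maximal
        bottleneck, v = float('inf'), t           # find the path's smallest capacity
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, capacity[u][v])
            v = u
        v = t
        while parent[v] is not None:              # augment along the path
            u = parent[v]
            capacity[u][v] -= bottleneck          # use forward capacity
            capacity[v].setdefault(u, 0)
            capacity[v][u] += bottleneck          # add residual (backward) capacity
            v = u
        flow += bottleneck

cap = {1: {2: 3, 3: 2}, 2: {3: 1, 4: 2}, 3: {4: 3}, 4: {}}
print(max_flow(cap, 1, 4))   # 5 (source 1, sink 4)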

Maximum Matching in Bipartite Graphs

A matching in a bipartite graph is a set of edges chosen in such a way
that no two edges share an endpoint. A maximum matching is a matching of maximum
size (maximum number of edges). A matching is maximal if no edge can be added to it
without its ceasing to be a matching; every maximum matching is maximal, but not
conversely. There can be more than one maximum matching for a given bipartite graph.

Our goal is to find a maximum matching in a graph. Note that a maximal matching can
be found very easily — just keep adding edges to the matching until no more can be
added.
We construct a directed graph G′(V′, E′), in which V′ contains all the
nodes of V along with a source node s and a sink node t. For every edge in E, we add a
directed edge in E′ from X to Y. Finally we add a directed edge from s to all nodes in X
and from all nodes of Y to t. Each edge is given unit capacity.
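
Because every capacity is one, the max-flow computation reduces to a direct
augmenting-path sketch (Kuhn's algorithm); the adjacency format below is ours:

def max_bipartite_matching(adj, n_left, n_right):
    # adj[u] lists the right-vertices adjacent to left-vertex u
    match_right = [-1] * n_right          # match_right[v] = left vertex matched to v

    def try_augment(u, visited):
        for v in adj[u]:
            if v not in visited:
                visited.add(v)
                # v is free, or its current partner can be re-matched elsewhere
                if match_right[v] == -1 or try_augment(match_right[v], visited):
                    match_right[v] = u
                    return True
        return False

    return sum(try_augment(u, set()) for u in range(n_left))

# left vertex 0 adjacent to right {0, 1}, left 1 to {0}, left 2 to {1}
print(max_bipartite_matching({0: [0, 1], 1: [0], 2: [1]}, 3, 2))   # 2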

The Stable marriage Problem
In mathematics, economics, and computer science, the stable marriage problem (SMP) is
the problem of finding a stable matching between two sets of elements given a set of
preferences for each element. A matching is a mapping from the elements of one set to the
elements of the other set. A matching is stable whenever it is not the case that both:
a. some given element A of the first matched set prefers some given element B of the second
matched set over the element to which A is already matched, and
b. B also prefers A over the element to which B is already matched

In other words, a matching is stable when there does not exist any alternative pairing (A, B) in
which both A and B are individually better off than they would be with the element to which
they are currently matched.
The stable marriage problem is commonly stated as:
Given n men and n women, where each person has ranked all members of the opposite
sex with a unique number between 1 and n in order of preference, marry the men and
women together such that there are no two people of opposite sex who would both rather
have each other than their current partners. If there are no such people, all the marriages
are "stable".
A marriage matching M is a set of n (m, w) pairs whose members are selected from disjoint n-
element sets Y and X in a one-one fashion, i.e., each man m from Y is paired with exactly one
woman w from X and vice versa.
This algorithm guarantees that:
Everyone gets married
Once a woman becomes engaged, she is always engaged to someone. So, at the end, there
cannot be a man and a woman both unengaged, as he must have proposed to her at some
point (since a man will eventually propose to everyone, if necessary) and, being
unengaged, she would have had to have said yes.
The marriages are stable
Let Alice be a woman and Bob be a man who are both engaged, but not to each other.
Upon completion of the algorithm, it is not possible for both Alice and Bob to prefer each
other over their current partners. If Bob prefers Alice to his current partner, he must have
proposed to Alice before he proposed to his current partner. If Alice accepted his
proposal, yet is not married to him at the end, she must have dumped him for someone
she likes more, and therefore doesn't like Bob more than her current partner. If Alice
rejected his proposal, she was already with someone she liked more than Bob.

Stable marriage algorithm
Input: A set of n men and a set of n women along with rankings of the women
by each man and rankings of the men by each woman with no ties
allowed in the rankings
Output: A stable marriage matching
Step 0 Start with all the men and women being free.
Step 1 While there are free men, arbitrarily select one of them and do the following:
    Proposal: The selected free man m proposes to w, the next woman on his preference
    list (the highest-ranked woman who has not rejected him before).
    Response: If w is free, she accepts the proposal and is matched with m. If she is
    not free, she compares m with her current mate. If she prefers m to him, she
    accepts m's proposal, making her former mate free; otherwise, she simply rejects
    m's proposal, leaving m free.
Step 2 Return the set of n matched pairs.
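
A runnable Python sketch of the algorithm (the dictionary format and the two-couple
example are ours):

def gale_shapley(men_prefs, women_prefs):
    # men_prefs[m] / women_prefs[w]: preference lists, most preferred first
    free_men = list(men_prefs)
    next_choice = {m: 0 for m in men_prefs}          # next woman to propose to
    rank = {w: {m: i for i, m in enumerate(prefs)}   # lower rank = more preferred
            for w, prefs in women_prefs.items()}
    fiance = {}                                      # fiance[w] = current partner
    while free_men:
        m = free_men.pop()
        w = men_prefs[m][next_choice[m]]             # highest-ranked not yet tried
        next_choice[m] += 1
        if w not in fiance:
            fiance[w] = m                            # w is free: she accepts
        elif rank[w][m] < rank[w][fiance[w]]:
            free_men.append(fiance[w])               # w trades up; old mate freed
            fiance[w] = m
        else:
            free_men.append(m)                       # w rejects m; he stays free
    return {m: w for w, m in fiance.items()}

men = {'bob': ['alice', 'carol'], 'dan': ['carol', 'alice']}
women = {'alice': ['bob', 'dan'], 'carol': ['dan', 'bob']}
print(gale_shapley(men, women))   # {'dan': 'carol', 'bob': 'alice'}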

UNIT V COPING WITH THE LIMITATIONS OF ALGORITHM POWER 9

Limitations of Algorithm Power-Lower-Bound Arguments-Decision Trees-P, NP and NP-


Complete Problems–Coping with the Limitations – Backtracking – n-Queens problem –
Hamiltonian Circuit Problem – Subset Sum Problem-Branch and Bound – Assignment problem –
Knapsack Problem – Traveling Salesman Problem- Approximation Algorithms for NP – Hard
Problems – Traveling Salesman problem – Knapsack problem.

Limitations of Algorithm Power


Lower Bound Arguments
Lower bound: an estimate on a minimum amount of work needed to solve a given problem
Examples:
 Number of comparisons needed to find the largest element in a set of n numbers
 Number of comparisons needed to sort an array of size n
 Number of comparisons necessary for searching in a sorted array
 Number of multiplications needed to multiply two n-by-n matrices
 Lower bound can be
o an exact count
o an efficiency class ( Ω )
 Tight lower bound: there exists an algorithm with the same efficiency as the lower
bound

Problem                               Lower bound     Tightness

sorting (comparison-based)            Ω(n log n)      yes
searching in a sorted array           Ω(log n)        yes
element uniqueness                    Ω(n log n)      yes
n-digit integer multiplication        Ω(n)            unknown
multiplication of n-by-n matrices     Ω(n²)           unknown

Methods for Establishing Lower Bounds
 Trivial lower bounds
 Information-theoretic arguments (decision trees)
 Adversary arguments
 Problem reduction

Trivial Lower Bounds


Trivial lower bounds: based on counting the number of items that must be processed in input and
generated as output
Examples
 Finding max element – all n elements must be processed; n − 1 comparisons are necessary and sufficient
 Polynomial evaluation
 Sorting
 Element uniqueness
 Hamiltonian circuit existence
Conclusions
 May and may not be useful
 Be careful in deciding how many elements must be processed

Decision Trees
Decision tree — a convenient model of algorithms involving
comparisons, in which:
 Internal nodes represent comparisons
 Leaves represent outcomes (or input cases)
[Figure: decision tree for 3-element insertion sort]

Decision Trees and Sorting Algorithms
 Any comparison-based sorting algorithm can be represented by a decision tree (for each fixed
n)
 Number of leaves (outcomes) ≥ n!
 Height of a binary tree with n! leaves ≥ ⌈log2 n!⌉
 Minimum number of comparisons in the worst case ≥ ⌈log2 n!⌉ for any comparison-based
sorting algorithm, since the longest path represents the worst case and its length is the height
 ⌈log2 n!⌉ ≈ n log2 n (by Stirling's approximation)
 This lower bound is tight (mergesort or heapsort)

Ex. Prove that 5 (or 7) comparisons are necessary and sufficient for sorting 4 keys (or 5 keys,
respectively).

Adversary Arguments
Adversary argument: It’s a game between the adversary and the (unknown) algorithm. The
adversary has the input and the algorithm asks questions to the adversary about the input. The
adversary tries to make the algorithm work the hardest by adjusting the input (consistently). It
wins the “game” after the lower bound time (lower bound proven) if it is able to come up with
two different inputs.

Example 1: “Guessing” a number between 1 and n using yes/no questions (Is it larger than x?)

Adversary: Puts the number in the larger of the two subsets generated by the last question

Example 2: Merging two sorted lists of size n


a1 < a2 < … < an and b1 < b2 < … < bn

Adversary: Keep the ordering b1 < a1 < b2 < a2 < … < bn < an in mind and answer comparisons
consistently

Claim: Any algorithm requires at least 2n-1 comparisons to output the above ordering (because it
has to compare each pair of adjacent elements in the ordering)
Ex: Design an adversary to prove that finding the smallest element in a set of n elements
requires at least n-1 comparisons.

Lower Bounds by Problem Reduction


Fact: If problem Q can be “reduced” to problem P, then Q is at least as easy as P or,
equivalently, P is at least as hard as Q.

Reduction from Q to P: Design an algorithm for Q using an algorithm for P as a subroutine.


Idea: If problem P is at least as hard as problem Q, then a lower bound for Q is also a lower
bound for P.
Hence, find problem Q with a known lower bound that can be reduced to problem P in question.
Example: P is finding the MST for n points in the Cartesian plane, and Q is the element
uniqueness problem (known to be in Ω(n log n))

Reduction from Q to P: Given a set X = {x1, …, xn} of numbers (i.e. an instance of the
uniqueness problem), we form an instance of MST in the Cartesian plane: Y = {(0,x1), …,
(0,xn)}. Then, from an MST for Y we can easily (i.e. in linear time) determine if the elements in
X are unique.

Classifying Problem Complexity
Is the problem tractable, i.e., is there a polynomial-time (O(p(n)) algorithm that solves it?
Possible answers:
 yes (give example polynomial time algorithms)
 no
o because it’s been proved that no algorithm exists at all (e.g., Turing’s halting
problem)
o because it’s been be proved that any algorithm for it would require exponential
time
 unknown. How to classify their (relative) complexity using reduction?

Problem Types: Optimization and Decision


 Optimization problem: find a solution that maximizes or minimizes some objective
function
 Decision problem: answer yes/no to a question
Many problems have decision and optimization versions.
E.g.: traveling salesman problem
 optimization: find Hamiltonian cycle of minimum length
 decision: find a Hamiltonian cycle of length ≤ L
Decision problems are more convenient for formal investigation of their complexity.
Class P
P: the class of decision problems that are solvable in O(p(n)) time, where p(n) is a polynomial of
problem’s input size n
Examples:
 Searching
 Element uniqueness
 Graph connectivity
 Graph acyclicity
 Primality testing (finally proved in 2002)

Class NP
NP (nondeterministic polynomial): class of decision problems whose proposed solutions can be
verified in polynomial time = solvable by a nondeterministic polynomial algorithm
A nondeterministic polynomial algorithm is an abstract two-stage procedure that:
 Generates a solution of the problem (on some input) by guessing
 Checks whether this solution is correct in polynomial time
By definition, it solves the problem if it’s capable of generating and verifying a solution on one
of its tries
Why this definition?
 led to development of the rich theory called “computational complexity”

Example: CNF satisfiability


Problem: Is a boolean expression in its conjunctive normal form (CNF) satisfiable, i.e., are there
values of its variables that make it true?
This problem is in NP. Nondeterministic algorithm:
 Guess a truth assignment
 Substitute the values into the CNF formula to see if it evaluates to true
Example: (A | ¬B | ¬C) & (A | B) & (¬B | ¬D | E) & (¬D | ¬E)
Truth assignments:
ABCDE
0 0 0 0 0
. . .
1 1 1 1 1
Checking phase: O(n)
What other problems are in NP?
 Hamiltonian circuit existence
 Partition problem: Is it possible to partition a set of n integers into two disjoint subsets
with the same sum?
 Decision versions of TSP, knapsack problem, graph coloring, and many other
combinatorial optimization problems.
 All the problems in P can also be solved in this manner (but no guessing is necessary), so
we have:
 P ⊆ NP
 Big (million dollar) question: P = NP ?
NP-Complete Problems
A decision problem D is NP-complete if it is as hard as any problem in NP, i.e.,
 D is in NP
 every problem in NP is polynomial-time reducible to D

Cook’s theorem (1971): CNF-sat is NP-complete

Other NP-complete problems obtained through polynomial- time reductions from a known NP-
complete problem

Examples: TSP, knapsack, partition, graph-coloring and hundreds of other problems of
combinatorial nature

P = NP ? Dilemma Revisited
 P = NP would imply that every problem in NP, including all NP-complete problems,
could be solved in polynomial time
 If a polynomial-time algorithm for just one NP-complete problem is discovered, then
every problem in NP can be solved in polynomial time, i.e. P = NP

 Most but not all researchers believe that P ≠ NP, i.e. P is a proper subset of NP.
If P ≠ NP, then the NP-complete problems are not in P, although many of them are very
useful in practice.

Coping with the Limitations

Introduction:

Backtracking & Branch-and-Bound are the two algorithm design techniques for solving
problems in which the number of choices grows at least exponentially with their instance size.
Both techniques construct a solution one component at a time trying to terminate the process as
soon as one can ascertain that no solution can be obtained as a result of the choices already
made. This approach makes it possible to solve many large instances of NP-hard problems in an
acceptable amount of time.

Both Backtracking and branch-and-bound uses a state-space tree-a rooted tree whose
nodes represent partially constructed solutions to the problem. Both techniques terminate a node
as soon as it can be guaranteed that no solution to the problem can be obtained by considering
choices that correspond to the node’s descendants.
Backtracking

Backtracking constructs its state-space tree in the depth-first search fashion in the
majority of its applications. If the sequence of choices represented by a current node of the state-
space tree can be developed further without violating the problem’s constraints, it is done by
considering the first remaining legitimate option for the next component. Otherwise, the method
backtracks by undoing the last component of the partially built solution and replaces it by the
next alternative.

A node in a state-space tree is said to be promising if it corresponds to a partially


constructed solution that may still lead to a complete solution; otherwise, it is called
nonpromising. Leaves represent either nonpromising dead ends or complete solutions found by the
algorithms.
N – Queens Problem
 N – Queens problem is solved by using backtracking.
 The problem is to place n queens on n * n chess board.
 The following constraints are used to place queens on the chess board.
o No two queens should be placed on the same diagonal
o No two queens should be placed on the same column
o No two queens should be placed on the same row

2-Queens Problem

 The 2-Queens problem is not solvable: on a 2 * 2 chessboard, any two cells share a
row, a column, or a diagonal, so every placement of two queens is illegal.

There is no solution for n = 2 and n = 3.

4-Queens Problem

 The solution for the 4 – Queens problem is easily obtained using the backtracking
algorithm.
 The aim of the problem is to place 4 queens on the chessboard in such a way that none
of the queens attack each other.
Solving Procedure

Step 1:

 Let us assume the queens are placed row-wise. The 1st queen is placed in the 1st row at (1,1).

Q1

Step 2:

 The second queen has to be placed in the 2nd row. It is not possible to place Q2 at the
following places:
(2,1) – placing the queens in the same column
(2,2) – placing the queens in the same diagonal
So the queen can be placed at (2,3).

Q1
Q2

Step 3:

 The third Queen has to be placed in 3rd row. It is not possible to place the Q3 at the
following places.
(3,1) – placing the queens in the same column
(3,2) – placing the queens in the same diagonal.
(3,3) – placing the queens in the same column
(3,4) – placing the queens in the same diagonal.

Q1
Q2
X X X X

Backtracking to the previous step

 This tells us that Q3 cannot be placed in the 3rd row, so this branch will not give a
solution. We backtrack to the previous step, i.e. step 2: instead of placing Q2 at (2,3),
Q2 is placed at (2,4).

Q1
Q2

Step 4:

 The third Queen has to be placed in 3rd row. It is not possible to place the Q3 at the
following places.
(3,1) – placing the queens in the same column
(3,3) – placing the queens in the same diagonal.
(3,4) – placing the queens in the same column
So the queen has to be placed in (3,2)

Q1
Q2
Q3

Step 5:

 The fourth Queen has to be placed in 4th row. It is not possible to place the Q4 at the
following places.
(4,1) – placing the queens in the same column
(4,2) – placing the queens in the same column
(4,3) – placing the queens in the same diagonal.
(4,4) – placing the queens in the same column

Q1
Q2
Q3
X X X X
Backtracking to the previous step

 This tells us that Q4 cannot be placed in the 4th row, so this branch will not give a
solution. We backtrack to the previous step, i.e. step 4: instead of placing Q3 at (3,2),
Q3 could be placed at (3,3) or (3,4), but that is also not possible because those cells
fall on the same diagonal or the same column. So we backtrack again, to step 3, and repeat
the same process. Finally Q3 is placed at (3,1).

Q1
Q2
Q3

Step 6:

Q1
Q2
Q3
Q4

Basic Terminologies
Solution Space
 Tuples that satisfy the constraints.
 The solution space can be organized into a tree.
State Space
 State space is the set of all paths from the root node to other nodes.
State Space Tree
 The state space tree is the tree organization of the solution space.
 In the backtracking technique, while solving a given problem, a tree is constructed based
on the choices made.
 Such a tree with all possible solutions is called a state space tree.
Promising and Non-Promising Nodes
 A node in a state space tree is said to be promising if it corresponds to a partially
constructed solution that may still lead to a complete solution.
 The nodes which are not promising for a solution in a state space tree are called non-
promising nodes.

State-space tree of the 4-queens search (node numbers in parentheses):

(1) root
    Q1 = 1 → (2)
        Q2 = 3 → (3): not a solution
        Q2 = 4 → (4)
            Q3 = 2 → (5): not a solution
    Q1 = 2 → (6)
        Q2 = 4 → (7)
            Q3 = 1 → (8)
                Q4 = 3 → (9): solution (2, 4, 1, 3)
    Q1 = 3 → (10)
        Q2 = 1 → (11)
            Q3 = 4 → (12)
                Q4 = 2 → (13): solution (3, 1, 4, 2)

8 – Queens Problem
 The solution to the 8-Queens problem is easily obtained using the backtracking algorithm.
 The aim of the problem is to place 8 queens on an 8 * 8 chessboard in such a way that none
of the queens attack each other.
 The following constraints are used to place queens on the chess board.
 No two queens should be placed on the same diagonal
 No two queens should be placed on the same column
 No two queens should be placed on the same row

 Let us assume two queens are placed at positions ( i, j ) and ( k, l ). Then the two
queens are on the same diagonal if either of the following conditions is satisfied:
i + j = k + l
i – j = k – l
 The above two equations can be rearranged as follows:
j – l = k – i
j – l = i – k
 Combining these two equations:
Abs( i – k ) = Abs( j – l )
 If this condition is true, then the queens are on the same diagonal.
 In any stage, we are unable to place a queen then we have to backtrack and change the
position of the previous queen.
 This is repeated till we place all the 8 queens on the board.
Algorithm
Algorithm NQueens( k, n )
Begin
    for i := 1 to n do
        if Place( k, i ) then
            x[k] := i
            if k = n then
                write ( x[1..n] )        // a complete placement has been found
            else
                NQueens( k+1, n )        // try to place the next queen
        end if
    end for
End

Algorithm Place( k, i )
// returns true iff a queen can be placed in row k, column i
Begin
    for j := 1 to k-1 do
        if ( x[j] = i ) or ( Abs( x[j] - i ) = Abs( j - k ) ) then
            return false                 // same column or same diagonal as queen j
    end for
    return true
End

Place runs in O( k-1 ) time per call; the overall backtracking search is exponential
in the worst case.
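
The same backtracking in runnable Python (a sketch; 1-based indexing mirrors the
pseudocode above):

def solve_n_queens(n):
    solutions, x = [], [0] * (n + 1)     # x[k] = column of the queen in row k

    def place_ok(k, i):
        # column clash: x[j] == i; diagonal clash: |x[j] - i| == |j - k|
        return all(x[j] != i and abs(x[j] - i) != abs(j - k)
                   for j in range(1, k))

    def nqueens(k):
        for i in range(1, n + 1):
            if place_ok(k, i):
                x[k] = i
                if k == n:
                    solutions.append(x[1:])
                else:
                    nqueens(k + 1)       # backtracks automatically on return

    nqueens(1)
    return solutions

print(solve_n_queens(4))   # [[2, 4, 1, 3], [3, 1, 4, 2]]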

Q1

Q1
Q2

Q1
Q2
Q3

Q1
Q2
Q3
Q4

Q1
Q2
Q3
Q4
Q5

Q1
Q2
Q3
Q4

Q5
Q6

Q1
Q2
Q3
Q4
Q5
Q6
Q7

Q1
Q2
Q3
Q4
Q5
Q6
Q7
Q8

[Figure: state-space tree of the 8-queens backtracking search. Several branches
(for example, those reaching nodes 8 and 18) end as "not a solution" and are
backtracked; the successful branch is developed through to node 21, where Q8 = 5
completes a full placement and the search reports a solution.]
Analysis
Checking whether a queen can be placed in row k takes O( k – 1 ) comparisons; the
backtracking search itself is exponential in the worst case.

Sum of Subsets
 The sum of subsets problem is solved using the backtracking method.
Problem Definition
 Given n distinct positive numbers (weights) Wi, 1 ≤ i ≤ n, and m, find all subsets of the
Wi whose sum is m.
 The element Xi of the solution vector is either one (or) zero depending on whether the
weight Wi is included or not.
Xi = 1 , Wi is included
Xi = 0 , Wi is not included
Solution to Sum of Subsets Problem
 Sort the weights in ascending order.
 The root of the space tree represents the starting point, with no decision about the given
elements.
 The left and right children represent the inclusion and exclusion, respectively, of the next element.
 When a node is to be expanded, check it with the following condition:

      Σ (i=1 to k) Wi Xi + W(k+1) ≤ m

 Bounded (nonpromising) nodes can be identified with the following condition.
 The choice for the bounding function is Bk( X1, ……, Xk ) = true iff

      Σ (i=1 to k) Wi Xi + Σ (i=k+1 to n) Wi ≥ m

 Backtrack from a bounded node and find an alternative solution.
 Thus a path from the root to a node on the ith level of the tree indicates which of the
first i numbers have been included in the subsets represented by that node.
 We can terminate a node as nonpromising if either of the following two inequalities holds:

      S’ + W(i+1) > m                       ( S’ is already too large )
      S’ + Σ (j=i+1 to n) Wj < m            ( S’ is too small: even all remaining weights cannot reach m )

      where S’ = Σ (i=1 to k) Wi Xi
Constraints
Implicit Constraints
 No two elements in the subset can be same.
Explicit Constraints
 The tuple values can be any value between ‘1’ and ‘n’ and need to be in ascending
order.
Procedure
 Let ‘S’ be a set of elements and ‘m’ be the expected sum of subsets.
Step 1
Start with an empty set.
Step 2
Add to the subset, the next element from the list
Step 3
If the subset is having sum ‘m’ then stop with that subset as the solution.
Step 4
If the subset is not feasible (or) if we have reached the end of the set, then
backtrack through the subset until we find the most suitable value.
Step 5
If the subset is feasible, then repeat step 2.
Step 6
If we have visited, all the elements without finding a suitable subset and no
backtracking is possible then stop without solution.
Example
Consider the set S = {3, 5, 6, 7} and m = 15. Find the subsets whose sum is m.

Subset        Sum        Action
3             3 < 15     Add next element
3, 5          8 < 15     Add next element
3, 5, 6       14 < 15    Add next element
3, 5, 6, 7    21 > 15    Backtrack ( sum exceeds m )
3, 5, 7       15 = 15    Condition satisfied: subset sum = m; solution obtained

Solution set X = { 1, 1, 0, 1 }
Algorithm
Algorithm SumOfSubset( W, m, n )
// single-pass construction of the tuple X, as traced in the example above
Begin
    solution <- 0
    for i <- 1 to n do
        solution <- solution + W[i]
        if ( solution > m ) then
            solution <- solution - W[i]    // W[i] makes the sum exceed m: exclude it
            X[i] <- 0
        else
            X[i] <- 1                      // keep W[i] in the subset
        end if
    end for
    write “The tuple X[1……n]”
End
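
The routine above makes only a single pass. A full backtracking sketch that
explores both include/exclude branches using the promising tests described earlier
(all names are ours):

def subset_sum(weights, m):
    weights = sorted(weights)            # ascending order, as the method requires
    x = [0] * len(weights)               # x[i] = 1 iff weights[i] is included

    def backtrack(i, s, rest):           # s = current sum, rest = sum of weights[i:]
        if s == m:
            print([w for w, xi in zip(weights, x) if xi])
            return
        if i == len(weights):
            return
        if s + weights[i] <= m:          # promising: including weights[i] stays <= m
            x[i] = 1
            backtrack(i + 1, s + weights[i], rest - weights[i])
            x[i] = 0
        if s + rest - weights[i] >= m:   # promising: excluding it can still reach m
            backtrack(i + 1, s, rest - weights[i])

    backtrack(0, 0, sum(weights))

subset_sum([3, 5, 6, 7], 15)   # prints [3, 5, 7]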

Branch-and-Bound

It is an algorithm design technique that enhances the idea of generating a state-space tree
with the idea of estimating the best value obtainable from a current node of the decision tree: if
such an estimate is not superior to the best solution seen up to that point in the processing, the
node is eliminated from further consideration.

A feasible solution is a point in the problem’s search space that satisfies all the
problem’s constraints, while an optimal solution is a feasible solution with the best
value of the objective function. Compared to backtracking, branch-and-bound requires
two additional items:
1) A way to provide, for every node of a state-space tree a bound on the best value of the
objective function on any solution that can be obtained by adding further components to
the partial solution represented by the node.
2) The best value of the best solution seen so far.

If this information is available, we can compare a node’s bound with the value of the best
solution seen so far: if the bound value is not better than the best solution seen so far- i.e., not
smaller for a minimization problem and not larger for a maximization problem- the node is
nonpromising and can be terminated because no solution obtained from it can yield a better
solution than the one already available.

In general, we terminate a search path at the current node in a state-space tree of a branch
& bound algorithm for any one of the following three reasons:

1) The value of the node’s bound is not better than the value of the best solution seen so far.
2) The node represents no feasible solutions because the constraints of the problem are
already violated.

3) The subset of feasible solutions represented by the node consists of a single point.
Compare the value of the objective function for this feasible solution with that of the
best solution seen so far, and update the latter with the former if the new solution is better.

Assignment problem
It is a problem where each job is assigned to exactly one person: no two jobs can be
assigned to the same person, and no two persons can be assigned the same job.

Select one element in each row of the cost matrix C so that:


• no two selected elements are in the same column
• the sum is minimized

Example
Job 1 Job 2 Job 3 Job 4
Person a 9 2 7 8
Person b 6 4 3 7
Person c 5 8 1 8
Person d 7 6 9 4

Lower bound: Any solution to this problem will have total cost
at least: 2 + 3 + 1 + 4 (or 5 + 2 + 1 + 4)
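
For an instance this small, both the lower bound and the optimum can be checked by
brute force; a sketch (matrix rows/columns follow the table above):

from itertools import permutations

C = [[9, 2, 7, 8],     # person a
     [6, 4, 3, 7],     # person b
     [5, 8, 1, 8],     # person c
     [7, 6, 9, 4]]     # person d

# Lower bound: sum of the row minima = 2 + 3 + 1 + 4 = 10
print(sum(min(row) for row in C))

# Exhaustive check (fine for n = 4): p[i] = job assigned to person i
best = min(permutations(range(4)),
           key=lambda p: sum(C[i][p[i]] for i in range(4)))
print(best, sum(C[i][best[i]] for i in range(4)))   # (1, 0, 2, 3) with cost 13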

Knapsack problem
Given n items of known weights Wi and values Vi, i = 1, 2, ..., n, and a knapsack of
capacity W, find the most valuable subset of the items that fit in the knapsack.
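
A best-first branch-and-bound sketch; the bound used here is the fractional-
relaxation upper bound (this particular bound choice is our assumption, and items
must be given in nonincreasing value-to-weight order):

import heapq

def knapsack_bb(weights, values, W):
    n = len(weights)
    best = 0

    def bound(i, w, v):
        # upper bound: fill remaining capacity fractionally with items i..n-1
        ub, cap = v, W - w
        for j in range(i, n):
            take = min(cap, weights[j])
            ub += take * values[j] / weights[j]
            cap -= take
            if cap == 0:
                break
        return ub

    heap = [(-bound(0, 0, 0), 0, 0, 0)]            # (-bound, level, weight, value)
    while heap:
        key, i, w, v = heapq.heappop(heap)
        if -key <= best or i == n:                 # bound no better than best so far
            best = max(best, v)
            continue
        if w + weights[i] <= W:                    # branch 1: include item i
            nv = v + values[i]
            best = max(best, nv)
            heapq.heappush(heap, (-bound(i + 1, w + weights[i], nv),
                                  i + 1, w + weights[i], nv))
        heapq.heappush(heap, (-bound(i + 1, w, v), i + 1, w, v))   # branch 2: exclude
    return best

# items already sorted by value-to-weight ratio
print(knapsack_bb([2, 3, 4, 5], [3, 4, 5, 6], 5))   # 7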

Traveling salesman problem
The aim is to visit all the cities exactly once and return to the start; the tour
with the lowest total cost is the optimal solution.

In the state-space tree, the list of vertices in a node specifies the beginning part
of the Hamiltonian circuits represented by that node.

The lower bound is obtained as lb = ⌈ s/2 ⌉, where s is the sum, over all n cities,
of the distances from each city to its two nearest cities.

For the instance considered, lb = ⌈ [ (1+5) + (3+6) + (1+2) + (3+5) + (2+3) ] / 2 ⌉ = 16,
which is also the length of the optimal tour.
Approximation Algorithms for NP-hard problems

These are combinatorial optimization problems that fall under the NP-hard problems.
For the approximation algorithms that attack them, the accuracy ratio and the
performance ratio have to be calculated. For the traveling salesman problem the
nearest-neighbour and twice-around-the-tree algorithms are used; for the knapsack
problem, a greedy algorithm solves the continuous (fractional) version exactly.
Approximation algorithms are often used to find approximate solutions to difficult
problems of combinatorial optimization. The performance ratio is the principal metric
for measuring the accuracy of such approximation algorithms. The idea is to apply a
fast (i.e., a polynomial-time) approximation algorithm to get a solution that is not
necessarily optimal but hopefully close to it.

Accuracy measures:
The accuracy ratio of an approximate solution sa is
r(sa) = f(sa) / f(s*) for minimization problems
r(sa) = f(s*) / f(sa) for maximization problems
where f(sa) and f(s*) are the values of the objective function f for the approximate
solution sa and an actual optimal solution s*. The performance ratio RA of algorithm A
is the lowest upper bound of r(sa) over all instances.
The nearest-neighbor algorithm is a greedy method for approximating a solution to the
traveling salesman problem. Its performance ratio is unbounded above, even for the
important subset of Euclidean graphs.
Starting at some city, always go to the nearest unvisited city, and, after visiting all the
cities, return to the starting one
Note: Nearest-neighbor tour may depend on the starting city
Accuracy: RA = ∞ (unbounded above) – the length of a single edge of the instance can be
made arbitrarily large while the tour chosen by the heuristic stays the same
Twice-around-the-tree is an approximation algorithm for TSP with the performance ratio
of 2 for Euclidean graph. The algorithm is based on modifying a walk around a MST by
shortcuts.

Stage 1: Construct a minimum spanning tree of the graph (e.g., by Prim’s or Kruskal’s
algorithm)
Stage 2: Starting at an arbitrary vertex, create a path that goes twice around the tree
and returns to the same vertex
Stage 3: Create a tour from the circuit constructed in Stage 2 by making shortcuts to avoid
visiting intermediate vertices more than once

Note: RA = ∞ for general instances, but this algorithm tends to produce better tours
than the nearest-neighbor algorithm.
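
A runnable sketch of the three stages (the adjacency-matrix format and the sample
instance are ours; Prim's algorithm supplies the MST in Stage 1):

def twice_around_the_tree(dist, start=0):
    n = len(dist)
    # Stage 1: Prim's MST over the complete graph given by the distance matrix
    in_tree, parent = {start}, {}
    while len(in_tree) < n:
        w, u, p = min((dist[i][j], j, i)
                      for i in in_tree for j in range(n) if j not in in_tree)
        in_tree.add(u)
        parent[u] = p
    children = {v: [] for v in range(n)}
    for u, p in parent.items():
        children[p].append(u)
    # Stages 2-3: a DFS preorder of the MST is the doubled walk with shortcuts applied
    tour, stack = [], [start]
    while stack:
        v = stack.pop()
        tour.append(v)
        stack.extend(reversed(children[v]))
    tour.append(start)                    # return to the starting vertex
    return tour, sum(dist[tour[i]][tour[i + 1]] for i in range(n))

D = [[0, 1, 2, 3], [1, 0, 4, 2], [2, 4, 0, 1], [3, 2, 1, 0]]
print(twice_around_the_tree(D))   # e.g. ([0, 1, 2, 3, 0], 9)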

A sensible greedy algorithm for the knapsack problem is based on processing an input’s
items in descending order of their value-to-weight ratios. For the continuous version,
this algorithm always yields an exact optimal solution.
Greedy Algorithm for Knapsack Problem

Step 1: Order the items in decreasing order of relative values: v1/w1≥… ≥ vn/wn
Step 2: Select the items in this order skipping those that don’t fit into the knapsack

Example: The knapsack’s capacity is 16


item weight value v/w
1 2 $40 20
2 5 $30 6
3 10 $50 5
4 5 $10 2
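
A sketch of the two steps on the capacity-16 instance above; note that the greedy
result is not optimal here, which is exactly the point of the accuracy discussion
that follows:

def greedy_knapsack(items, W):
    # items: list of (weight, value); take them in decreasing value/weight order
    chosen, weight, value = [], 0, 0
    for w, v in sorted(items, key=lambda it: it[1] / it[0], reverse=True):
        if weight + w <= W:               # skip items that do not fit
            chosen.append((w, v))
            weight += w
            value += v
    return chosen, value

items = [(2, 40), (5, 30), (10, 50), (5, 10)]   # (weight, value) from the table
print(greedy_knapsack(items, 16))
# ([(2, 40), (5, 30), (5, 10)], 80) -- not optimal: {item 1, item 3} is worth 90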

Accuracy
 RA is unbounded (e.g., n = 2, C = m, w1=1, v1=2, w2=m, v2=m)
 yields exact solutions for the continuous version
Approximation Scheme for Knapsack Problem
Step 1: Order the items in decreasing order of relative values: v1/w1≥… ≥ vn/wn
Step 2: For a given integer parameter k, 0 ≤ k ≤ n, generate all subsets of k items or
fewer, and for each of those that fit the knapsack, add the remaining items in decreasing
order of their value-to-weight ratios
Step 3: Find the most valuable subset among the subsets generated in Step 2 and return it
as the algorithm’s output
• Accuracy: f(s*) / f(sa) ≤ 1 + 1/k for any instance of size n
• Time efficiency: O( k n^(k+1) )
• There are fully polynomial schemes: algorithms with polynomial running time as
functions of both n and k
