
Design and Analysis of Algorithms

1. Introduction
1.1 Notion of Algorithm
1.2 Fundamentals of Algorithmic Problem Solving
1.3 Important Problem Types
2. Analysis of Algorithm Efficiency
2.1 Analysis Framework
2.2 Asymptotic Notations and Basic Efficiency Classes
2.3 Mathematical Analysis of Nonrecursive Algorithms
2.4 Mathematical Analysis of Recursive Algorithms
2.5 Example: Fibonacci Numbers
3. Brute Force
3.1 Selection Sort
3.2 Bubble Sort
3.3 Sequential Search
3.4 Brute Force String Matching
3.5 Closest-Pair and Convex-Hull Problems by Brute Force
3.6 Exhaustive Search
4. Divide and Conquer
4.1 Merge Sort
4.2 Quick Sort
4.3 Binary Search
4.4 Binary Tree Traversals
4.5 Strassen's Matrix Multiplication
4.6 Closest-Pair and Convex-Hull Problem
5. Decrease and Conquer
5.1 Insertion Sort
5.2 Depth-First Search and Breadth-First Search
5.3 Topological Sorting
6. Transform and Conquer
6.1 Presorting
6.2 Horner's Rule

6.3 Binary Exponentiation


7. Space and Time Tradeoff
7.1 Sorting by Counting
7.2 Input Enhancement in String Matching
8. Dynamic Programming
8.1 Computing Binomial Coefficient
8.2 Warshall's Algorithm
8.3 Floyd's Algorithm
8.4 Optimal Binary Search Trees
8.5 Knapsack Problem
8.6 Memory Functions
9. Greedy Technique
9.1 Prim's Algorithm
9.2 Kruskal's Algorithm
9.3 Dijkstra's Algorithm
9.4 Huffman Tree
10. Limitation of Algorithm Power
10.1 Lower- Bound Arguments
10.2 Decision Tree
10.3 P, NP and NP-Complete Problems
11. Backtracking
11.1 4-Queen, 8-Queen and n-Queens problem
11.2 Hamiltonian Circuit
11.3 Sum of Subset problem
12. Branch and Bound
12.1 Assignment
12.2 Knapsack Problem
12.3 Traveling Salesman
13. Approximation Algorithms for NP-hard Problems
13.1 Traveling Salesman
13.2 Knapsack Problem

Design and Analysis of Algorithms


1. Introduction

1.1 Notion of Algorithm


The word algorithm comes from the name of a Persian author, Abu Ja'far Muhammad ibn Musa al-Khwarizmi. The term algorithm refers to a method that can be used by a computer to solve a problem. A reason for studying algorithms is their usefulness in developing analytical skills. Algorithms can be seen as a special kind of solution to problems: not answers, but precisely defined procedures for getting answers. Specific algorithm design techniques can be interpreted as problem-solving strategies that can be useful regardless of whether a computer is involved.

Definition: Algorithm is a finite set of instructions for solving a problem.


It must satisfy the following criteria
1. Input: Zero or more inputs are supplied.
2. Output: At least one output is produced.
3. Definiteness: Each instruction is clear and unambiguous.
4. Finiteness: The algorithm should terminate after a finite number of
steps.
5. Effectiveness: Every instruction must be very basic so that a
person using paper and pencil can carry it out in a finite amount of
time and it also must be feasible.
This definition can be illustrated by a simple diagram in Fig 1.1

Fig 1.1 Notion of Algorithm

An example illustrating the notion of algorithm: Euclid's algorithm for computing gcd(m, n)

gcd(m, n) = gcd(n, m mod n)

Ex: gcd(60, 24) = gcd(24, 60 mod 24) = gcd(24, 12) = gcd(12, 24 mod 12)
= gcd(12, 0) = 12

Step 1: Start
Step 2: If n is equal to zero then go to step 6
Step 3: Divide m by n and assign the value of the remainder to rem
Step 4: Assign the value of n to m and the value of rem to n
Step 5: go to Step 2
Step 6: Print the value of m
Step 7: Stop

We can express the same Euclid's algorithm in pseudocode:

ALGORITHM: Euclid(m, n)
// Computes gcd(m, n) by Euclid's algorithm
// Input: Two nonnegative integers m and n, not both zero
// Output: Greatest common divisor of m and n
while n ≠ 0 do
    r ← m mod n
    m ← n
    n ← r
return m
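
The same algorithm, rendered as a minimal Python sketch (the function name is chosen here for illustration):

def euclid_gcd(m, n):
    # Computes gcd(m, n) by Euclid's algorithm.
    while n != 0:
        m, n = n, m % n    # r <- m mod n; m <- n; n <- r
    return m

print(euclid_gcd(60, 24))    # prints 12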

1.2 Fundamentals of Algorithmic Problem Solving


We now briefly discuss a sequence of steps one typically goes through
in designing and analyzing an algorithm in Fig 1.2

Fig 1.2

Algorithm Design and Analysis Process

1. Understanding the problem


Before designing an algorithm, understand the problem, because a correct algorithm should work for all legitimate inputs.
It is important to specify exactly the range of inputs the algorithm needs to handle; otherwise it may fail on some boundary value.

2. Ascertaining the capabilities of a computational device


Most of the available computers are based on the von Neumann architecture and are modeled as random-access machines (RAM). The instructions in these machines are executed one after the other, one operation at a time. Algorithms designed to be executed on such machines are called sequential algorithms.

Parallel algorithms can be developed for parallel machines.


The instructions in these machines are executed concurrently.

3. Choosing between exact and approximate problem solving


There are some problems that cannot be solved exactly.
Ex: Computing the square root of a number.
There are other problems that can be solved exactly, but only by algorithms that are unacceptably slow because of the problem's intrinsic complexity.
Ex: Traveling salesman problem.

4. Deciding on Appropriate Data Structures


Some algorithms do not demand any structure in representing
their input but others depend on structuring and restructuring
data specifying a problem instance. Ex: Trees.
Data structures remain crucially important for both design and
analysis of algorithms.
Algorithms + Data Structures = Programs.

5. Designing an algorithm
An algorithm design technique is a general approach to solving problems algorithmically that is applicable to a variety of problems from different areas of computing.
These design techniques provide guidance for designing
algorithms for new problems, i.e. problems for which there is
no known satisfactory algorithm.
Algorithm design technique makes it possible to classify
algorithms according to an underlying design idea, therefore
they can serve as a natural way to both categorize and study
algorithms.

6. Methods of Specifying an Algorithm


There are different ways of specifying an algorithm:
Flowchart: a collection of connected geometric shapes containing descriptions of the algorithm's steps.
Pseudocode: a mixture of a natural language and programming-language-like constructs. Use an arrow ← for the assignment operation and two slashes // for comments.
Program: an algorithm written in a particular computer language. It can be considered as the algorithm's implementation.

7. Proving an Algorithm's Correctness


We have to prove that the algorithm yields a correct result for every legitimate input in a finite amount of time. We refer to this process as algorithm validation.
The purpose of validation is to assure us that the algorithm will work correctly independently of the issues concerning the programming language.
A common technique for proving correctness is to use mathematical induction, because an algorithm's iterations provide a sequence of steps needed for such proofs.
If the result is incorrect, the algorithm has to be redesigned.

8. Analyzing an Algorithm
Analysis of algorithms or performance analysis refers to the
task of determining how much computing time and storage an
algorithm requires.
There are two kinds of algorithm efficiency. They are time
efficiency and space efficiency.
Time efficiency indicates how fast the algorithm runs.

Space efficiency indicates how much extra memory the


algorithm requires.
Simple algorithms are easier to understand and program,
usually contain fewer bugs.
Algorithm for a problem must be designed in more general
terms.
Algorithms must be designed that can handle a range of inputs.
If we are not satisfied with algorithms efficiency, simplicity or
generality, need to redesign the algorithm.

9. Coding an Algorithm
Now the program can be written, i.e., the algorithm is implemented in a programming language.
Once a program is written, we should know how to test it. Testing a program consists of two phases: debugging and profiling (or performance measurement).
Debugging is the process of executing programs on sample data
sets to determine whether faulty results occur and correct them
if so. Debugging can point to the presence of errors but not to
their absence.
Profiling or performance measurement is the process of executing a correct program on data sets and measuring the time and space it takes to compute the results.
Input validation should be provided while implementing algorithms.
The analysis is based on timing the program on several inputs and then analyzing the results obtained.
The validity of a program is established by testing and debugging.

1.3 Important Problem Types


These problems are used to illustrate different algorithm design techniques
and methods of algorithm analysis. The important problem types are:
1. Sorting
2. Searching
3. String Processing
4. Graph problems
5. Combinatorial problems
6. Geometric problems
7. Numerical problems

1. Sorting
Arranging the items of a list in either ascending or descending order is called sorting.
Usually lists of numbers, characters, character strings and records have to be sorted.
Sorting is useful in many areas; the most important is searching.
E.g.: Arranging the names of students in the attendance register in alphabetical order.
Two properties of sorting need special attention:
A sorting algorithm is said to be stable if it preserves the relative order of any two equal elements in its input. Example: if an input list contains two equal elements in positions i and j where i < j, then in the sorted list they have to be in positions i' and j' respectively, such that i' < j'.
The second feature of a sorting algorithm is the amount of extra memory the algorithm requires. An algorithm is said to be in place if it requires no extra memory. There are sorting algorithms that are in place and ones that are not.
The sorting problem: rearrange the items of a given list in ascending order.
Input: A sequence of n numbers <a1, a2, ..., an>
Output: A reordering <a'1, a'2, ..., a'n> of the input sequence such that a'i ≤ a'j whenever i < j
Instance: The sequence <5, 3, 2, 8, 3>
Few examples of Sorting Algorithms are:
Selection sort
Insertion sort
Merge sort

2. Searching
Finding a given value, called the search key, in a given list is called searching.
E.g.: Searching a telephone number in a telephone directory.
Searching has to be considered in conjunction with two other operations: addition of an item to and deletion of an item from the data set. In such situations data structures and algorithms should be chosen to strike a balance among the requirements of each operation.
Few examples of Searching algorithms are
Linear Search
Binary Search

3. String Processing
A string is a sequence of characters.
Different types of strings are:
Text strings: consist of letters, numbers and special characters. E.g.: apple, 123, 1@xyz
Bit strings: consist of zeros and ones. E.g.: 1010, 111000111, 001
Some of the important string processing problems are:
String matching: Searching for a given word in a text
String concatenation: Adding one string to the end of
another string
String copy: Copying one string to another

4. Graph problems
A graph is a collection of points called vertices, some of which are connected by line segments called edges.
connected by line segments called edges
Graphs are used to model a wide variety of real-life applications,
including transportation and communication networks, project
scheduling, and games
Basic graph algorithms include:

Graph traversal algorithms: to visit all the vertices of a graph.
Shortest-path algorithms: to find the best route between two vertices in a graph.
Some graph problems cannot be solved in real time if the input size is
large. Some of them are:
Traveling Salesman Problem (TSP): finding the shortest tour through n cities that visits every city exactly once.
Graph-coloring problem: assigning the smallest number of colors to the vertices of a graph so that no two adjacent vertices are the same color. This problem arises in several applications, such as event scheduling: the events are represented by vertices that are connected by an edge if and only if the corresponding events cannot be scheduled at the same time.

5. Combinatorial problems
These problems ask to find a combinatorial object, such as a permutation, a combination or a subset, that satisfies certain constraints and has some desired property.
They are the most difficult problems because of the following reasons:
The number of combinatorial objects grows extremely fast with
a problems size
There are no known algorithms for solving most such problems
exactly in an acceptable amount of time
The traveling salesman problem and graph coloring problem are
examples of combinatorial problems

6. Geometric problems
They deal with geometric objects such as points, lines, and polygons.
Some of the geometric problems include constructing simple
geometric shapes like triangles, circle etc., closest-pair problem,
convex hull problem etc.
They find applications in computer graphics, robotics, and
tomography

7. Numerical problems
They involve mathematical objects of continuous nature: solving equations and systems of equations, computing definite integrals, evaluating functions and so on.
Most such problems can be solved only approximately.
They play an important role in scientific and engineering applications.

Design and Analysis of Algorithms


2. Analysis of Algorithm Efficiency

2.1 Analysis Framework


An algorithm will have to be analyzed for two kinds of efficiency:
Time efficiency: It indicates how fast an algorithm in question
runs
Space efficiency: It deals with the extra space the algorithm
requires

While analyzing an algorithm, the emphasis is more on time efficiency than on space efficiency, because of the following reasons:
With the advancement in electronic technology, the space efficiency of an algorithm is not of much concern, but the time required by an algorithm has not diminished to the same extent.
Research experience has shown that for most problems, far more spectacular progress can be achieved in speed than in space.
The framework for analyzing the time and the space efficiency is as
follows:

1. Measuring an Inputs Size


An algorithm's efficiency is measured as a function of some parameter n indicating the algorithm's input size, because almost all algorithms run longer on larger inputs.
Examples:
Sorting, searching algorithms: the input size is the size of the list.
Algorithm to evaluate a polynomial p(x) = a_n x^n + ... + a_0: the input size can be the polynomial's degree or the number of coefficients.
Algorithm to compute the product of two n x n matrices: the input size can be the order n of the matrices or the total number N of elements in the matrices.
Spell-checking algorithm: if the algorithm examines individual characters of its input, the input size is the number of characters; if it works with words, the input size is the number of words.
Algorithms involving properties of numbers: the input size can be the number of bits b in the number's binary representation:

b = ⌊log2 n⌋ + 1
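
As a quick sanity check of this formula, the following Python snippet (a minimal illustration) compares it with the built-in int.bit_length():

import math

n = 100
b = math.floor(math.log2(n)) + 1    # number of bits by the formula
print(b, n.bit_length())            # both print 7, since 100 = 1100100 in binary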

2. Units for measuring Running Time


Using some standard unit of time measurement like a second, a
millisecond and so on
Drawbacks:
Depends on the speed of a particular computer
Depends on the quality of a program implementing the
algorithm
Depends on the compiler used in generating the machine code
Clocking the actual running time is difficult
Count the number of times each of the algorithm's operations is executed.
Drawbacks:
Excessively difficult and usually unnecessary.
Count the number of times the most important operation of the algorithm, called the basic operation, is executed. The basic operation is usually the most time-consuming operation in the algorithm's innermost loop. E.g.: in sorting and searching algorithms the basic operation is the key comparison operation.
The running time and the basic operation count can be related by the following equation:

T(n) ≈ c_op * C(n)

where c_op is the time of execution of the algorithm's basic operation on a particular computer,
C(n) is the number of times the basic operation is executed, and
T(n) is the running time of a program implementing the algorithm.


The above equation makes it possible to answer certain questions without actually knowing the value of c_op.
Example: Assuming C(n) = (1/2) n (n - 1), how much longer will the algorithm run if the input size is doubled?
C(n) = (1/2) n (n - 1) = (1/2) n^2 - (1/2) n ≈ (1/2) n^2
and therefore
T(2n) / T(n) ≈ [c_op * C(2n)] / [c_op * C(n)] ≈ [(1/2) (2n)^2] / [(1/2) n^2] = 4

3. Orders of Growth
It specifies the effect of a variation in the input size on an algorithm's performance.
Example: A twofold increase in the input size n will cause a logarithmic function to increase by 1, a linear function to increase twofold, a quadratic function fourfold, and so on.
The count's order of growth for large input sizes is more important than for small input sizes, because a difference in running times on small inputs is not what really distinguishes efficient algorithms from inefficient ones.
The logarithmic functions are the functions growing the slowest,
whereas, the exponential and factorial functions grow so fast that their
values become astronomically large even for small values of n.

4. Worst-Case, Best-Case and Average-Case efficiencies

The running time of an algorithm not only depends on the input size
but also on the nature of the input on certain occasions
Depending on the nature of the input an algorithm can have three
kinds of efficiencies
Worst-Case efficiency
It is the efficiency of an algorithm when it runs the longest among all possible inputs of size n.
To determine this, find the inputs that yield the largest value of the basic operation's count.
It bounds the running time from above.
Best-Case efficiency
It is the efficiency of an algorithm when it runs the fastest among all possible inputs of size n.
To determine this, find the inputs that yield the smallest value of the basic operation's count.
It bounds the running time from below.
Average-Case efficiency
It is the efficiency of an algorithm on random inputs
To determine this, some assumptions about possible inputs of
size n have to be made
It is important because there are many algorithms for which the
average-case efficiency is much better than the worst-case
efficiency would lead us to believe
Example: Consider the sequential search algorithm given below
ALGORITHM SequentialSearch(A[0..n-1], K)
i ← 0
while i < n and A[i] ≠ K do
    i ← i + 1
if i < n
    return i
else
    return -1
Worst-case efficiency: It occurs when the key element is not found in the list or when the key element is found in the last location of the list: C_worst(n) = n
Best-case efficiency: It occurs when the key element is found in the first location of the list: C_best(n) = 1
Average-case efficiency: Assumptions are
a. The probability of a successful search is equal to p.
b. The probability of the first match occurring in the i-th position of the list is the same for every i.
In case of a successful search, the probability of the first match occurring in the i-th position of the list is p/n for every i, and the number of comparisons is then i; in case of an unsuccessful search, the number of comparisons is n, with the probability of such a search being (1 - p).
Therefore,
C_avg(n) = [1 * p/n + 2 * p/n + ... + i * p/n + ... + n * p/n] + n * (1 - p)
         = p/n [1 + 2 + ... + i + ... + n] + n * (1 - p)
         = p/n [n (n + 1) / 2] + n (1 - p)
         = p (n + 1) / 2 + n (1 - p)
Note: If the search is successful, p = 1 and the average number of comparisons made will be equal to (n + 1) / 2. If the search is unsuccessful, p = 0 and the number of comparisons made will be equal to n.
Amortized efficiency:
It applies not to a single run of an algorithm but rather to a
sequence of operations performed on the same data structure
This approach was discovered by the American computer
scientist Robert Tarjan
It is important because in some situations a single operation can be expensive, but the total time for an entire sequence of n such operations is always significantly better than the worst-case efficiency of that single operation multiplied by n.

2.2 Asymptotic Notations and Basic Efficiency Classes


There are three asymptotic notations: O (big oh), Ω (big omega), Θ (big theta)

Definition: O notation
A function t(n) is said to be in O(g(n)), denoted by t(n) ∈ O(g(n)), if t(n) is bounded above by some constant multiple of g(n) for all large n, i.e., if there exist some positive constant c and some nonnegative integer n0 such that t(n) ≤ c g(n) for all n ≥ n0.

The definition is illustrated in Fig 2.1

Fig 2.1 Big-oh notation: t(n) ∈ O(g(n))


Examples: n ∈ O(n^2), 25n + 10 ∈ O(n^2)
1) Let f(n) = 100n + 5 and let O(n^2) be the worst case for the algorithm.
Now we have to prove the constraint f(n) ≤ c * g(n) for all n ≥ n0.
To prove: 100n + 5 ∈ O(n^2)
100n + 5 ≤ 100n + n (for all n ≥ 5)
         = 101n
         ≤ 101 n^2
Thus c = 101 and n0 = 5.

2) Let f(n) = 3n + 3 and let O(n) be the worst case for the algorithm.
To prove: 3n + 3 ∈ O(n)
3n + 3 ≤ 3n + n (for all n ≥ 3)
       = 4n
Thus c = 4 and n0 = 3.
The definition gives us a lot of freedom in choosing specific values for the constants c and n0.

Definition 2: Ω notation
A function t(n) is said to be in Ω(g(n)), denoted by t(n) ∈ Ω(g(n)), if t(n) is bounded below by some positive constant multiple of g(n) for all large n, i.e., if there exist some positive constant c and some nonnegative integer n0 such that t(n) ≥ c g(n) for all n ≥ n0.

The definition is illustrated in Fig 2.2

Fig 2.2 Big-omega notation: t(n) ∈ Ω(g(n))

Examples for Omega notation:
1) n^3 belongs to Ω(n^2)
Let n^3 be the function f(n) of an algorithm and Ω(n^2) the lower bound, or best case, for that algorithm.
To prove the constraint f(n) ≥ c * g(n) for all n ≥ n0.
To prove: n^3 ∈ Ω(n^2)
n^3 ≥ n^2 (for all n ≥ 0)
Thus c = 1 and n0 = 0.

2) 100n + 6 belongs to Ω(n)
To prove: 100n + 6 ∈ Ω(n)
100n + 6 ≥ 100n ≥ n (for all n ≥ 0)
Thus c = 100 and n0 = 0.

The function g(n) is only a lower bound on f(n).

Definition 3: Θ notation
A function t(n) is said to be in Θ(g(n)), denoted by t(n) ∈ Θ(g(n)), if t(n) is bounded both above and below by some positive constant multiples of g(n) for all large n, i.e., if there exist some positive constants c1 and c2 and some nonnegative integer n0 such that c2 g(n) ≤ t(n) ≤ c1 g(n) for all n ≥ n0.

The definition is illustrated in Fig 2.3

Fig 2.3 Big-theta notation: t(n) ∈ Θ(g(n))

Examples for Θ notation:
1) (1/2) n (n - 1) belongs to Θ(n^2)
We have to prove that f(n) lies between both bounds: c2 * g(n) ≤ f(n) ≤ c1 * g(n).
First we shall prove the right inequality, i.e., the upper bound f(n) ≤ c1 * g(n):
(1/2) n (n - 1) = (1/2) n^2 - (1/2) n ≤ (1/2) n^2 (for all n ≥ 0)
Thus c1 in this case is 1/2 and n0 is 0.

Now the left inequality c2 * g(n) ≤ f(n), or f(n) ≥ c2 * g(n):
(1/2) n (n - 1) = (1/2) n^2 - (1/2) n
              ≥ (1/2) n^2 - (1/2) n * (1/2) n (for all n ≥ 2)
              = (1/4) n^2
Thus c2 = 1/4 and n0 = 2.

Hence c1 = 1/2, c2 = 1/4 and n0 = 2.

1. Using Properties Involving the Asymptotic Notations

Using the formal definitions of the asymptotic notations we can prove their general properties.

Theorem 1: If t1(n) ∈ O(g1(n)) and t2(n) ∈ O(g2(n)), then t1(n) + t2(n) ∈ O(max{g1(n), g2(n)}).

Proof:
t1(n) ∈ O(g1(n)) ⇒ t1(n) ≤ c1 g1(n) for all n ≥ n1    (1)
t2(n) ∈ O(g2(n)) ⇒ t2(n) ≤ c2 g2(n) for all n ≥ n2    (2)
Adding (1) and (2), for all n ≥ max{n1, n2}:
t1(n) + t2(n) ≤ c1 g1(n) + c2 g2(n)
             ≤ c3 g1(n) + c3 g2(n)    [where c3 = max{c1, c2}]
             = c3 [g1(n) + g2(n)]
             ≤ 2 c3 max{g1(n), g2(n)}
Hence t1(n) + t2(n) ∈ O(max{g1(n), g2(n)}).

2. Using Limits for Comparing Orders of Growth

To compare the orders of growth of two specific functions, a convenient method is to compute the limit of the ratio of the two functions. Three principal cases may arise:

lim (n→∞) t(n)/g(n) = 0 : t(n) has a smaller order of growth than g(n)
lim (n→∞) t(n)/g(n) = c > 0 : t(n) has the same order of growth as g(n)
lim (n→∞) t(n)/g(n) = ∞ : t(n) has a larger order of growth than g(n)

For computing limits, L'Hopital's rule can be used:
If lim (n→∞) f(n) = lim (n→∞) g(n) = ∞ and the derivatives f', g' exist, then
lim (n→∞) f(n)/g(n) = lim (n→∞) f'(n)/g'(n)

Example 1: Compare the orders of growth of (1/2) n (n - 1) and n^2.
lim (n→∞) [(1/2) n (n - 1)] / n^2 = (1/2) lim (n→∞) (n^2 - n) / n^2
                                  = (1/2) lim (n→∞) (1 - 1/n)
                                  = 1/2
Since the limit is equal to a positive constant, the two functions have the same order of growth.

Basic Efficiency Classes

Class    Name         Comments
1        constant     Short of best-case efficiencies, very few reasonable
                      examples can be given, since an algorithm's running time
                      typically goes to infinity when its input size grows
                      infinitely large.
log n    logarithmic  Typically, a result of cutting a problem's size by a
                      constant factor on each iteration of the algorithm.
n        linear       Algorithms that scan a list of size n (e.g., sequential
                      search) belong to this class.
n log n  n-log-n      Many divide-and-conquer algorithms, including mergesort
                      and quicksort in the average case, fall into this
                      category.
n^2      quadratic    Typically characterizes efficiency of algorithms with
                      two embedded loops. Elementary sorting algorithms and
                      certain operations on n-by-n matrices are standard
                      examples.
n^3      cubic        Typically characterizes efficiency of algorithms with
                      three embedded loops. Several nontrivial algorithms from
                      linear algebra fall into this class.
2^n      exponential  Typical for algorithms that generate all subsets of an
                      n-element set. Often, the term "exponential" is used in
                      a broader sense to include this and faster orders of
                      growth as well.
n!       factorial    Typical for algorithms that generate all permutations
                      of an n-element set.

2.3 Mathematical Analysis of Nonrecursive Algorithms


The general plan for analyzing efficiency of nonrecursive algorithms:
1. Decide on a parameter(s) indicating an input's size.
2. Identify the algorithm's basic operation.
3. Check whether the number of times the basic operation is executed
depends only on the input size. If it also depends on some additional
property, determine the worst-case, average-case, and best-case
complexities separately.
4. Find out C(n) [the number of times the algorithm's basic operation is
executed]
5. Using standard formulas establish the order of growth.

Example 1: Finding the maximum element in a list of n elements


ALGORITHM MaxElement(A[0..n-1])
//Determines the value of the largest element in a given array
//Input: An array A[0..n-1]
//Output: The value of the largest element in A
maxval ← A[0]
for i ← 1 to n - 1 do
    if A[i] > maxval
        maxval ← A[i]
return maxval

Explanation:
Assume the first element in the array to be the largest element and assign it to maxval: maxval ← A[0].
In the for loop, compare maxval with the remaining elements of the array one after the other, and overwrite maxval if any of the array elements happens to be greater than maxval.
Once the execution of the for loop is completed, return the largest value, which is stored in maxval.
Analysis:
Input size: Number of elements in the array, n.
Basic operation: Key comparison operation A[i] > maxval.
Let C(n) be the number of times the key comparison operation is performed.
The number of times the basic operation is executed depends only on the input size n; i.e., the key comparison operation is executed the same number of times irrespective of the position of the largest element in the array. Hence the efficiency is the same for all three cases.

C(n) = Σ (i = 1 to n-1) 1 = n - 1 ∈ Θ(n)
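
A minimal Python rendering of MaxElement (for illustration):

def max_element(a):
    # Determines the value of the largest element in a nonempty list a.
    maxval = a[0]
    for i in range(1, len(a)):
        if a[i] > maxval:    # basic operation: key comparison, executed n - 1 times
            maxval = a[i]
    return maxval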
Example 2: The element uniqueness problem: check whether all the elements in a given array are distinct.

ALGORITHM DistinctElements(A[0..n-1])
//Checks whether all the elements in a given array are distinct
//Input: An array A[0..n-1]
//Output: Returns "true" if A contains distinct elements, otherwise returns "false"
for i ← 0 to n - 2 do
    for j ← i + 1 to n - 1 do
        if A[i] = A[j] return false
return true
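
A minimal Python rendering of DistinctElements (for illustration):

def distinct_elements(a):
    # Returns True if all elements of a are distinct, False otherwise.
    n = len(a)
    for i in range(n - 1):
        for j in range(i + 1, n):
            if a[i] == a[j]:    # basic operation: key comparison
                return False
    return True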

Explanation:
In the first pass it compares the first element with all the remaining
elements in the array (starting from the second element) and returns
false if any of the elements is equal to the first element.
In the second pass it compares the second element with all the
remaining elements in the array (starting from the third element) and
returns false if any of the elements is equal to the second element.

The above process is repeated until the element in the position previous to the last position is compared with the element in the last position.
If all the elements in the array are different, the nested for loops do not return false, in which case true is returned, indicating that all the elements in the array are distinct.

Analysis:
Input Size: - The number of elements in the array, n.
Basic Operation: - Key Comparison operation if A[i] = A [j]
The number of times the key comparison operation is performed
depends on the positions of the equal elements, hence the algorithm
has best case, worst case and average case
Let C(n) be the number of times the key comparison operation
performed.
Best Case:
It occurs when the first two elements in the array are equal. Hence the key comparison operation will be executed only once:
C_best(n) = 1 ∈ Θ(1)
Worst Case:
It occurs when there are no equal elements, or when the last two elements are the only equal pair in the array. Hence the key comparison operation will be executed the maximum number of times:
C_worst(n) = Σ (i = 0 to n-2) Σ (j = i+1 to n-1) 1
           = Σ (i = 0 to n-2) [(n - 1) - (i + 1) + 1]
           = Σ (i = 0 to n-2) (n - 1 - i)
           = (n - 1) + (n - 2) + ... + 1
           = (n - 1) ((n - 1) + 1) / 2
           = n (n - 1) / 2
C(n) ∈ Θ(n^2)

Example 3: Matrix multiplication

ALGORITHM MatrixMultiplication(A[0..n-1, 0..n-1], B[0..n-1, 0..n-1])
//Multiplies two square matrices of order n by the definition-based algorithm
//Input: Two n-by-n matrices A and B
//Output: Matrix C = AB
for i ← 0 to n - 1 do
    for j ← 0 to n - 1 do
        C[i, j] ← 0.0
        for k ← 0 to n - 1 do
            C[i, j] ← C[i, j] + A[i, k] * B[k, j]
return C

Explanation:
The i loop indicates the row, j loop indicates the column and k loop
acts as an index of the column in the first matrix and index of the row
in the second matrix that are to be multiplied.
In the k loop, the matrix product is found as follows:
When k = 0, A[i, k] * B[k, j] multiplies the element in the first column of the i-th row of A with the element in the first row of the j-th column of B.
When k = 1, A[i, k] * B[k, j] multiplies the element in the second column of the i-th row of A with the element in the second row of the j-th column of B and adds it to the partial product stored in C[i, j].
During each iteration of the k loop the corresponding elements
in the ith row and the jth column are multiplied and product
obtained is added to the partial product produced so far, which
is stored in C [i, j].
Execution of the k loop for every value of i and j, computes one
corresponding value of the product matrix C.

Analysis:
Input Size: - Order of the matrix, n.
Basic Operation: - The multiplication operation A[i, k] * B[k, j]

Note: we can consider addition operation also as the basic operation


but since the multiplication operation is slower and less efficient
compared to addition operation, the multiplication operation is
considered as the basic operation for the analysis purpose.
Let C (n) be the number of times the multiplication operation
performed.
The number of times the multiplication operation is executed depends only on the input size, n. Hence all three cases are the same.

C(n) = Σ (i = 1 to n) Σ (j = 1 to n) Σ (k = 1 to n) 1
     = Σ (i = 1 to n) Σ (j = 1 to n) n
     = Σ (i = 1 to n) n^2
     = n^3
C(n) ∈ Θ(n^3)
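
A minimal Python rendering of the definition-based algorithm (for illustration; matrices are represented as lists of lists):

def matrix_multiply(a, b):
    # Multiplies two n-by-n matrices by the definition; Theta(n^3) multiplications.
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]    # basic operation: multiplication
    return c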

2.4 Mathematical Analysis of Recursive Algorithms

To analyze the efficiency of recursive algorithms, we use recurrence


relations or recurrences.
An equation that defines a function of a variable n implicitly as a function of its value at another point (e.g.: n - 1, n / 2, etc.) is called a recurrence relation.
Recurrence relations play an important role not only in analysis of
algorithms but also in some areas of applied mathematics.

The general plan for analyzing efficiency of Recursive algorithms:


1. Decide on a parameter (or parameters) indicating an input's size.
2. Identify the algorithm's basic operation.
3. Check whether the number of times the basic operation is executed
can vary on different inputs of the same size; if it can, the worst-case,
average-case, and best-case efficiencies must be investigated
separately.
4. Set up a recurrence relation, with an appropriate initial condition, for
the number of times the basic operation is executed.
5. Solve the recurrence or at least ascertain the order of growth of its
solution.

Example 1: To find the factorial of a given number using the function F(n) = n! for an arbitrary nonnegative integer n.
n! = 1 * ... * (n - 1) * n = (n - 1)! * n for n ≥ 1
0! = 1 by definition.

ALGORITHM: F(n)
//Computes n! recursively
//Input: A nonnegative integer n
//Output: The value of n!
if n = 0
    return 1
else
    return F(n - 1) * n
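
A minimal Python rendering (for illustration):

def factorial(n):
    # Computes n! recursively; performs C(n) = n multiplications, C(0) = 0.
    if n == 0:
        return 1
    return factorial(n - 1) * n    # basic operation: one multiplication per call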

Explanation:
If the given input is 0, it returns the result 1.
If the given input is any positive number, it recursively computes F(n) = F(n - 1) * n.

Analysis:
Input Size: - The number n whose factorial is to be found.
Basic operation: Multiplication operation F(n - 1) * n.
Let C(n) denote the number of times the multiplication operation is executed.
F(n) is computed using the formula F(n) = F(n - 1) * n. Since we don't know how many times the multiplication operation is performed within the call F(n - 1), let us denote it by C(n - 1); after computing F(n - 1), one more multiplication is performed to find F(n). Therefore, the following recurrence relation gives the number of times the basic operation is executed:
C(n) = C(n - 1) + 1 for n > 0
If n = 0, no multiplication operation is performed. Hence the initial condition is C(0) = 0.
Now, solving the recurrence C(n) = C(n - 1) + 1, we get
C(n) = C(n - 1) + 1            [substitute C(n - 1) = C(n - 2) + 1]
     = [C(n - 2) + 1] + 1 = C(n - 2) + 2
     = [C(n - 3) + 1] + 2 = C(n - 3) + 3
     ...
     = C(n - i) + i
     ...
When i = n, we get
C(n) = C(n - n) + n = C(0) + n = n
C(n) ∈ Θ(n)

Example 2: The Tower of Hanoi problem


ALGORITHM: Hanoi(source, temp, destination, n)
//Moves n disks recursively from the source peg to the destination peg using an
//auxiliary peg
//Input: Number of disks, n, to be moved from source to destination
//Output: The movement of the disks from source to destination
if n = 1
    move the disk from source to destination
    return
else
    Hanoi(source, destination, temp, n - 1)
    move the n-th disk from source to destination
    Hanoi(temp, source, destination, n - 1)
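
A minimal Python rendering of the algorithm (for illustration; the pegs are passed as labels):

def hanoi(source, temp, destination, n):
    # Moves n disks from source to destination; makes 2^n - 1 moves in total.
    if n == 1:
        print("move disk 1 from", source, "to", destination)
        return
    hanoi(source, destination, temp, n - 1)
    print("move disk", n, "from", source, "to", destination)
    hanoi(temp, source, destination, n - 1)

hanoi("A", "B", "C", 3)    # prints the 2^3 - 1 = 7 moves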

Fig 2.1
Explanation:
Move n - 1 disks recursively from the source peg to the auxiliary peg, using the destination peg as temporary storage.
Move the n-th disk from the source to the destination.
Move the n - 1 disks recursively from the auxiliary peg to the destination peg, using the source as temporary storage.
The above three steps are shown in Fig 2.1 above.

Analysis 1:
Input size: The number of disks, n.
Basic operation: Disk move operation from one peg to another

Let C(n) be the number of times the disk is moved


The number of disk moves in the first recursive call, from the source peg to the auxiliary peg, can be represented by C(n - 1). After moving the n - 1 disks from the source to the auxiliary peg, one disk move is required to move the n-th disk directly from the source to the destination. Similarly, the number of disk moves in the second recursive call, from the auxiliary peg to the destination peg, can be represented by C(n - 1).
Therefore, C(n) = C(n - 1) + 1 + C(n - 1) = 2 C(n - 1) + 1.
When there is only one disk, it is directly moved from the source peg to the destination peg. Hence the initial condition is C(1) = 1.
Now, solving the recurrence C(n) = 2 C(n - 1) + 1, we get
C(n) = 2 C(n - 1) + 1
     = 2 [2 C(n - 2) + 1] + 1 = 2^2 C(n - 2) + 2 + 1
     = 2^2 [2 C(n - 3) + 1] + 2 + 1 = 2^3 C(n - 3) + 2^2 + 2^1 + 2^0
     ...
     = 2^i C(n - i) + 2^(i-1) + ... + 2^2 + 2^1 + 2^0
     ...
When i = n - 1, we get
C(n) = 2^(n-1) C(1) + 2^(n-2) + ... + 2^2 + 2^1 + 2^0
     = 2^(n-1) + 2^(n-2) + ... + 2^2 + 2^1 + 2^0
     = Σ (i = 0 to n-1) 2^i
     = 2^n - 1 ∈ Θ(2^n)
Analysis 2:
The recursive calls made, and hence the disk moves, can also be represented by a complete binary tree, as shown in Fig 2.2 below.
We know that the number of nodes in such a complete binary tree with n levels is 2^n - 1.
Hence the efficiency of Tower of Hanoi ∈ Θ(2^n).

Fig. 2.2
Example 3: Counting the binary digits
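
The algorithm itself does not survive in this copy; a minimal Python sketch consistent with the explanation and analysis below (the name bin_rec is chosen here for illustration):

def bin_rec(n):
    # Counts the number of digits in the binary representation of n (n >= 1).
    if n == 1:
        return 1
    return bin_rec(n // 2) + 1    # basic operation: the addition of 1

print(bin_rec(13))    # prints 4, since 13 = 1101 in binary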

Explanation:
To convert a decimal integer to its equivalent binary form, we divide
the integer by 2 and each such division gives us one binary digit.
Hence in the else part of the algorithm the integer n is divided by 2,
count of the number of binary digits is incremented by one and the
divided number is passed as an argument to the function recursively
until the integer becomes equal to 1.
When n becomes equal to 1 it constitutes the last binary digit of n and
hence 1 is returned to add 1 to the accumulated count of the number of
binary digits.

Analysis:
Input size: The integer n.
Basic operation: The addition operation in BinRec(n / 2) + 1.
Let C(n) be the number of times the addition operation is performed.
Let C(n / 2) represent the number of times the addition operation is performed in the recursive call BinRec(n / 2). One more addition operation is performed to add 1 after returning from the recursive call. Hence, C(n) = C(n / 2) + 1.
When n becomes 1, no addition operation is performed. Hence the initial condition is C(1) = 0.
Now, solving the recurrence C(n) = C(n / 2) + 1:
Let n = 2^k; therefore, k = log2 n.
C(2^k) = C(2^(k-1)) + 1
       = C(2^(k-2)) + 2
       = C(2^(k-3)) + 3
       ...
       = C(2^(k-i)) + i
       ...
When i = k, we get
C(2^k) = C(2^(k-k)) + k = C(1) + k = k
Since n = 2^k and k = log2 n,
C(n) = log2 n
C(n) ∈ Θ(log n)

2.5 Example: Fibonacci Numbers

The Fibonacci sequence starts with the numbers 0 and 1, and each successive Fibonacci number is generated by adding the previous two Fibonacci numbers.
The first few numbers in the Fibonacci sequence are given below: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55

If the first two Fibonacci numbers are represented as F(0) = 0 and F(1)
= 1, the Fibonacci sequence can be generated by the recurrence F(n) =
F(n-1) + F(n-2).
The Fibonacci recurrence and the recurrence used to analyze the efficiency of the recursive algorithm to find the n-th Fibonacci number are examples of a homogeneous second-order linear recurrence with constant coefficients. Hence it is necessary to know how to solve such recurrences.
Homogeneous second-order linear recurrence with constant coefficients
A recurrence of the form a x(n) + b x(n - 1) + c x(n - 2) = 0 is called a homogeneous second-order linear recurrence with constant coefficients:
Homogeneous: a x(n) + b x(n - 1) + c x(n - 2) is equal to zero.
Second-order: the elements x(n) and x(n - 2) are two positions apart.
Linear: the terms are a linear combination of the unknown terms.
To solve this recurrence, a quadratic equation with the same coefficients as the recurrence, called the characteristic equation, is used:
Characteristic equation: a r^2 + b r + c = 0
If r1 and r2 are the roots of the characteristic equation, then the solution to the recurrence is obtained as follows:
If r1 and r2 are real and distinct, then
    x(n) = α r1^n + β r2^n
If r1 and r2 are equal (r1 = r2 = r), then
    x(n) = α r^n + β n r^n
If r1,2 = u ± iv are complex, then
    x(n) = γ^n (α cos nθ + β sin nθ)
where α and β are two arbitrary constants,
γ = sqrt(u^2 + v^2) and θ = arctan(v / u)

Solving the recurrence F(n) = F(n-1) + F(n-2):

F(n) = F(n-1) + F(n-2)
F(n) - F(n-1) - F(n-2) = 0, where a = 1, b = -1, c = -1
Characteristic equation: r^2 - r - 1 = 0
r = [1 ± sqrt(1 + 4)] / 2 = [1 ± sqrt(5)] / 2
r1 = (1 + sqrt(5)) / 2
r2 = (1 - sqrt(5)) / 2
The roots are real and distinct, hence
F(n) = α ((1 + sqrt(5)) / 2)^n + β ((1 - sqrt(5)) / 2)^n
Since F(0) = 0:
α ((1 + sqrt(5)) / 2)^0 + β ((1 - sqrt(5)) / 2)^0 = 0
α + β = 0    ---------- (1)
Since F(1) = 1:
α ((1 + sqrt(5)) / 2)^1 + β ((1 - sqrt(5)) / 2)^1 = 1
α (1 + sqrt(5)) + β (1 - sqrt(5)) = 2    ---------- (2)
From equation (1), α = -β.
Substituting the value of α in equation (2), we get:
-β (1 + sqrt(5)) + β (1 - sqrt(5)) = 2
β (-1 - sqrt(5) + 1 - sqrt(5)) = 2
β (-2 sqrt(5)) = 2
β = -1 / sqrt(5)
Since α = -β, α = 1 / sqrt(5)
Therefore,
F(n) = (1 / sqrt(5)) ((1 + sqrt(5)) / 2)^n - (1 / sqrt(5)) ((1 - sqrt(5)) / 2)^n
     = (1 / sqrt(5)) (φ^n - ψ^n)
where φ = (1 + sqrt(5)) / 2 (known as the golden ratio) and ψ = (1 - sqrt(5)) / 2
Example: To find the n-th Fibonacci number
ALGORITHM: Fib(n)
if n ≤ 1
    return n
else
    return Fib(n - 1) + Fib(n - 2)
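
A minimal Python rendering (for illustration):

def fib(n):
    # Computes the n-th Fibonacci number by the direct recursive definition.
    # The number of additions satisfies C(n) = C(n - 1) + C(n - 2) + 1,
    # so the running time itself grows exponentially, like the Fibonacci numbers.
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

print([fib(i) for i in range(8)])    # [0, 1, 1, 2, 3, 5, 8, 13]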

Explanation:
If n is equal to 0 or 1, it is returned as it is, since it is the 0th or 1st Fibonacci number respectively.
If n is greater than 1, then the n-th Fibonacci number is found recursively using the recurrence Fib(n) = Fib(n - 1) + Fib(n - 2) and returned.

Analysis:
Input size: The number n.
Basic operation: Addition operation Fib(n - 1) + Fib(n - 2).
Let C(n) be the number of times the addition operation is executed.
Let C(n - 1) represent the number of times the addition operation is executed in the recursive call Fib(n - 1), and C(n - 2) the number of times it is executed in the recursive call Fib(n - 2).
After finding Fib(n - 1) and Fib(n - 2), one more addition is performed to find the n-th Fibonacci number. Hence, the recurrence is
C(n) = C(n - 1) + C(n - 2) + 1.
When n is equal to 0 or 1, no addition operation is performed. Hence the initial conditions are C(0) = 0 and C(1) = 0.
Now, solving the recurrence:
C(n) = C(n - 1) + C(n - 2) + 1
C(n) - C(n - 1) - C(n - 2) = 1
Since the RHS = 1, it is an inhomogeneous second-order linear recurrence with constant coefficients.
To make it homogeneous, rewrite it in terms of C(n) + 1:
[C(n) + 1] - [C(n - 1) + 1] - [C(n - 2) + 1] = C(n) - C(n - 1) - C(n - 2) - 1 = 1 - 1 = 0    ------ (1)
Let b(n) = C(n) + 1    ------ (2)
Therefore, b(n - 1) = C(n - 1) + 1 and b(n - 2) = C(n - 2) + 1.
Substituting b(n) in equation (1), we get
b(n) - b(n - 1) - b(n - 2) = 0
with initial conditions b(0) = 1 and b(1) = 1.
This recurrence is the same as the Fibonacci recurrence
F(n) - F(n - 1) - F(n - 2) = 0
with initial conditions F(0) = 0 and F(1) = 1.
Both recurrences are similar, except that b(n) starts with two ones and thus runs one step ahead of F(n).
So, b(n) = F(n + 1)    ------ (3)
From (2) and (3):
C(n) = F(n + 1) - 1
     = (1 / sqrt(5)) ((1 + sqrt(5)) / 2)^(n+1) - (1 / sqrt(5)) ((1 - sqrt(5)) / 2)^(n+1) - 1
     = (1 / sqrt(5)) [φ^(n+1) - ψ^(n+1)] - 1

Design and Analysis of Algorithms


3. Brute Force

Brute force is a straightforward approach to solving a problem, based directly on the problem's statement and definitions of the concepts involved.

Advantages:
It is applicable to a wide variety of problems.
For some problems like sorting, searching, it yields reasonable
algorithms of at least some practical value with no limitation on
instance size.
The expense of designing a more efficient algorithm may be
unjustifiable if only a few instances of a problem need to be solved.
Even if too inefficient in general, a brute force algorithm can still be
useful for small size instances of a problem.

Examples:
1. Computing gcd(m,n)
2. Computing n!
3. Multiplying two matrices
4. Searching for a key of a given value in a list
5. Sorting the numbers or alphabets in ascending or descending order
6. String Matching

3.1. Selection Sort

Explanation:
During the first pass, scan the array to find its smallest element and
swap it with the first element.
During the second pass, start with the second element, scan the elements to the right of it to find the smallest among them, and swap it with the second element.
Generally, on pass i (0 ≤ i ≤ n - 2), find the smallest element in A[i..n-1] and swap it with A[i]:

A[0] ≤ ... ≤ A[i-1] | A[i], ..., A[min], ..., A[n-1]
(in their final positions)   (the last n - i elements)
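
The pseudocode for selection sort does not survive in this copy; a minimal Python sketch of the procedure just described (the variable name pos matches the comparison a[j] < a[pos] used in the analysis below):

def selection_sort(a):
    n = len(a)
    for i in range(n - 1):
        pos = i                          # index of the smallest element found so far
        for j in range(i + 1, n):
            if a[j] < a[pos]:            # basic operation: key comparison
                pos = j
        a[i], a[pos] = a[pos], a[i]      # one swap per pass, n - 1 swaps in total
    return a

print(selection_sort([89, 45, 68, 90, 29, 34, 17]))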

Example: sorting the list 89, 45, 68, 90, 29, 34, 17

        89 45 68 90 29 34 17
Pass 1: 17 45 68 90 29 34 89
Pass 2: 17 29 68 90 45 34 89
Pass 3: 17 29 34 90 45 68 89
Pass 4: 17 29 34 45 90 68 89
Pass 5: 17 29 34 45 68 90 89
Pass 6: 17 29 34 45 68 89 90

Analysis:
Input size: The number of elements in the array, n.
Basic operation: Key comparison operation a[j] < a[pos].
Let C(n) be the number of times the key comparison operation is executed.
The key comparison operation is executed in the innermost for loop and is given by

C(n) = Σ (i = 0 to n-2) Σ (j = i+1 to n-1) 1
     = Σ (i = 0 to n-2) [(n - 1) - (i + 1) + 1]
     = Σ (i = 0 to n-2) (n - 1 - i)
     = (n - 1) + (n - 2) + ... + 1
     = n (n - 1) / 2
     = (n^2 - n) / 2 ∈ Θ(n^2)
Note: The number of key swaps is only Θ(n), or more precisely n - 1.

3.2 Bubble Sort


ALGORITHM: BubbleSort(A[0..n-1])
//Sorts array A[0..n-1] by bubble sort
//Input: An array A[0..n-1] of orderable elements
//Output: Array A[0..n-1] sorted in ascending order
for i ← 0 to n - 2 do
    for j ← 0 to n - 2 - i do
        if A[j + 1] < A[j]
            swap A[j] and A[j + 1]
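
A minimal Python rendering of the pseudocode above (for illustration):

def bubble_sort(a):
    # Sorts list a in ascending order by repeatedly swapping adjacent elements.
    n = len(a)
    for i in range(n - 1):
        for j in range(n - 1 - i):
            if a[j + 1] < a[j]:          # basic operation: key comparison
                a[j], a[j + 1] = a[j + 1], a[j]
    return a

print(bubble_sort([89, 45, 68, 90, 29, 34, 17]))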

Explanation:
Compare the adjacent elements of the list and exchange them if they
are out of order

During the first pass, the largest element is bubbled up to the last position of the list.
The next pass bubbles up the second largest element, and so on:

A[0], ..., A[j] ↔ A[j + 1], ..., A[n - i - 1] | A[n - i] ≤ ... ≤ A[n - 1]
                                               (in their final positions)

Example: sorting the list 89, 45, 68, 90, 29, 34, 17

Pass 1:
89 45 68 90 29 34 17
45 89 68 90 29 34 17
45 68 89 90 29 34 17
45 68 89 90 29 34 17
45 68 89 29 90 34 17
45 68 89 29 34 90 17
45 68 89 29 34 17 90    (end of pass 1)

Pass 2:
45 68 89 29 34 17 90
45 68 89 29 34 17 90
45 68 29 89 34 17 90
45 68 29 34 89 17 90
45 68 29 34 17 89 90    (end of pass 2, and so on)

Analysis:
Input size: The number of elements of the array, n.
Basic operation: Key comparison operation A[j + 1] < A[j].
Let C(n) be the number of times the key comparison operation is executed.
The key comparison operation is executed in the innermost for loop and is given by

C(n) = Σ (i = 0 to n-2) Σ (j = 0 to n-2-i) 1
     = (n - 1) + (n - 2) + (n - 3) + ... + 1
     = n (n - 1) / 2
C(n) ∈ Θ(n^2)
Note: The number of key swaps depends on the input. For the worst case of decreasing arrays, it is the same as the number of key comparisons.

3.3 Sequential Search


ALGORITHM: SequentialSearch(A[0..n], K)
//Implements sequential search with the search key as a sentinel
//Input: An array A of n elements and a search key K
//Output: The position of the first element in A[0..n-1] whose value is equal to K, or -1 if no such element is found
A[n] ← K
i ← 0
while A[i] ≠ K do
    i ← i + 1
if i < n
    return i
else
    return -1
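
A minimal Python rendering of the sentinel version (for illustration):

def sequential_search(a, k):
    # Appends the search key as a sentinel so the loop needs no bounds test.
    n = len(a)
    a.append(k)          # sentinel
    i = 0
    while a[i] != k:
        i += 1
    a.pop()              # remove the sentinel, restoring the list
    return i if i < n else -1

print(sequential_search([89, 45, 68, 90], 68))    # prints 2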

Explanation:
The algorithm simply compares successive elements of a given list
with a given search key until either a match is encountered (successful
search) or the list is exhausted without finding a match (unsuccessful
search).
Analysis:
Input size: Number of elements in the list, n.
Basic operation: Key comparison operation A[i] ≠ K.
Let C(n) be the number of times the key comparison operation is
executed.
The number of times the key comparison operation is executed not
only depends on the input size but also on the position of the search
key in the given list. Hence the algorithm has worst-case, best-case
and average-case efficiencies.
Worst-case: It occurs when the key element is not found in the list or when the key element is found in the last location of the list.
C_worst(n) = Σ (i = 0 to n) 1 = n + 1 ∈ Θ(n)
Best-case: It occurs when the key element is found in the first location of the list.
C_best(n) = 1 ∈ Θ(1)
Average-case: Assumptions are
a. The probability of a successful search is equal to p.
b. The probability of the first match occurring in the i-th position of the list is the same for every i.
In case of a successful search, the probability of the first match occurring in the i-th position of the list is p/n for every i, and the number of comparisons is then i; in case of an unsuccessful search, the number of comparisons is n, with the probability of such a search being (1 - p).
Therefore,
C_avg(n) = [1 * p/n + 2 * p/n + ... + i * p/n + ... + n * p/n] + n * (1 - p)
         = p/n [1 + 2 + ... + i + ... + n] + n * (1 - p)
         = p/n [n (n + 1) / 2] + n (1 - p)
         = p (n + 1) / 2 + n (1 - p)
Note: If the search is successful, p = 1 and the average number of comparisons made will be equal to (n + 1) / 2. If the search is unsuccessful, p = 0 and the number of comparisons made will be equal to n.

3.4 Brute Force String Matching


Given a string of n characters called the text, and a string of m
characters called pattern (m <= n), find a substring of text that
matches a pattern.We have to find i, index of leftmost character of the
first matching substring in the text such that
ti = p0, . . . , ti+j = pj, . . . , ti+m-1 = pm-1

Text T
Pattern P

t0 . . . ti . . . ti+j . . . ti+m-1 . . . tn-1


p0

pj

pm-1

Explanation:
Align the pattern against the first m characters of the text.
Start matching corresponding pairs of characters from left to right
until either all the m pairs of characters match or a mismatching pair
is encountered.

If a mismatching pair is encountered, the pattern is shifted one position to the right and character comparisons are resumed, starting again with the first character of the pattern and its counterpart in the text.

Example:
Text: NOBODY_NOTICED_HIM
Pattern: NOT

N O B O D Y _ N O T I C E D _ H I M
N O T
  N O T
    N O T
      N O T
        N O T
          N O T
            N O T
              N O T
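
A minimal Python sketch of brute-force string matching (for illustration):

def brute_force_string_match(text, pattern):
    # Returns the index of the first occurrence of pattern in text, or -1.
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):
        j = 0
        while j < m and pattern[j] == text[i + j]:    # basic operation
            j += 1
        if j == m:
            return i
    return -1

print(brute_force_string_match("NOBODY_NOTICED_HIM", "NOT"))    # prints 7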
Analysis:
Input size: The number of characters in the text, n, and the number of characters in the pattern, m.
Basic operation: Comparison operation P[j] = T[i + j].
Let C(n) be the number of times the comparison operation is executed.
The number of times the comparison operation is executed depends on the position of the pattern in the text. Hence the algorithm has worst-case, best-case and average-case efficiencies.
Worst-case:
It occurs if the following happens for every position in the text: the first m - 1 pairs of corresponding characters of the text and pattern match, but a mismatch occurs at the m-th pair.
C_worst(n) = Σ (i = 0 to n-m) Σ (j = 0 to m-1) 1
           = Σ (i = 0 to n-m) m
           = m (n - m + 1)
           = mn - m^2 + m ≈ mn ∈ Θ(mn)
Best-case:
It occurs when the first m characters of the text match the corresponding characters of the pattern.
C_best(n) = Σ (j = 0 to m-1) 1 = m ∈ Θ(m)
Average-case:
It can be shown that for random inputs the efficiency is linear:
C_avg(n) = Θ(n + m) = Θ(n)
3.5 Closest-Pair and Convex-Hull Problems by Brute Force

Problem statement:
Find the two closest points in a set of n points (in the two-dimensional Cartesian plane), where the distance between two points p_i = (x_i, y_i) and p_j = (x_j, y_j) is given by
d(p_i, p_j) = sqrt((x_i - x_j)^2 + (y_i - y_j)^2)

Explanation:
Compute the distance between every pair of distinct points in the nested for loops.
dmin keeps track of the shortest distance found so far among the distances between any two points.
Return the indexes of the points for which the distance is the smallest.
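
The pseudocode does not survive in this copy; a minimal Python sketch of the brute-force closest-pair computation just described (names chosen for illustration):

import math

def closest_pair_brute_force(points):
    # points: a list of (x, y) tuples; returns (dmin, i, j) for the closest pair.
    n = len(points)
    dmin, best = float("inf"), (None, None)
    for i in range(n - 1):
        for j in range(i + 1, n):
            (xi, yi), (xj, yj) = points[i], points[j]
            d = math.sqrt((xi - xj) ** 2 + (yi - yj) ** 2)    # two squarings per pair
            if d < dmin:
                dmin, best = d, (i, j)
    return dmin, best[0], best[1]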

Analysis:
Input size: Number of points, n.
Basic operation: Computing the square in the expression d ← sqrt((x_i - x_j)^2 + (y_i - y_j)^2).
Let C(n) be the number of times the squaring operation is performed (two per pair of points):

C(n) = Σ (i = 1 to n-1) Σ (j = i+1 to n) 2
     = [(n - 1) + (n - 2) + ... + 0] + [(n - 1) + (n - 2) + ... + 0]
     = n (n - 1) / 2 + n (n - 1) / 2
     = 2 n (n - 1) / 2
     = n (n - 1) ∈ Θ(n^2)

Convex hull
Problem statement:
Given n points, find the smallest convex polygon enclosing all n points in the plane.
The convex hull of a set S of points is the smallest convex set containing S.
Issues:
Finding the extreme points that will serve as the vertices of the polygon.
Knowing which pairs of points need to be connected to form the boundary of the convex hull.
Solution:
A line segment connecting two points p_i and p_j of a set of n points is a part of the convex hull's boundary if and only if all the other points of the set lie on the same side of the straight line through these two points.
Repeating the above test for every pair of points gives a list of line segments that constitute the boundary of the convex hull.
The straight line through two points (x1, y1) and (x2, y2) is defined by the equation ax + by = c,
where a = y2 - y1, b = x1 - x2 and c = x1*y2 - y1*x2.
To check whether all points lie on the same side of the line, check whether the sign of the expression ax + by - c is the same at each of these points.
Analysis:
Input size: Number of points, n.
Basic operation: Evaluating the expression ax + by - c.
Let C(n) be the number of times the expression is evaluated.
The number of straight lines that can be formed out of n points is equal to n (n - 1) / 2.
Since the sign of the expression has to be checked at each of the points for each of the n (n - 1) / 2 lines,
C(n) = (n (n - 1) / 2) * n ∈ Θ(n^3)

3.6 Exhaustive Search


It is a brute force solution to a problem involving search for an element with a special property, usually among combinatorial objects such as permutations, combinations, or subsets of a set.
Most of the combinatorial problems are optimization problems that require finding an element that maximizes or minimizes some desired characteristic. E.g.: a path's length.
Exhaustive search suggests generating each and every element of the problem's domain, selecting those of them that satisfy the problem's constraints, and then finding a desired element.
Exhaustive search algorithms run in a realistic amount of time only on very small instances.

Method:
Construct a way of listing all potential solutions to the problem in a systematic manner such that:
All solutions are eventually listed.
No solution is repeated.
Evaluate solutions one by one, perhaps disqualifying infeasible ones, and keep track of the best one found so far.
When the search ends, announce the winner.

We illustrate exhaustive search by applying it to three important problems:


Traveling Salesman Problem
Knapsack Problem
Assignment Problem

Problem 1: The Traveling Salesman Problem

Problem Statement:
Given n cities with known distances between each pair, find the shortest tour that passes through all the cities exactly once before returning to the starting city.
It requires finding the shortest Hamiltonian circuit in a weighted connected graph.

Solution:
Generate the permutations of the n - 1 cities, leaving out the city from which the salesman has to start and to which he eventually returns. E.g.: if there are four cities a, b, c, d and the starting city is a, then generate the permutations of b, c and d, say b c d, b d c, etc.
Prefix and append the starting city to each one of the generated permutations. In the above example, add a at the beginning and end of each permutation: a b c d a, a b d c a, etc.
Example: Consider four cities a, b, c, d with distances ab = 2, ac = 8, ad = 5, bc = 3, bd = 4, cd = 7 (the weighted graph of the original figure).

Tour         Cost
a b c d a    2 + 3 + 7 + 5 = 17
a b d c a    2 + 4 + 7 + 8 = 21
a c b d a    8 + 3 + 4 + 5 = 20
a c d b a    8 + 7 + 4 + 2 = 21
a d b c a    5 + 4 + 3 + 8 = 20
a d c b a    5 + 7 + 3 + 2 = 17

Analysis:
Input size: The number of cities, n.
Basic operation: Generating a permutation of the intermediate n - 1 cities.
Let C(n) be the number of permutations generated.
Since we have to generate the permutations of n - 1 cities,
C(n) = (n - 1)! ∈ Θ(n!)
A close look at the generated permutations reveals that pairs of permutations are the same except for the direction in which the salesman travels. Hence C(n) can be reduced to (n - 1)! / 2, but the order of growth still belongs to the factorial class.
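
A minimal Python sketch of the exhaustive-search solution (for illustration; the distance matrix below encodes the four-city example above with a = 0, b = 1, c = 2, d = 3):

from itertools import permutations

def tsp_exhaustive(dist, start=0):
    # Tries all (n - 1)! tours that begin and end at the start city.
    n = len(dist)
    best_cost, best_tour = float("inf"), None
    for perm in permutations(c for c in range(n) if c != start):
        tour = (start,) + perm + (start,)
        cost = sum(dist[tour[k]][tour[k + 1]] for k in range(n))
        if cost < best_cost:
            best_cost, best_tour = cost, tour
    return best_tour, best_cost

d = [[0, 2, 8, 5], [2, 0, 3, 4], [8, 3, 0, 7], [5, 4, 7, 0]]
print(tsp_exhaustive(d))    # ((0, 1, 2, 3, 0), 17), i.e., tour a b c d a of cost 17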

Problem 2: The Knapsack Problem

Problem Statement:
Given n items with weights w1, w2, ..., wn and values v1, v2, ..., vn, and a knapsack of capacity W, find the most valuable subset of the items that fits into the knapsack.

Solution:
Generate all the subsets of the set of n items given.
Compute the total weight of each subset.
Identify the feasible subsets, i.e., those subsets whose total weight does not exceed the knapsack's capacity.
Find the subset of the largest value among them.

Example:
Knapsack capacity W = 16

item  weight  value
1     2       $20
2     5       $30
3     10      $50
4     5       $10

Subset          Total weight  Total value
{1}             2             $20
{2}             5             $30
{3}             10            $50
{4}             5             $10
{1, 2}          7             $50
{1, 3}          12            $70
{1, 4}          7             $30
{2, 3}          15            $80
{2, 4}          10            $40
{3, 4}          15            $60
{1, 2, 3}       17            not feasible
{1, 2, 4}       12            $60
{1, 3, 4}       17            not feasible
{2, 3, 4}       20            not feasible
{1, 2, 3, 4}    22            not feasible

The most valuable feasible subset is {2, 3}, with total weight 15 and total value $80.

Analysis:
Input size: Number of items, n.
Basic operation: Generating a subset.
Let C(n) be the number of subsets generated.
Since there are 2^n subsets of an n-element set,
C(n) = 2^n ∈ Θ(2^n)
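
A minimal Python sketch of the exhaustive-search solution (for illustration; the weights and values are those of the example above):

from itertools import combinations

def knapsack_exhaustive(weights, values, capacity):
    # Tries all 2^n subsets; returns (best value, best subset of item indexes).
    n = len(weights)
    best_value, best_subset = 0, ()
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            w = sum(weights[i] for i in subset)
            if w <= capacity:                       # feasibility check
                v = sum(values[i] for i in subset)
                if v > best_value:
                    best_value, best_subset = v, subset
    return best_value, best_subset

print(knapsack_exhaustive([2, 5, 10, 5], [20, 30, 50, 10], 16))    # (80, (1, 2))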

Problem 3: The Assignment Problem

Problem Statement:
There are n people who need to be assigned to n jobs, one person per job. The cost of assigning person i to job j is C[i, j]. Find an assignment that minimizes the total cost.

Solution:
Represent the costs incurred by assigning the i-th person to the j-th job in the form of a cost matrix.
Select one element in each row of the matrix so that all selected elements are in different columns and the total sum of the selected elements is the smallest possible.
A feasible solution can be described as an n-tuple <j1, ..., jn> in which the i-th component, i = 1..n, indicates the column of the element selected in the i-th row.
Hence the problem requires generating all the permutations of the integers from 1 to n, computing the total cost of each assignment, and finally selecting the one with the smallest sum.

Example: Consider the cost matrix

          Job 1  Job 2  Job 3  Job 4
Person 1    9      2      7      8
Person 2    6      4      3      7
Person 3    5      8      1      8
Person 4    7      6      9      4

Assignment     Total Cost
<1, 2, 3, 4>   9 + 4 + 1 + 4 = 18
<1, 2, 4, 3>   9 + 4 + 8 + 9 = 30
<1, 3, 2, 4>   9 + 3 + 8 + 4 = 24
<1, 3, 4, 2>   9 + 3 + 8 + 6 = 26
<1, 4, 2, 3>   9 + 7 + 8 + 9 = 33
<1, 4, 3, 2>   9 + 7 + 1 + 6 = 23
etc.

Analysis:
Input size: The number of jobs, n, and the number of people, n.
Basic operation: Generating a permutation (an n-tuple).
Let C(n) be the number of permutations generated.
Since we have to generate all permutations of an n-tuple,
C(n) = n! ∈ Θ(n!)

Design and Analysis of Algorithms


4. Divide-and-Conquer
The general plan of divide-and-conquer is as follows:
A problem's instance is divided into several smaller instances of
the same problem, ideally of about the same size.
The smaller instances are solved.
If necessary the solutions obtained for the smaller instances are
combined to get a solution to the original problem.

It is ideally suited for parallel computations, in which each subproblem
can be solved simultaneously by its own processor.
The Divide-and-Conquer technique depicting the case of dividing a
problem into two smaller subproblems is shown in Fig. 4.1.
In general an instance of size n can be divided into several instances
of size n/b with a of them needing to be solved where a and b are
constants such that a >= 1, b > 1.

Fig 4.1 Divide-and-conquer technique


Few examples of Divide and Conquer technique
Sorting: mergesort and quicksort
Binary tree traversals
Binary search
Multiplication of large integers
Matrix multiplication: Strassens algorithm
Closest-pair and convex-hull algorithms
General Divide-and-Conquer Recurrence:

The running time of a problem of size n, which is divided into several


instances of size n/b with a of them needing to be solved (a >= 1 and
b > 1) is given by the following recurrence, called as the general
divide-and-conquer recurrence.
T(n) = aT(n/b) + f (n)
The order of growth of the general divide-and-conquer recurrence is
given by the Master theorem, which states that
if f(n) ∈ Θ(n^d), d ≥ 0, then the order of growth of T(n) can be
estimated as follows:
If a < b^d, then T(n) ∈ Θ(n^d)
If a = b^d, then T(n) ∈ Θ(n^d log n)
If a > b^d, then T(n) ∈ Θ(n^(log_b a))

Note: The same results hold with O instead of Θ.
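For example, for the recurrence T(n) = 2T(n/2) + n, which arises in
mergesort, we have a = 2, b = 2 and f(n) ∈ Θ(n^1), so d = 1. Since
a = b^d, the second case applies and T(n) ∈ Θ(n log n).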

4.1 Merge Sort

Explanation:
Split array A[0..n-1] into about two equal halves and make copies of
each half in arrays B and C
Sort arrays B and C recursively
Merge the sorted arrays B and C into array A as follows:
Repeat the following until no elements remain in one of the
arrays:
Compare the first elements in the remaining unprocessed
portions of the arrays

Copy the smaller of the two into A, while incrementing


the index indicating the unprocessed portion of that array
Once all elements in one of the arrays are processed, copy the
remaining unprocessed elements from the other array into A.
Mergesort is a perfect example of a successful application of the
divide and conquer technique.
Given a sequence of n elements A[0], A[1], . . . , A[n − 1], split it
into two sets (A[0], . . . , A[⌊n/2⌋ − 1]) and (A[⌊n/2⌋], . . . ,
A[n − 1]); each set is individually sorted and the resulting sorted
sequences are merged to obtain a single sorted sequence of n elements.
Example: Let us consider the following array elements
8 3 2 9 7 1 5 4

Splitting:
8 3 2 9 7 1 5 4
8 3 2 9     |     7 1 5 4
8 3  |  2 9    7 1  |  5 4

Merging:
3 8  |  2 9    1 7  |  4 5
2 3 8 9     |     1 4 5 7
1 2 3 4 5 7 8 9

Analysis:
Input size: Number of elements in the array, n.

Basic operation: Key comparison operation B[i] <= C[j].


Let C(n) be the number of times the basic operation is executed
The recurrence for C(n) is as follows:
C(n) = 0                                           for n = 1
C(n) = C(⌊n/2⌋) + C(⌈n/2⌉) + Cmerge(n)             for n > 1

Worst Case:
It occurs when the smaller elements come from alternate arrays while
merging.
So, Cmerge(n) = n − 1
Thus, C(n) = 2 C(n/2) + n − 1
Let n = 2^k, therefore k = log2 n
C(2^k) = 2 C(2^(k−1)) + 2^k − 1
       = 2 [2 C(2^(k−2)) + 2^(k−1) − 1] + 2^k − 1
       = 2^2 C(2^(k−2)) + 2 · 2^k − [2^1 + 2^0]
       = 2^3 C(2^(k−3)) + 3 · 2^k − [2^2 + 2^1 + 2^0]
       = . . .
       = 2^i C(2^(k−i)) + i · 2^k − [2^(i−1) + . . . + 2^1 + 2^0]
when i = k
       = 2^k C(1) + k · 2^k − [2^(k−1) + . . . + 2^1 + 2^0]
       = k · 2^k − [2^k − 1]     (because 2^(k−1) + . . . + 2^0 = 2^k − 1)
       = k · 2^k − 2^k + 1
       = n log2 n − n + 1        (because n = 2^k and k = log2 n)
C(n) ∈ Θ(n log2 n)

Best Case:
It occurs when all the elements of one of the subarrays are either
greater than the last element of the other subarray or smaller than the
first element of the other subarray.
So, Cmerge(n) = n / 2
Thus, C(n) = 2 C(n/2) + n / 2
Solving the recurrence shows that C(n) ∈ Θ(n log2 n)
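A minimal Python sketch of the scheme just analyzed (recursive splitting
plus the merge loop); names are ours:

def mergesort(a):
    """Sort a list by the divide-and-conquer scheme described above."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    b = mergesort(a[:mid])        # sort the left half (copy B)
    c = mergesort(a[mid:])        # sort the right half (copy C)
    merged, i, j = [], 0, 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:          # the basic operation B[i] <= C[j]
            merged.append(b[i]); i += 1
        else:
            merged.append(c[j]); j += 1
    merged.extend(b[i:])          # copy the unprocessed tail
    merged.extend(c[j:])
    return merged

print(mergesort([8, 3, 2, 9, 7, 1, 5, 4]))   # [1, 2, 3, 4, 5, 7, 8, 9]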

4.2 Quick Sort


ALGORITHM Quicksort(A[l..r])
//Sorts a subarray by quicksort
//Input: A subarray A[l..r] of A[0..n − 1] defined by its left and right
//       indices l and r
//Output: The subarray A[l..r] sorted in nondecreasing order
if l < r
    s ← Partition(A[l..r])      // s is a split position
    Quicksort(A[l..s − 1])
    Quicksort(A[s + 1..r])

Select a pivot (partitioning element); here, the first element.
Rearrange the list so that all the elements in the first s positions are
smaller than or equal to the pivot and all the elements in the remaining
n − s positions are larger than or equal to the pivot.
Sort the two subarrays recursively.

Quicksort is an application of the divide and conquer technique.
Example: Let us consider the following array elements
40 20 23 54 15 18 85 23 78 98
Pass 1:
Pivot = 40, i = 1 and j = 9
Start comparing the pivot with a[i], incrementing i until a[i] becomes
greater than the pivot:
20 < 40, increment i; 23 < 40, increment i; 54 > 40, stop (i = 3).
Now compare the pivot with a[j], decrementing j until a[j] becomes less
than the pivot:
98 > 40, decrement j; 78 > 40, decrement j; 23 < 40, stop (j = 7).
Since i < j, swap a[i] = 54 and a[j] = 23:
40 20 23 23 15 18 85 54 78 98
Continue incrementing i and decrementing j in the same manner until i
becomes greater than j:
23 < 40, increment i; 15 < 40, increment i; 18 < 40, increment i;
85 > 40, stop (i = 6);
54 > 40, decrement j; 85 > 40, decrement j; 18 < 40, stop (j = 5).
Since i has now become greater than j, we swap a[j] = 18 with the pivot
40 to achieve the first partition, in which all the elements to the left
of 40 are less than or equal to 40 and all the elements to the right of
40 are greater than or equal to 40:
18 20 23 23 15 | 40 | 85 54 78 98
----- <= 40 --- pivot --- >= 40 -----
Similarly, sort the two subarrays 18 20 23 23 15 and 85 54 78 98
recursively until all the elements are sorted.
Analysis:
Input size: Number of elements in the array, n.
Basic operation: Key comparison operation A[i] >= p and A[i] <= p.
Let C(n) be the number of times the key comparison operation is
executed.
Best case:
It occurs when all the splits happen in the middle of the
corresponding subarrays.
For the split to happen in the middle of the array, n comparisons
are needed, and after which the array will be divided into two
equal subarrays.
Therefore,
C(n) = 0                          for n = 1
C(n) = 2 C(n/2) + n               for n > 1

Solving the recurrence,
Let n = 2^k, therefore k = log2 n
C(2^k) = 2 C(2^(k−1)) + 2^k
       = 2 [2 C(2^(k−2)) + 2^(k−1)] + 2^k
       = 2^2 C(2^(k−2)) + 2 · 2^k
       = . . .
       = 2^i C(2^(k−i)) + i · 2^k
When i = k,
       = 2^k C(1) + k · 2^k = k · 2^k = n log2 n
C(n) ∈ Θ(n log2 n)

Worst case:
It occurs when the splits happen at the extremes.
For the split to happen at an extreme, (n + 1) comparisons are
needed, after which one of the two subarrays will be empty
while the size of the other will be just one less than the size of the
array being partitioned.
Therefore,
C(n) = 0                          for n = 1
C(n) = C(n − 1) + (n + 1)         for n > 1

C(n) = C(n − 1) + (n + 1)
     = [C(n − 2) + n] + (n + 1)
     = C(n − 2) + n + (n + 1)
     = [C(n − 3) + (n − 1)] + n + (n + 1)
     = C(n − 3) + (n − 1) + n + (n + 1)
     = . . .
     = C(n − i) + (n − i + 2) + . . . + n + (n + 1)
when i = n − 1,
     = C(1) + (n − (n − 1) + 2) + . . . + n + (n + 1)
     = 3 + . . . + n + (n + 1)
     = (n + 1)(n + 2)/2 − 3
C(n) ∈ Θ(n^2)
Average case:
(n + 1) comparisons are made before partitioning. Let k be the position
of the pivot element after partitioning, so that the left part has
k − 1 elements and the right part has n − k elements.
Assuming that the partition split can occur at each position k,
1 ≤ k ≤ n, with the same probability 1/n, we get the recurrence

C(n) = (n + 1) + (1/n) Σ_{k=1}^{n} [C(k − 1) + C(n − k)]     . . . equation 1

where
(n + 1): comparisons made before partitioning
1/n: probability that the partition split occurs at position k
C(k − 1): cost of solving the left part
C(n − k): cost of solving the right part

Multiplying equation 1 by n:
n C(n) = n(n + 1) + Σ_{k=1}^{n} [C(k − 1) + C(n − k)]
       = n(n + 1) + [C(0) + C(n − 1) + C(1) + C(n − 2) + . . . + C(n − 1) + C(0)]
       = n(n + 1) + 2 [C(0) + C(1) + . . . + C(n − 1)]       . . . equation 2

Replacing n by n − 1 in equation 2:
(n − 1) C(n − 1) = (n − 1)n + 2 [C(0) + C(1) + . . . + C(n − 2)]   . . . equation 3

Subtracting equation 3 from equation 2:
n C(n) − (n − 1) C(n − 1) = n^2 + n − n^2 + n + 2 C(n − 1)
                          = 2n + 2 C(n − 1)
n C(n) = 2n + 2 C(n − 1) + (n − 1) C(n − 1)
       = 2n + C(n − 1)(2 + n − 1)
n C(n) = 2n + (n + 1) C(n − 1)                               . . . equation 4

Dividing equation 4 by n(n + 1):
C(n)/(n + 1) = 2/(n + 1) + C(n − 1)/n
             = [C(n − 2)/(n − 1) + 2/n] + 2/(n + 1)
             = [C(n − 3)/(n − 2) + 2/(n − 1)] + 2/n + 2/(n + 1)
             = . . .
             = C(0)/1 + 2/2 + 2/3 + . . . + 2/n + 2/(n + 1)
             = 0 + 2 [1/2 + 1/3 + . . . + 1/n + 1/(n + 1)]
             = 2 Σ_{k=2}^{n+1} 1/k ≈ 2 ∫_{1}^{n} dx/x = 2(ln n − ln 1) = 2 ln n

Hence C(n) ≈ 2(n + 1) ln n ≈ 1.39 n log2 n, and
C(n) ∈ O(n log n)
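A minimal Python sketch of quicksort with the two-scan partition
described above (first element as pivot); names are ours:

def quicksort(a, l=0, r=None):
    """In-place quicksort with the first element as the pivot."""
    if r is None:
        r = len(a) - 1
    if l < r:
        s = partition(a, l, r)     # s is a split position
        quicksort(a, l, s - 1)
        quicksort(a, s + 1, r)
    return a

def partition(a, l, r):
    pivot = a[l]
    i, j = l, r + 1
    while True:
        i += 1
        while i <= r and a[i] < pivot:   # scan right for a[i] >= pivot
            i += 1
        j -= 1
        while a[j] > pivot:              # scan left for a[j] <= pivot
            j -= 1
        if i >= j:                       # the scans have crossed
            break
        a[i], a[j] = a[j], a[i]
    a[l], a[j] = a[j], a[l]              # put the pivot in its final place
    return j

print(quicksort([40, 20, 23, 54, 15, 18, 85, 23, 78, 98]))
# [15, 18, 20, 23, 23, 40, 54, 78, 85, 98]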

4.3 Binary Search


Binary search is a remarkably efficient algorithm for searching in a
sorted array.
It works by comparing a search key K with the array's middle element,
say A[m].
If the key K matches the middle element of the array, the algorithm
stops; otherwise the same operation is repeated recursively for the
first half of the array if K < A[m], and for the second half if
K > A[m].

K  vs  A[0] . . . A[m] . . . A[n − 1]

ALGORITHM BinarySearch(A[0..n − 1], K)
l ← 0; r ← n − 1
while l ≤ r do
    m ← ⌊(l + r)/2⌋
    if K = A[m] return m
    else if K < A[m] r ← m − 1
    else l ← m + 1
return −1
Analysis:
Input size: Number of elements in the array, n.
Basic operation: Key comparison operation K = A[m].
Let C(n) be the number of times the key comparison operation is
executed.
Best case:
It occurs when the key element is in the middle of the array.
In this case the number of comparisons required is only one.
Therefore C(n) = 1 ∈ Θ(1)
Worst Case:
It occurs when the key element is not found, or it is found in the
last comparison before l becomes greater than r.
The number of key comparisons in this case is given by the
following recurrence:
C(n) = C(⌊n/2⌋) + 1      for n > 1
C(1) = 1                  for n = 1

Solving the recurrence,
Let n = 2^k, therefore k = log2 n
C(2^k) = C(2^(k−1)) + 1
       = C(2^(k−2)) + 1 + 1 = C(2^(k−2)) + 2
       = C(2^(k−3)) + 3
       = . . .
       = C(2^(k−i)) + i
When i = k,
       = C(1) + k
       = 1 + log2 n
C(n) ∈ Θ(log2 n)
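A minimal Python sketch of the iterative algorithm above; names and the
demo array are ours:

def binary_search(a, key):
    """Iterative binary search in a sorted list; returns an index or -1."""
    l, r = 0, len(a) - 1
    while l <= r:
        m = (l + r) // 2          # middle position
        if key == a[m]:
            return m
        elif key < a[m]:
            r = m - 1             # continue in the left half
        else:
            l = m + 1             # continue in the right half
    return -1

a = [3, 14, 27, 31, 39, 42, 55, 70, 74, 81, 85, 93, 98]
print(binary_search(a, 70))   # 7
print(binary_search(a, 10))   # -1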

4.4 Binary Tree Traversals


Binary tree is a divide-and-conquer ready structure! In Fig 4.2 the binary
tree is shown, which can be traversed in preorder, postorder and inorder

Fig 4.2
Binary Tree
Ex. 1: Classic traversals (preorder, inorder, postorder)
All tree traversals visit the nodes of a binary tree recursively by
visiting the tree's root and its left and right subtrees. They differ
only in the timing of the root's visit.
Preorder Traversal: The root is visited before the left and right
subtrees are visited (Root → Left → Right).
Inorder Traversal: The root is visited after visiting its left subtree
but before visiting its right subtree (Left → Root → Right).
Postorder Traversal: The root is visited after visiting the left and
right subtrees (Left → Right → Root).
Algorithm Inorder(T)
if T ≠ ∅
    Inorder(Tleft)
    print(root of T)
    Inorder(Tright)

Algorithm Preorder(T)
if T ≠ ∅
    print(root of T)
    Preorder(Tleft)
    Preorder(Tright)

Algorithm Postorder(T)
if T ≠ ∅
    Postorder(Tleft)
    Postorder(Tright)
    print(root of T)
Efficiency: Θ(n)
Note: The analysis is same as the analysis for the algorithm to find the
height of a binary tree
Ex. 2: Computing the height of a binary tree
A binary tree T is defined as a finite set of nodes that is either empty
or consists of a root and two disjoint binary trees TL and TR, called
the left and right subtrees of the root.
Since the definition itself divides a binary tree into two smaller
structures of the same type, the left subtree and the right subtree,
many problems about binary trees can be solved by applying the
divide-and-conquer technique.
We consider a recursive algorithm to find the height of a binary tree.
The height of a tree is defined as the length of the longest path from
the root to a leaf.
Therefore the height can be computed as the maximum of the heights of
the root's left and right subtrees plus 1.
We add 1 to account for the extra level of the root.
It is convenient to define the height of the empty tree as −1.

Algorithm Height(T)
//Computes the height of a binary tree recursively.
//Input: Binary tree T.
//Output: Height of T.
if T = ∅
    return −1        //Empty tree
else
    return max{Height(TL), Height(TR)} + 1

Analysis:
Input size: Number of nodes in the tree, n.
Basic operation: The comparison T = ∅.
Let C(n) be the number of times the comparison operation is performed.
The comparison T = ∅ is performed once at each internal node and twice
at each leaf node.
To see that the comparison is performed twice at the leaves, the binary
tree can be extended to form an extended binary tree, in which each leaf
of the original tree gets two external nodes, usually drawn as square
boxes.
In such an extended binary tree the number of external (square) nodes is
one greater than the number of internal nodes n, i.e. the number of
external nodes = n + 1.
Now the number of comparisons performed can be computed by simply
counting the nodes of the extended binary tree:
C(n) = number of internal nodes + number of external nodes
     = n + (n + 1)
     = 2n + 1 ∈ Θ(n)
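A minimal Python sketch of the traversal and height computations above,
using a simple node class of our own:

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def height(t):
    """Height by divide-and-conquer; the empty tree has height -1."""
    if t is None:
        return -1
    return max(height(t.left), height(t.right)) + 1

def inorder(t, out):
    """Left -> Root -> Right."""
    if t is not None:
        inorder(t.left, out)
        out.append(t.value)
        inorder(t.right, out)
    return out

# A small tree:   b
#                / \
#               a   c
root = Node("b", Node("a"), Node("c"))
print(height(root))        # 1
print(inorder(root, []))   # ['a', 'b', 'c']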

4.5 Strassens Matrix Multiplication


The product C of two 2-by-2 matrices A and B can be found by the
following formulas:

[c00  c01]   [a00  a01]   [b00  b01]
[c10  c11] = [a10  a11] * [b10  b11]

           [m1 + m4 − m5 + m7        m3 + m5          ]
         = [m2 + m4                  m1 + m3 − m2 + m6]

where,
m1 = (a00 + a11) * (b00 + b11)
m2 = (a10 + a11) * b00
m3 = a00 * (b01 − b11)
m4 = a11 * (b10 − b00)
m5 = (a00 + a01) * b11
m6 = (a10 − a00) * (b00 + b01)
m7 = (a01 − a11) * (b10 + b11)

Strassen's algorithm makes seven multiplications and 18
additions/subtractions, whereas the brute-force algorithm requires eight
multiplications and four additions.

The same concept can be extended to multiply two n-by-n matrices A and
B by dividing them into four n/2-by-n/2 submatrices, as in the above
formulas.
Apply the above formulas recursively to the submatrices obtained, until
the matrices reduce to order 2. When the matrices reduce to order 2,
their product can be found directly by the formulas discussed above.
Strassen's algorithm requires that the order of the matrices be an exact
power of 2; if it is not, the matrices can be padded with zeros.

Analysis:
Input size: Order of the matrix, n.
Basic operation: Multiplication operation.
Let C (n) be the number of times multiplication operation is executed.
C(n) is given by the following recurrence:
C(n) = 7 C(n/2)      if n > 1
C(1) = 1             if n = 1

Solving the recurrence,
Let n = 2^k, therefore k = log2 n
C(2^k) = 7 C(2^(k−1))
       = 7 (7 C(2^(k−2))) = 7^2 C(2^(k−2))
       = 7^3 C(2^(k−3))
       = . . .
       = 7^i C(2^(k−i))
When i = k,
       = 7^k C(1)
       = 7^k
       = 7^(log2 n)
       = n^(log2 7)        (because a^(logc b) = b^(logc a))
       ≈ n^2.807
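A minimal Python sketch of the seven-product formulas for the 2-by-2
base case (the recursive extension to larger matrices is omitted);
names are ours:

def strassen_2x2(A, B):
    """Multiply two 2-by-2 matrices with 7 multiplications (Strassen)."""
    (a00, a01), (a10, a11) = A
    (b00, b01), (b10, b11) = B
    m1 = (a00 + a11) * (b00 + b11)
    m2 = (a10 + a11) * b00
    m3 = a00 * (b01 - b11)
    m4 = a11 * (b10 - b00)
    m5 = (a00 + a01) * b11
    m6 = (a10 - a00) * (b00 + b01)
    m7 = (a01 - a11) * (b10 + b11)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4,           m1 + m3 - m2 + m6]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# [[19, 22], [43, 50]], the same as ordinary matrix multiplication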

4.6 Closest-Pair and Convex-Hull Problem


Closest-Pair Problem by Divide-and-Conquer
Step 1: Divide the points given into two subsets S1 and S2 by a
vertical line x = c so that half the points lie to the left or on the line
and half the points lie to the right or on the line.

Step 2: Find recursively the closest pairs for the left and right subsets.
Step 3: Set d = min{d1, d2}
We can limit our attention to the points in the symmetric vertical strip
of width 2d around the dividing line as candidates for a closer pair.
Let C1 and C2 be the subsets of points of the left subset S1 and of the
right subset S2, respectively, that lie in this vertical strip. The
points in C1 and C2 are stored in increasing order of their y
coordinates, which is maintained by merging during the execution of the
next step.
Step 4: For every point P(x, y) in C1, we inspect the points in C2 that
may be closer to P than d. There can be no more than 6 such points
(because any pair of points within C2 is at least d apart).
The worst-case scenario is depicted in the original figure.

Analysis:
Input size: Number of points, n.
Basic operation: Computing the squared distance in the expression
d ← sqrt((xi − xj)^2 + (yi − yj)^2).
Let C(n) be the number of times this operation is performed.
C(n) is given by the following recurrence:
C(n) = 2 C(n/2) + Cmerge(n)
In the worst case Cmerge(n) ∈ O(n).
Therefore, by the Master theorem (with a = 2, b = 2, d = 1),
C(n) ∈ O(n log n)
Convex-Hull Algorithm
Convex hull: smallest convex set that includes given points
Assume points are sorted by x-coordinate values
Identify extreme points P1 and P2 (leftmost and rightmost)
Compute the upper hull recursively:
- Find the point Pmax that is farthest away from the line P1P2
- Compute the upper hull of the points to the left of the line P1Pmax
- Compute the upper hull of the points to the left of the line PmaxP2
Compute the lower hull in a similar manner.

Analysis:
Finding the point farthest away from the line P1P2 can be done in
linear time.
Time efficiency:
Worst case: Θ(n^2) (as quicksort)
Average case: Θ(n) (under reasonable assumptions about the
distribution of the given points)
If the points are not initially sorted by x-coordinate value, this can
be accomplished in O(n log n) time.
Several O(n log n) algorithms for the convex hull are known.

5. Decrease and Conquer


It is also referred to as inductive or incremental approach.
It has three major steps:
Reduce problem instance into smaller instance of the same problem
Solve smaller instance

Extend solution of smaller instance to obtain solution to original


instance
There are three major variations:
Decrease by constant:
o The size of an instance is reduced by the same constant on each
iteration.
o Typically this constant is equal to one.
o Examples: Insertion sort, graph search algorithms like DFS and
BFS, topological sorting, algorithms for generating permutations
and subsets.
Decrease by a constant factor:
o The size of an instance is reduced by the same common factor on
each iteration.
o Examples: Binary search, Fake-coin problem, Josephus problem
Variable-size decrease
o The size reduction pattern varies from one iteration to another.
o Examples: Euclids algorithm, Selection by partition.

5.1 Insertion Sort

Explanation:
Assume that the array A[0..n − 2] is sorted, giving us a sorted array
of size n − 1. Find an appropriate position for A[n − 1] among the
sorted elements and insert it.
insert it.
It can be done in three ways:
Scan the sorted subarray from left to right until the first element
greater than or equal to A[n-1] is encountered and then insert A[n-1]
before that element.
Straight insertion sort / Insertion sort: Scan the sorted subarray from
right to left until the first element less than or equal to A[n-1] is
encountered and then insert A[n-1] after that element.
Binary insertion sort: Use binary search to find an appropriate position
for A[n-1] in the sorted portion of the array.
Example:

Sort 6, 4, 1, 8, 5
6|4 1 8 5
4 6|1 8 5
1 4 6|8 5

1 4 6 8|5
1 4 5 6 8
Analysis:
Input size: Number of elements in the array, n.
Basic operation: Key comparison operation, A[j] > v.
Let C (n) be the number of times the key comparison operation is
executed.
Best case:
It occurs when the elements in the list are already sorted.
C(n) = Σ_{i=1}^{n−1} 1 = n − 1 ∈ Θ(n)

Worst case:
It occurs when the elements in the list are sorted in reverse order.
C(n) = Σ_{i=1}^{n−1} Σ_{j=0}^{i−1} 1
     = Σ_{i=1}^{n−1} i
     = (n − 1) + (n − 2) + . . . + 1 = n(n − 1)/2
     = (n^2 − n)/2 ∈ Θ(n^2)

Average case:
In the average case C(n) ≈ n^2/4 ∈ Θ(n^2).
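A minimal Python sketch of straight insertion sort as described above;
names are ours:

def insertion_sort(a):
    """Straight insertion sort: scan the sorted part right to left."""
    for i in range(1, len(a)):
        v, j = a[i], i - 1
        while j >= 0 and a[j] > v:    # the basic operation A[j] > v
            a[j + 1] = a[j]           # shift larger elements right
            j -= 1
        a[j + 1] = v                  # insert A[i] into its place
    return a

print(insertion_sort([6, 4, 1, 8, 5]))   # [1, 4, 5, 6, 8]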

5.2 Depth-First Search and Breadth-First Search


Many problems require processing all graph vertices in systematic
fashion
Graph traversal algorithms:
Depth-First search
Breadth-First search

Depth-First Search
DFS Traversal:
It selects and visits a starting vertex v.
A vertex w adjacent to v and which is not visited earlier is explored,
leaving the rest of the vertices, which are adjacent to v unexplored.
When the vertex w is completely explored, the next vertex adjacent to v
is explored.
The above process is repeated until all the vertices are explored.
It visits the vertices of a graph recursively using a stack.
A vertex is pushed onto the stack when it is reached for the first time
A vertex is popped off the stack when it becomes a dead end, i.e.,
when there is no adjacent unvisited vertex

DFS Forest:

It is constructed from the depth first search traversal of a graph.


The traversal's starting vertex serves as the root of the first tree in
such a forest.
Whenever a new unvisited vertex is reached for the first time, it is
attached as a child to the vertex it is being reached from with an edge
called a tree edge.
If the graph has an edge that leads to a previously visited vertex other
than its immediate predecessor, it is called a back edge.

Example undirected graph

Solution
DFS tree:

Red edges are tree edges and Black


edges are back edges

DFS traversal stack considering a as source:


a
ab
abf
abfe
abf
ab
abg
abgc
abgcd

abgcdh
abgcd

Analysis: Adjacency Matrix
Input size: order of the matrix, n × n, i.e., n^2.
Basic operation: the comparison "if w is marked with 0".
Let C(n) be the number of times the comparison operation is executed.
C(n) = Σ_{v=1}^{n} Σ_{w=1}^{n} 1 = Σ_{v=1}^{n} n = n · n
C(n) ∈ Θ(n^2)
Analysis: Adjacency Linked List
Input size: Number of vertices |V| and edges |E|.
Basic operation: the comparison "if w is marked with 0".
Let C(n) be the number of times the comparison operation is executed.
Scanning the list of vertices v in V is in Θ(|V|).
The number of times the comparison operation is performed depends on
the recursive calls dfs(w), which is proportional to the number of
edges, i.e. Θ(|E|).
C(n) ∈ Θ(|V| + |E|)

Applications of DFS:
Checking connectivity, finding connected components.
Checking acyclicity.
Searching state-space of problems for solution.

Breadth-First Search
BFS Traversal:
Select and visit a starting vertex v
Visit all the vertices that are adjacent to v then all unvisited vertices two
edges apart from v and so on until all the vertices are visited.
It visits the vertices of a graph using a Queue.
A vertex is queued when it is reached for the first time.
A vertex is removed from the queue when all the vertices adjacent
to it have been visited.
BFS Forest:
It is constructed from the breadth first search traversal of a graph.
The traversal's starting vertex serves as the root of the first tree in
such a forest.
Whenever a new unvisited vertex is reached for the first time, it is
attached as a child to the vertex it is being reached from with an edge
called a tree edge.

If the graph has an edge leading to a previously visited vertex other than
its immediate predecessor, it is called a cross edge.

Example undirected graph

Solution
BFS tree:

Red edges are tree edges and Black edges are


cross edges

BFS traversal queue considering a as source:


a
bef
efg
fg
g
ch
hd
d

Analysis: Adjacency Matrix
Input size: order of the matrix, n × n, i.e., n^2.
Basic operation: the comparison "if w is marked with 0".
Let C(n) be the number of times the comparison operation is executed.
C(n) = Σ_{v=1}^{n} Σ_{w=1}^{n} 1 = Σ_{v=1}^{n} n = n · n
C(n) ∈ Θ(n^2)
Analysis: Adjacency Linked List
Input size: Number of vertices |V| and edges |E|.
Basic operation: the comparison "if w is marked with 0".
Let C(n) be the number of times the comparison operation is executed.
Scanning the list of vertices v in V is in Θ(|V|).
The number of times the comparison operation is performed depends on
processing the adjacency list of each dequeued vertex, which is
proportional to the number of edges, i.e. Θ(|E|).
C(n) = Θ(|V| + |E|)

Applications of BFS:

To find paths from a vertex to all other vertices with the smallest number
of edges
Connectivity
Acyclicity
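Minimal Python sketches of both traversals follow. The adjacency-list
graph g is our reconstruction of the example graph from the traces
above, since the original figure is not reproduced here:

from collections import deque

def dfs(graph, v, visited=None, order=None):
    """Recursive depth-first traversal (adjacency-list dict)."""
    if visited is None:
        visited, order = set(), []
    visited.add(v); order.append(v)
    for w in graph[v]:
        if w not in visited:          # 'w is marked with 0'
            dfs(graph, w, visited, order)
    return order

def bfs(graph, v):
    """Breadth-first traversal using a queue."""
    visited, order, queue = {v}, [v], deque([v])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in visited:
                visited.add(w); order.append(w); queue.append(w)
    return order

g = {"a": ["b", "e", "f"], "b": ["a", "f", "g"], "e": ["a", "f"],
     "f": ["a", "b", "e"], "g": ["b", "c", "h"], "c": ["g", "d"],
     "h": ["g", "d"], "d": ["c", "h"]}
print(dfs(g, "a"))   # ['a', 'b', 'f', 'e', 'g', 'c', 'd', 'h']
print(bfs(g, "a"))   # ['a', 'b', 'e', 'f', 'g', 'c', 'h', 'd']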

5.3 Topological Sorting


List the vertices of a directed graph in such an order that for every edge
in the graph, the vertex where the edge starts is listed before the vertex
where the edge ends.
The necessary and sufficient condition for topological sorting to be
possible is that the graph must be a directed acyclic graph (DAG).

DAG
Topological sorting Algorithms
1. DFS-based algorithm:
2. Source removal algorithm

DFS-based algorithm for topological sorting

Not a DAG

Perform DFS traversal, noting the order in which the vertices are popped
off the stack.
Reverse the order of the vertices obtained in the above step.
Example:

Order in which vertices are popped off the stack: f, g, b, e, a, h, d, c


Topological order: c, d, h, a, e, b, g, f

Efficiency: The same as that of DFS.

Source removal algorithm


Identify a source which is a vertex with no incoming edges and delete
it along with all the edges outgoing from it.
Repeat the above step until all vertices are deleted.

Example:
a has no incoming edges → delete a
b has no incoming edges → delete b
e has no incoming edges → delete e
f has no incoming edges → delete f
c has no incoming edges → delete c
d has no incoming edges → delete d
g has no incoming edges → delete g
h has no incoming edges → delete h

Topological order is: a, b, e, f, c, d, g, h
Note: While selecting a node for deletion, any node which doesn't have
an incoming edge can be selected. The above sequence is just one such
order; there can be other topological orderings of the vertices,
depending on which node is selected for deletion at each step.
Efficiency: The same as that of DFS-based algorithm
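A minimal Python sketch of the source-removal algorithm; the demo DAG
is hypothetical, not the graph of the original figure:

def topological_sort(graph):
    """Source-removal algorithm for a DAG given as an adjacency dict."""
    indegree = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            indegree[w] += 1
    sources = [v for v in graph if indegree[v] == 0]
    order = []
    while sources:
        v = sources.pop()             # any source may be chosen
        order.append(v)
        for w in graph[v]:            # 'delete' v's outgoing edges
            indegree[w] -= 1
            if indegree[w] == 0:
                sources.append(w)
    if len(order) != len(graph):
        raise ValueError("the graph is not a DAG")
    return order

dag = {"a": ["c"], "b": ["c"], "c": ["d"], "d": ["g"],
       "e": ["f"], "f": ["c"], "g": ["h"], "h": []}
print(topological_sort(dag))   # one valid order, e.g. e f b a c d g h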

6. Transform and Conquer
Transform and conquer works in two stages:
Transformation stage: The problems instance is modified to
another simpler form which will be easier to solve
Conquer stage: The problems instance is solved
There are 3 major variations:
Instance simplification: transforms the problem to a simpler or
more convenient instance of the same problem.
Representation change: transforms a problem to a different
representation of the same instance.
Problem reduction: transforms a problem to a different problem
for which an algorithm is already available.

Problem's instance → simpler instance / another representation /
another problem's instance → solution

6.1 Instance simplification - Presorting


Many problems involving lists are easier when list is sorted.
Searching
Computing the median (selection problem)
Checking if all elements are distinct (element uniqueness)
Topological sorting helps solving some problems for dags.
Presorting is used in many geometric algorithms.

Searching with presorting

Problem: Search for a given key K in A[0..n − 1].
Presorting-based algorithm:
Stage 1: Sort the array by an efficient sorting algorithm.
Stage 2: Apply binary search.
Efficiency: Θ(n log n) + Θ(log n) = Θ(n log n).
For a single search this is inferior to sequential search, which is
linear; presorting pays off when many searches are performed on the
same array.

Example 1: Element Uniqueness with presorting in an array


Problem: Check whether all the elements in an array A[0..n-1] are
unique or not
Presorting-based algorithm:
Stage 1: Sort the array by an efficient sorting algorithm
Stage 2: Scan array to check pairs of adjacent elements and return
true if no two adjacent elements are equal, otherwise return
false.
ALGORITHM: PresortElementUniqueness(A[0..n − 1])
// Solves the element uniqueness problem by sorting the array first
// Input: An array A[0..n − 1] of orderable elements
// Output: Returns true if A has no equal elements, false otherwise
sort the array A
for i ← 0 to n − 2 do
    if A[i] = A[i + 1] return false
return true

Analysis:
Efficiency of the entire presorting-based algorithm:
C(n) = Csort(n) + Cscan(n) ∈ Θ(n log n) + Θ(n) = Θ(n log n)
This is better than the brute-force algorithm, which belongs to Θ(n^2).

Example 2: Computing a mode


Problem: Compute the mode of a given list of numbers, which is the
value that occurs most often in the list.
Presorting-based algorithm:
Stage 1: Sort the array by an efficient sorting algorithm
Stage 2: Scan the array and find the longest run of adjacent equal
values in the sorted array.
ALGORITHM: PresortMode(A[0..n − 1])
// Computes the mode of an array by sorting the array first
// Input: An array A[0..n − 1] of orderable elements
// Output: The array's mode
sort the array A
i ← 0
modefrequency ← 0
while i ≤ n − 1 do
    runlength ← 1
    runvalue ← A[i]
    while i + runlength ≤ n − 1 and A[i + runlength] = runvalue do
        runlength ← runlength + 1
    if runlength > modefrequency
        modefrequency ← runlength
        modevalue ← runvalue
    i ← i + runlength
return modevalue

Analysis:
Efficiency of the entire presorting-based algorithm:
C(n) = Csort(n) + Cscan(n) ∈ Θ(n log n) + Θ(n) = Θ(n log n)
This is better than the brute-force algorithm, which belongs to Θ(n^2).
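A minimal Python sketch of PresortMode; names are ours:

def presort_mode(a):
    """Mode by presorting: sort, then find the longest run."""
    s = sorted(a)                     # Theta(n log n) stage
    i, mode_value, mode_frequency = 0, None, 0
    while i < len(s):                 # Theta(n) scan stage
        run_length, run_value = 1, s[i]
        while i + run_length < len(s) and s[i + run_length] == run_value:
            run_length += 1
        if run_length > mode_frequency:
            mode_frequency, mode_value = run_length, run_value
        i += run_length
    return mode_value

print(presort_mode([5, 1, 5, 7, 6, 5, 7]))   # 5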

6.2 Horners Rule


Given a polynomial of degree n
p(x) = a_n x^n + a_(n−1) x^(n−1) + . . . + a_1 x + a_0
and a specific value of x, find the value of p at that point.
Two brute-force algorithms:
Algorithm 1:
p ← 0
for i ← n downto 0 do
    power ← 1
    for j ← 1 to i do
        power ← power * x
    p ← p + a_i * power
return p
Algorithm 2:
p ← a_0; power ← 1
for i ← 1 to n do
    power ← power * x
    p ← p + a_i * power
return p
Example:
Evaluate p(x) = 2x4 - x3 + 3x2 + x - 5
= x (2x3 x2 + 3x + 1) - 5
= x (x (2x2 - x + 3) + 1) - 5
= x (x (x (2x - 1) + 3) + 1) - 5
Substitution into the last formula leads to a faster algorithm
Same sequences of computations are obtained by simply arranging the
coefficient in a table and proceeding as follows:
Coefficients:   2       -1           3            1            -5
x = 3:          2   3·2+(-1)=5   3·5+3=18   3·18+1=55   3·55+(-5)=160

Thus p(3) = 160.

Analysis:
Input size: Degree of the polynomial, n.
Basic operation: One multiplication and one addition per iteration.
Let C(n) be the number of times the multiplication (and addition) is
performed:
C(n) = Σ_{i=0}^{n−1} 1 = n ∈ Θ(n)
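A minimal Python sketch of Horner's rule; names are ours:

def horner(coeffs, x):
    """Evaluate a polynomial given its coefficients in order
    a_n, a_(n-1), ..., a_0 (highest degree first)."""
    p = coeffs[0]
    for a in coeffs[1:]:
        p = p * x + a        # one multiplication and one addition per step
    return p

# p(x) = 2x^4 - x^3 + 3x^2 + x - 5 at x = 3
print(horner([2, -1, 3, 1, -5], 3))   # 160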

6.3 Binary Exponentiation


Computing a^n is an essential operation in several important
primality-testing and encryption methods.
We now consider two algorithms for computing a^n. Both exploit the
binary representation of the exponent n, but one of them processes this
binary string left to right whereas the second does it right to left.
Left-to-right binary exponentiation
ALGORITHM: LeftRightBinaryExponentiation(a, b(n))
// Computes a^n by the left-to-right binary exponentiation algorithm
// Input: A number a and a list b(n) of binary digits b_I, . . . , b_0
//        in the binary expansion of a positive integer n
// Output: The value of a^n
product ← a
for i ← I − 1 downto 0 do
    product ← product * product
    if b_i = 1
        product ← product * a
return product
Explanation:
Scan n's binary expansion from left to right, starting the accumulator
at a (which accounts for the leading 1-digit), and for each remaining
digit do the following:
if the current binary digit is 0, square the accumulator;
if the binary digit is 1, square the accumulator and multiply it by a.
Example: Compute a^13 by the left-to-right binary exponentiation
algorithm. Here, n = 13 = 1101 in binary.

Binary digits of 13:     1          1             0               1
Product accumulator:     a      a^2·a = a^3   (a^3)^2 = a^6   (a^6)^2·a = a^13
(computed left to right)

Efficiency: (b − 1) ≤ C(n) ≤ 2(b − 1), where b = ⌊log2 n⌋ + 1

Right-to-left binary exponentiation


ALGORITHM: RightLeftBinaryExponentiation(a, b(n))
// Computes a^n by the right-to-left binary exponentiation algorithm
// Input: A number a and a list b(n) of binary digits b_I, . . . , b_0
//        in the binary expansion of a positive integer n
// Output: The value of a^n
term ← a
if b_0 = 1
    product ← a
else
    product ← 1
for i ← 1 to I do
    term ← term * term
    if b_i = 1
        product ← product * term
return product

Explanation:
Scan n's binary expansion from right to left and compute a^n as the
product of the terms a^(2^i) corresponding to the 1's in this expansion.
Example: Compute a^13 by the right-to-left binary exponentiation
algorithm. Here, n = 13 = 1101 in binary.

Binary digits of 13:     1      1      0      1
Terms a^(2^i):          a^8    a^4    a^2     a
Product accumulator:     a  →  a·a^4 = a^5  →  a^5·a^8 = a^13
(computed right to left)

Efficiency: same as that of left-to-right binary exponentiation
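Minimal Python sketches of both algorithms; names are ours:

def left_to_right_power(a, n):
    """a**n by scanning n's binary digits from the most significant bit."""
    bits = bin(n)[2:]                 # binary expansion, leading bit is 1
    product = a
    for b in bits[1:]:
        product = product * product   # square for every digit
        if b == "1":
            product = product * a     # extra multiply for a 1-digit
    return product

def right_to_left_power(a, n):
    """a**n as the product of terms a**(2^i) for the 1-digits of n."""
    term, product = a, 1
    while n > 0:
        if n % 2 == 1:                # current (rightmost) digit is 1
            product *= term
        term *= term                  # next term a**(2^(i+1))
        n //= 2
    return product

print(left_to_right_power(2, 13), right_to_left_power(2, 13))   # 8192 8192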

7. Space and Time Tradeoff


Two varieties of space-for-time algorithms:
Input enhancement preprocess the input (or its part) to store
some info to be used later in solving the problem
Counting for sorting
String searching algorithms
Prestructuring preprocess the input to make accessing its
elements easier
Hashing

Indexing schemes (e.g., B-trees)


7.1 Sorting by Counting
Comparison Sorting

For each element of a list to be sorted, count the total number of


elements smaller than this element and record the results in a table.

These numbers will indicate the positions of the elements in the


sorted list. e.g., if the count is 10 for some element, it should be in
the 11th position in the sorted list.
ALGORITHM: ComparisonCountingSort(A[0..n − 1])
// Sorts an array by comparison counting
// Input: Array A[0..n − 1] of orderable values
// Output: Array S[0..n − 1] of A's elements sorted in nondecreasing order
for i ← 0 to n − 1 do Count[i] ← 0
for i ← 0 to n − 2 do
    for j ← i + 1 to n − 1 do
        if A[i] < A[j]
            Count[j] ← Count[j] + 1
        else
            Count[i] ← Count[i] + 1
for i ← 0 to n − 1 do S[Count[i]] ← A[i]
return S

Example:
Array A[0..5]:  62 31 84 96 19 47

                    Count[]
Initially           0  0  0  0  0  0
After pass i = 0    3  0  1  1  0  0
After pass i = 1    3  1  2  2  0  1
After pass i = 2    3  1  4  3  0  1
After pass i = 3    3  1  4  5  0  1
After pass i = 4    3  1  4  5  0  2
Final state         3  1  4  5  0  2

Sorted array S[0..5]: 19 31 47 62 84 96
Analysis:
Input size: Number of elements in the array, n.
Basic operation: Key comparison operation, A[i] < A[j].
Let C(n) be the number of times the key comparison operation is
executed.
Since the basic operation is executed within the two nested for loops,
C(n) ∈ Θ(n^2)

Distribution counting
If the element values are integers between some lower bound l and upper
bound u, stored in an array A, then the frequency of each of those
values can be computed and stored in an array F[0..u − l].
The elements of A whose values are equal to the lowest possible value
l are copied into the first F[0] positions of another array S, i.e.,
positions 0 through F[0] − 1; the elements of value l + 1 are copied
into positions F[0] through (F[0] + F[1]) − 1, and so on.
Since the accumulated sums of frequencies are called a distribution in
statistics, the method is known as distribution counting.
ALGORITHM: DistributionCounting(A[0..n − 1])
// Sorts an array of integers from a limited range by distribution counting
// Input: Array A[0..n − 1] of integers between l and u (l ≤ u)
// Output: Array S[0..n − 1] of A's elements sorted in nondecreasing order
for j ← 0 to u − l do D[j] ← 0                       // initialize
for i ← 0 to n − 1 do D[A[i] − l] ← D[A[i] − l] + 1  // frequencies
for j ← 1 to u − l do D[j] ← D[j − 1] + D[j]         // distribution
for i ← n − 1 downto 0 do
    j ← A[i] − l
    S[D[j] − 1] ← A[i]
    D[j] ← D[j] − 1
return S

Example:
Array A[0..5]: 13 11 12 13 12 12

Array values:          11   12   13
Frequencies:            1    3    2
Distribution values:    1    4    6

            D[0..2]
A[5] = 12   1 4 6  →  S[3] = 12, D[1] = 3
A[4] = 12   1 3 6  →  S[2] = 12, D[1] = 2
A[3] = 13   1 2 6  →  S[5] = 13, D[2] = 5
A[2] = 12   1 2 5  →  S[1] = 12, D[1] = 1
A[1] = 11   1 1 5  →  S[0] = 11, D[0] = 0
A[0] = 13   0 1 5  →  S[4] = 13, D[2] = 4

Sorted array S[0..5]: 11 12 12 12 13 13

Analysis:
It is linear, Θ(n), because it makes just two consecutive passes
through its input array A.
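A minimal Python sketch of distribution counting; names are ours:

def distribution_counting_sort(a, low, high):
    """Sort integers in the range [low, high] in Theta(n) time."""
    d = [0] * (high - low + 1)
    for x in a:                   # compute frequencies
        d[x - low] += 1
    for j in range(1, len(d)):    # turn them into distribution values
        d[j] += d[j - 1]
    s = [0] * len(a)
    for x in reversed(a):         # place elements right to left (stable)
        d[x - low] -= 1
        s[d[x - low]] = x
    return s

print(distribution_counting_sort([13, 11, 12, 13, 12, 12], 11, 13))
# [11, 12, 12, 12, 13, 13]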

7.2 Input Enhancement in String Matching


Horspools Algorithm

A simplified version of Boyer-Moore algorithm:


Preprocesses pattern to generate a shift table that determines how
much to shift the pattern when a mismatch occurs
Always makes a shift based on the texts character c aligned with the
last compared (mismatched) character in the pattern according to the
shift tables entry for c
It compares the corresponding pairs of the characters in the pattern
and the text, from right to left.
If all the patterns characters match successfully, a matching substring
is found.
If a mismatch occurs shift the pattern by looking at the character c of
the text that was aligned against the last character of the pattern. In
general four possibilities can occur.
Case 1: if there are no cs in the pattern, then shift the pattern by its
entire length.
..c......................
BAOBAB
BAOBAB
Case 2: if there are occurrences of character c in the pattern but it
is not the last one, then align the rightmost occurrence of c in the
pattern with the c in the text.
......O......................
BAOBAB
BAOBAB
Case 3: if c happens to be the last character in the pattern but there
are no cs among its other m 1 characters, then shift the pattern
by its entire length.
...XYB........................
AAOAAB

AAOAAB
Case 4: if c happens to be the last character in the pattern and there
are other cs among its first m 1 characters, align the rightmost
occurrence of c among the first m 1 characters in the pattern with
the texts c.
............B................
BAOBAB
BAOBAB
Shift sizes can be precomputed by the formula:
t(c) = the distance from c's rightmost occurrence among the pattern's
       first m − 1 characters to the pattern's right end, if c occurs
       there;
t(c) = the pattern's length m, otherwise.

Eg: Shift table for the pattern BAOBAB (with _ standing for the space
character):

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z _
1 2 6 6 6 6 6 6 6 6 6 6 6 6 3 6 6 6 6 6 6 6 6 6 6 6 6

Example of Horspool's algorithm: search for BAOBAB, and then for
BANANA, in the text BARD LOVED BANANAS (shift tables as above).

BARD LOVED BANANAS
BAOBAB                   mismatch at L; shift t(L) = 6
      BAOBAB             B matches, then mismatch at the space; shift t(B) = 2
        BAOBAB           mismatch at N; shift t(N) = 6 runs past the end
Search result: unsuccessful search

BARD LOVED BANANAS
BANANA                   mismatch at L; shift t(L) = 6
     BANANA              mismatch at B; shift t(B) = 5
          BANANA         all characters match
Search result: successful search (at position 11)

ALGORITHM ShiftTable(P[0..m − 1])
//Fills the shift table used by Horspool's and Boyer-Moore algorithms
//Input: Pattern P[0..m − 1] and an alphabet of possible characters
//Output: Table[0..size − 1] indexed by the alphabet's characters and
//        filled with shift sizes computed by the formula above
initialize all the elements of Table with m
for j ← 0 to m − 2 do
    Table[P[j]] ← m − 1 − j
return Table

ALGORITHM HorspoolMatching(P[0..m − 1], T[0..n − 1])
//Implements Horspool's algorithm for string matching
//Input: Pattern P[0..m − 1] and text T[0..n − 1]
//Output: The index of the left end of the first matching substring,
//        or −1 if there are no matches
ShiftTable(P[0..m − 1])
i ← m − 1
while i ≤ n − 1 do
    k ← 0
    while k ≤ m − 1 and P[m − 1 − k] = T[i − k] do
        k ← k + 1
    if k = m
        return i − m + 1
    else
        i ← i + Table[T[i]]
return −1
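A minimal Python sketch of the two routines above, using a dictionary as
the shift table so that any character absent from the pattern gets the
default shift m; names are ours:

def shift_table(pattern):
    """t(c) for characters among the first m-1 pattern characters."""
    m = len(pattern)
    table = {}
    for j in range(m - 1):
        table[pattern[j]] = m - 1 - j
    return table

def horspool(pattern, text):
    """Return the left end of the first match, or -1."""
    m, n = len(pattern), len(text)
    table = shift_table(pattern)
    i = m - 1
    while i <= n - 1:
        k = 0
        while k <= m - 1 and pattern[m - 1 - k] == text[i - k]:
            k += 1
        if k == m:
            return i - m + 1
        i += table.get(text[i], m)    # default shift m for absent characters
    return -1

print(horspool("BAOBAB", "BARD LOVED BANANAS"))   # -1
print(horspool("BANANA", "BARD LOVED BANANAS"))   # 11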

Boyer-Moore Algorithm
Based on the same two ideas:
Comparing pattern characters to text from right to left
precomputing shift sizes in two tables
bad-symbol table indicates how much to shift based on texts
character causing a mismatch
good-suffix table indicates how much to shift based on matched
part (suffix) of the pattern (taking advantage of the periodic
structure of the pattern)

Bad-symbol shift in Boyer-Moore algorithm


If the rightmost character of the pattern doesn't match, the BM
algorithm acts as Horspool's.
If the rightmost character of the pattern does match, BM compares the
preceding characters right to left until either all of the pattern's
characters match or a mismatch on the text's character c is encountered
after k > 0 matches.
Good-suffix shift in Boyer-Moore algorithm
The good-suffix shift d2 is applied after 0 < k < m last characters
were matched.
d2(k) = the distance between (the last letter of) the matched suffix of
size k and (the last letter of) its rightmost occurrence in the pattern
that is not preceded by the same character as the suffix.
Example: CABABA, d2(1) = 4.
If there is no such occurrence, match the longest part (tail) of the
k-character suffix with a corresponding prefix; if there are no such
suffix-prefix matches, d2(k) = m.
Example: WOWWOW, d2(2) = 5, d2(3) = 3, d2(4) = 3, d2(5) = 3.

Boyer-Moore Algorithm
After successfully matching 0 < k < m characters, the algorithm shifts
the pattern right by
d = max {d1, d2}
where d1 = max{t1(c) − k, 1} is the bad-symbol shift and d2(k) is the
good-suffix shift.
Step 1: Fill in the bad-symbol shift table.
Step 2: Fill in the good-suffix shift table.
Step 3: Align the pattern against the beginning of the text.
Step 4: Repeat until a matching substring is found or the text ends:
Compare the corresponding characters right to left.
If no characters match, retrieve the entry t1(c) from the bad-symbol
table for the text's character c causing the mismatch and shift the
pattern to the right by t1(c). If 0 < k < m characters are matched,
retrieve the entry t1(c) from the bad-symbol table for the text's
character c causing the mismatch and the entry d2(k) from the
good-suffix table, and shift the pattern to the right by
d = max{d1, d2}, where d1 = max{t1(c) − k, 1}.
Example:
Bad-symbol table for BAOBAB (with _ for the space character):
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z _
1 2 6 6 6 6 6 6 6 6 6 6 6 6 3 6 6 6 6 6 6 6 6 6 6 6 6

Good-suffix table for BAOBAB:
k    pattern    d2
1    BAOBAB     2
2    BAOBAB     5
3    BAOBAB     5
4    BAOBAB     5
5    BAOBAB     5

BESS_KNEW_ABOUT_BAOBABS
BAOBAB                        d1 = t1(K) = 6
      BAOBAB                  d1 = t1(_) − 2 = 4, d2(2) = 5, shift d = 5
           BAOBAB             d1 = t1(_) − 1 = 5, d2(1) = 2, shift d = 5
                BAOBAB        (success)

8. Dynamic Programming
Dynamic Programming is a general algorithm design technique for solving
problems defined by or formulated as recurrences with overlapping
subinstances
Some of the recursive algorithms solve common sub problems more
than once. Rather than solving the overlapping subproblems again and
again, dynamic programming supports solving each of the smaller
subproblems only once and recording the results in a table from which
the solution to the original problem can be obtained.
Dynamic programming, like divide-and-conquer method, solves
problems by combining solutions to sub problems.
In divide and conquer we partition the problem into independent
subproblems, solve the subproblems recursively and then combine
their solutions to solve the original problem.
In contrast, dynamic programming is applicable when subproblems
are not independent, that is when subproblems share subproblems.
Divide and conquer method does more work, by repeatedly solving
the common subproblems.
Dynamic programming algorithm solves every sub problem just once
and then saves the answer in a table there by avoiding the work of
recomputing answer every time the subproblem is encountered.
Dynamic programming is applied to optimization problems.

The main idea is

Set up a recurrence relating a solution to a larger instance to


solutions of some smaller instances
Solve smaller instances once
Record solutions in a table
Extract solution to the initial instance from that table

8.1 Computing Binomial Coefficient


It is the number of combinations or subsets of k elements from an
n-element set (0 ≤ k ≤ n).
It is denoted as C(n, k).
Properties of Binomial Coefficients:
Binomial coefficients are the coefficients of the binomial formula:
(a + b)^n = C(n, 0) a^n b^0 + . . . + C(n, k) a^(n−k) b^k + . . . + C(n, n) a^0 b^n
The value of a binomial coefficient can be computed using the following
recurrence:
C(n, k) = C(n − 1, k) + C(n − 1, k − 1)   for n > k > 0
C(n, 0) = 1, C(n, n) = 1                  for n ≥ 0

The value of C(n, k) can be computed easily by filling a table in which
the rows represent the number of elements, n, and the columns represent
the number of combinations, k:

        0     1    . . .   k−1           k
0       1
1       1     1
.
.
n−1     1                  C(n−1,k−1)    C(n−1,k)
n       1                                C(n,k)

Using the recurrence defined above, the table entries are filled row by
row.
Example: C(9, 5)

n \ k    0    1    2    3    4    5
0        1
1        1    1
2        1    2    1
3        1    3    3    1
4        1    4    6    4    1
5        1    5   10   10    5    1
6        1    6   15   20   15    6
7        1    7   21   35   35   21
8        1    8   28   56   70   56
9        1    9   36   84  126  126

C(9, 5) = 126

Analysis:
Input size: Number of elements, n, and the number of combinations, k.
Basic operation: Addition operation C[i − 1, j − 1] + C[i − 1, j].
Let C(n) be the number of times the addition operation is performed:
C(n) = Σ_{i=1}^{k} Σ_{j=1}^{i−1} 1 + Σ_{i=k+1}^{n} Σ_{j=1}^{k} 1
     = Σ_{i=1}^{k} (i − 1) + Σ_{i=k+1}^{n} k
     = [0 + 1 + 2 + . . . + (k − 1)] + k(n − k)
     = k(k − 1)/2 + k(n − k) ∈ Θ(nk)
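A minimal Python sketch of the bottom-up table computation; names are
ours:

def binomial(n, k):
    """C(n, k) by filling the table of the recurrence row by row."""
    c = [[0] * (k + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(min(i, k) + 1):
            if j == 0 or j == i:
                c[i][j] = 1                              # boundary values
            else:
                c[i][j] = c[i - 1][j - 1] + c[i - 1][j]  # the recurrence
    return c[n][k]

print(binomial(9, 5))   # 126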

8.2 Warshalls Algorithm


It computes the transitive closure of a directed graph
The transitive closure of a directed graph with n vertices is defined
as an n × n Boolean matrix T = {t_ij} in which the element in the ith
row (1 ≤ i ≤ n) and the jth column (1 ≤ j ≤ n) is 1 if there exists a
directed path of positive length from the ith vertex to the jth vertex,
and t_ij = 0 otherwise.

Explanation:
It constructs the transitive closure, R(n), through a series of n-by-n
boolean matrices, R(0), , R(k-1), R(k), R(n)
All the elements of each matrix R(k) can be computed from its
immediate predecessor R(k-1).
The elements of rij (k) in the ith row and jth column of matrix R(k)
(k = 0,1,2..n) is equal to 1 if and only if there exists a directed
path from the ith vertex to the jth vertex with each intermediate vertex,
if any, numbered not higher than k.
The formula for generating the elements of matrix R(k) from the
elements of the matrix R(k−1) is
r_ij(k) = r_ij(k−1) or (r_ik(k−1) and r_kj(k−1))
Example: Consider the digraph with vertices a, b, c, d and edges a→b,
b→d, d→a and d→c.

Adjacency matrix:
            a  b  c  d
        a   0  1  0  0
A =     b   0  0  0  1
        c   0  0  0  0
        d   1  0  1  0

R(0) = A

R(1) (intermediate vertices numbered not higher than a): the new path
d→a→b gives r_db = 1.
        0  1  0  0
        0  0  0  1
        0  0  0  0
        1  1  1  0

R(2) (not higher than b): the new paths a→b→d and d→b→d give
r_ad = 1 and r_dd = 1.
        0  1  0  1
        0  0  0  1
        0  0  0  0
        1  1  1  1

R(3) (not higher than c): unchanged, since c has no outgoing edges.

R(4) = T, the transitive closure: every vertex that reaches d also
reaches a, b and c.
        1  1  1  1
        1  1  1  1
        0  0  0  0
        1  1  1  1

Analysis:
Input size: order of the matrix, n.
Basic operation: R(k)[i, j] ← R(k−1)[i, j] or (R(k−1)[i, k] and R(k−1)[k, j])
Let C(n) be the number of times the above operation is performed:
C(n) = Σ_{k=1}^{n} Σ_{i=1}^{n} Σ_{j=1}^{n} 1 = Σ_{k=1}^{n} Σ_{i=1}^{n} n = n^3
C(n) ∈ Θ(n^3)
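A minimal Python sketch of Warshall's algorithm, applied to the
adjacency matrix of the example above:

def warshall(adj):
    """Transitive closure of a digraph given by a 0/1 adjacency matrix."""
    n = len(adj)
    r = [row[:] for row in adj]           # R(0) is the adjacency matrix
    for k in range(n):                    # allow vertex k as intermediate
        for i in range(n):
            for j in range(n):
                r[i][j] = r[i][j] or (r[i][k] and r[k][j])
    return r

A = [[0, 1, 0, 0],     # a -> b
     [0, 0, 0, 1],     # b -> d
     [0, 0, 0, 0],
     [1, 0, 1, 0]]     # d -> a, d -> c
for row in warshall(A):
    print(row)
# [1, 1, 1, 1] / [1, 1, 1, 1] / [0, 0, 0, 0] / [1, 1, 1, 1]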

8.3 Floyds Algorithm


It computes the lengths of the shortest paths from each vertex to all
other vertices in a weighted connected graph.

The lengths of the shortest paths are recorded in an n-by-n matrix D,
called the distance matrix, in which the element d_ij in the ith row
and the jth column indicates the length of the shortest path from the
ith vertex to the jth vertex (1 ≤ i, j ≤ n).

It is applicable to both undirected and directed weighted graphs


provided they do not contain a cycle of a negative length.
It constructs the distance matrix, D(n), through a series of n-by-n
matrices, D(0), , D(k-1), D(k), D(n)

All the elements of each matrix D(k) can be computed from its
immediate predecessor D(k−1).
The formula for generating the elements of matrix D(k) from the
elements of the matrix D(k−1) is
d_ij(k) = min {d_ij(k−1), d_ik(k−1) + d_kj(k−1)}   for k ≥ 1, with d_ij(0) = w_ij

Example: Consider the digraph with vertices a, b, c, d and weighted
edges a→b = 10, a→d = 40, b→d = 20, c→a = 50, d→c = 60
(999 stands for infinity, i.e. no edge).

Cost matrix:
              a     b     c     d
        a     0    10   999    40
W =     b   999     0   999    20
        c    50   999     0   999
        d   999   999    60     0

D(0) = W

Computing D(1) (paths allowed to pass through a):
From a: all entries of row a remain unchanged.
From b: all entries of row b remain unchanged (there is no edge b→a).
From c: A(c, b) = min{A(c, b), A(c, a) + A(a, b)} = min{999, 50 + 10} = 60
        A(c, d) = min{A(c, d), A(c, a) + A(a, d)} = min{999, 50 + 40} = 90
From d: all entries of row d remain unchanged (there is no edge d→a).

          0    10   999    40
D(1) =  999     0   999    20
         50    60     0    90
        999   999    60     0

Similarly the following matrices are computed:

          0    10   999    30
D(2) =  999     0   999    20      (a→d improves to 10 + 20 = 30,
         50    60     0    80       c→d improves to 60 + 20 = 80)
        999   999    60     0

          0    10   999    30
D(3) =  999     0   999    20      (d→a = 60 + 50 = 110,
         50    60     0    80       d→b = 60 + 60 = 120)
        110   120    60     0

          0    10    90    30
D(4) =  130     0    80    20      (paths through d: a→c = 30 + 60 = 90,
         50    60     0    80       b→c = 20 + 60 = 80, b→a = 20 + 110 = 130)
        110   120    60     0

Analysis:
Input size: order of the matrix, n.
Basic operation: d_ij(k) = min{d_ij(k−1), d_ik(k−1) + d_kj(k−1)}
Let C(n) be the number of times the above operation is performed:
C(n) = Σ_{k=1}^{n} Σ_{i=1}^{n} Σ_{j=1}^{n} 1 = Σ_{k=1}^{n} Σ_{i=1}^{n} n = n^3
C(n) ∈ Θ(n^3)
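A minimal Python sketch of Floyd's algorithm, applied to the cost
matrix of the example above (with float('inf') in place of 999):

INF = float("inf")

def floyd(w):
    """All-pairs shortest path lengths from a weight matrix."""
    n = len(w)
    d = [row[:] for row in w]             # D(0) is the weight matrix
    for k in range(n):                    # allow vertex k as intermediate
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return d

W = [[0, 10, INF, 40],
     [INF, 0, INF, 20],
     [50, INF, 0, INF],
     [INF, INF, 60, 0]]
for row in floyd(W):
    print(row)
# [0, 10, 90, 30] / [130, 0, 80, 20] / [50, 60, 0, 80] / [110, 120, 60, 0]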

8.4 Optimal Binary Search Trees


Problem: Given n keys a1 < . . . < an and probabilities p1, . . . , pn
of searching for them, find a BST with a minimum average number of
comparisons in a successful search. Since the total number of BSTs with
n nodes is given by C(2n, n)/(n + 1), which grows exponentially, brute
force is hopeless.
Example: An optimal Binary Search Tree (BST) for keys A, B, C, and D
with search probabilities 0.1, 0.2, 0.4, and 0.3, respectively, has
root C, a left subtree containing B (with A as B's left child), and D
as the right child.

Average number of comparisons
= 1·0.4 + 2·(0.2 + 0.3) + 3·0.1
= 1.7

Let C[i, j] be the minimum average number of comparisons made in
T[i, j], an optimal BST for keys ai < . . . < aj, where 1 ≤ i ≤ j ≤ n.
Consider an optimal BST among all BSTs with some ak (i ≤ k ≤ j) as
their root; T[i, j] is the best among them. Its left subtree is an
optimal BST for a_i, . . . , a_(k−1) and its right subtree is an
optimal BST for a_(k+1), . . . , a_j.
C[i, j] = min_{i ≤ k ≤ j} { p_k · 1 + Σ_{s=i}^{k−1} p_s · (level of a_s in T[i, k−1] + 1)
                                    + Σ_{s=k+1}^{j} p_s · (level of a_s in T[k+1, j] + 1) }

After simplifications, we obtain the recurrence for C[i, j]:
C[i, j] = min_{i ≤ k ≤ j} {C[i, k−1] + C[k+1, j]} + Σ_{s=i}^{j} p_s   for 1 ≤ i ≤ j ≤ n
C[i, i] = p_i   for 1 ≤ i ≤ n

The table C[1..n+1, 0..n] is filled diagonal by diagonal, starting with
C[i, i] = p_i on the main diagonal and C[i, i−1] = 0, and ending with
the goal entry C[1, n].

Example:

key
A B C D
probability 0.1 0.2 0.4 0.3

The tables below are filled diagonal by diagonal: the left one is
filled using the recurrence
C[i, j] = min_{i ≤ k ≤ j} {C[i, k−1] + C[k+1, j]} + Σ_{s=i}^{j} p_s ,   C[i, i] = p_i ;
the right one, for the trees' roots, records the value of k giving the
minimum.

Analysis:
Time efficiency: Θ(n^3), but it can be reduced to Θ(n^2) by taking
advantage of the monotonicity of the entries in the root table, i.e.,
R[i, j] is always in the range between R[i, j−1] and R[i+1, j].
Space efficiency: Θ(n^2)
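A minimal Python sketch of the Θ(n^3) table computation for the example
instance; names are ours:

def optimal_bst(p):
    """Minimum average comparisons; p[1..n] are the key probabilities."""
    n = len(p) - 1                        # p[0] is unused padding
    C = [[0.0] * (n + 2) for _ in range(n + 2)]
    R = [[0] * (n + 2) for _ in range(n + 2)]
    for i in range(1, n + 1):
        C[i][i], R[i][i] = p[i], i
    for d in range(1, n):                 # fill diagonal by diagonal
        for i in range(1, n - d + 1):
            j = i + d
            best, root = min((C[i][k - 1] + C[k + 1][j], k)
                             for k in range(i, j + 1))
            C[i][j] = best + sum(p[i:j + 1])
            R[i][j] = root
    return C[1][n], R

cost, roots = optimal_bst([0, 0.1, 0.2, 0.4, 0.3])   # keys A, B, C, D
print(cost)          # 1.7 (up to floating-point rounding)
print(roots[1][4])   # 3, i.e. key C is the root of the optimal tree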

8.5 Knapsack Problem


Given n items of known weights w1, . . . wn and values v1, . . . vn and a
knapsack of capacity W, find the most valuable subset of the items
that fit into the knapsack.

A recurrence relation that expresses a solution to an instance of the


knapsack problem in terms of solutions to its smaller sub instances
can be derived as follows:

o Consider an instance defined by the first i items, 1 ≤ i ≤ n,
with weights w1, . . . , wi and values v1, . . . , vi, and a
knapsack capacity j, 1 ≤ j ≤ W.

o Let V [i, j] be the value of an optimal solution to this instance.


o All the subsets of the first i items that fit the knapsack of
capacity j can be divided into two categories:
1. The subsets that do not include the ith item; among these, the
value of an optimal subset is V[i − 1, j].
2. The subsets that do include the ith item; an optimal such subset
is made up of this item and an optimal subset of the first i − 1
items that fits into the knapsack of capacity j − wi. The value
of such an optimal subset is vi + V[i − 1, j − wi].
o Thus the following recurrence can be obtained:
V[i, j] = max {V[i − 1, j], vi + V[i − 1, j − wi]}   if j − wi ≥ 0
V[i, j] = V[i − 1, j]                                if j − wi < 0
with the initial conditions
V[0, j] = 0 for j ≥ 0 and V[i, 0] = 0 for i ≥ 0.

Each entry V[i, j] is computed from the entry V[i − 1, j] in the row
above it and the entry V[i − 1, j − wi] in the row above and wi columns
to the left.
The goal is to find V[n, W], the maximal value of a subset of the
given items that fits into the knapsack of capacity W, and an optimal
subset itself.

Example:
Item   Weight   Value
1      2        Rs. 12
2      1        Rs. 10
3      3        Rs. 20
4      2        Rs. 15
Knapsack capacity W = 5

V[0, 0] = V[0, 1] = V[0, 2] = V[0, 3] = V[0, 4] = V[0, 5] = 0,
because V[0, j] = 0.
V[1, 0] = V[2, 0] = V[3, 0] = V[4, 0] = 0, because V[i, 0] = 0.

V[1, 1]:
j − wi = 1 − 2 < 0, so
V[1, 1] = V[i − 1, j] = V[0, 1] = 0

V[1, 2]:
j − wi = 2 − 2 = 0, so
V[1, 2] = max{V[i − 1, j], vi + V[i − 1, j − wi]}
        = max{V[0, 2], 12 + V[0, 0]} = max{0, 12 + 0}
        = 12

V[1, 3]:
3 − 2 > 0, so
V[1, 3] = max{V[0, 3], 12 + V[0, 1]} = max{0, 12 + 0}
        = 12

Similarly, V[1, 4] = V[1, 5] = 12

V[2, 1]:
1 − 1 = 0, so
V[2, 1] = max{V[1, 1], 10 + V[1, 0]} = max{0, 10 + 0}
        = 10
V[2, 2]:
2 − 1 > 0, so
V[2, 2] = max{V[1, 2], 10 + V[1, 1]}
        = max{12, 10 + 0}
        = 12
V[2, 3]:
3 − 1 > 0, so
V[2, 3] = max{V[1, 3], 10 + V[1, 2]}
        = max{12, 10 + 12} = 22

The rest of the table entries can be filled similarly. The completely
filled table is shown below:

                      Capacity j
i                 0    1    2    3    4    5
0                 0    0    0    0    0    0
1 (w1=2, v1=12)   0    0   12   12   12   12
2 (w2=1, v2=10)   0   10   12   22   22   22
3 (w3=3, v3=20)   0   10   12   22   30   32
4 (w4=2, v4=15)   0   10   15   25   30   37

Thus, the maximum value is V[4, 5] = Rs. 37


The composition of an optimal subset can be found out by tracing
back the computations of the entries in the table as follows:
Since V [4, 5] V [3, 5], item 4 was included in an optimal
solution. The remaining knapsack capacity is W - w4 = 5 2 =
3. Hence to find the next item go to V [3, 3].
Since V [3, 3] = V [2, 3], item 3 is not a part of an optimal
subset. Hence to find the next item go to V [2, 3].
Since V [2, 3] V [1, 3], item 2 is a part of an optimal solution.
To find the next item go to V [1, 3 - 1].
Since V [1, 2] V [0, 2], item 1 is the final part of the
optimal solution.
Thus the optimal subset consists of the following items
{item 1, item 2, item 4}
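A minimal Python sketch of the bottom-up computation together with the
traceback just described; names are ours:

def knapsack_dp(weights, values, W):
    """Bottom-up table V[i][j] plus traceback of one optimal subset."""
    n = len(weights)
    V = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, W + 1):
            if j - weights[i - 1] >= 0:
                V[i][j] = max(V[i - 1][j],
                              values[i - 1] + V[i - 1][j - weights[i - 1]])
            else:
                V[i][j] = V[i - 1][j]
    items, j = [], W                      # trace back the chosen items
    for i in range(n, 0, -1):
        if V[i][j] != V[i - 1][j]:        # item i is in the optimal subset
            items.append(i)
            j -= weights[i - 1]
    return V[n][W], sorted(items)

# The example above: weights 2, 1, 3, 2; values 12, 10, 20, 15; W = 5
print(knapsack_dp([2, 1, 3, 2], [12, 10, 20, 15], 5))   # (37, [1, 2, 4])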

8.6 Memory Functions


The recurrence solves the problem in a top down approach, but a
dynamic programming approach fills the table bottom up.

The top down approach solves the common subproblems more than
once and hence it is very inefficient.
Even though the bottom up approach solves common subproblems
only once, it solves all the subproblems but some of these smaller
subproblems are often not necessary for getting a solution
Memory functions combine the strengths of the top down and bottom
up approach and solve only subproblems that are necessary only
once.
Initially all the table entries are initialized with a special "null"
symbol (in practice, −1) to indicate that they have not yet been
calculated, by a technique called virtual initialization; the recursive
function is then called with the values i = n and j = W.
The idea of memory functions is implemented in the following
algorithm.

Algorithm: MFKnapsack(i, j)
// Uses as global variables the arrays Weights[1..n], Values[1..n] and
// the table V[0..n, 0..W] whose entries are initialized with −1s (the
// null symbol), except for row 0 and column 0, which are all 0s
if V[i, j] < 0
    if j < Weights[i]
        value ← MFKnapsack(i − 1, j)
    else
        value ← max(MFKnapsack(i − 1, j),
                    Values[i] + MFKnapsack(i − 1, j − Weights[i]))
    V[i, j] ← value
return V[i, j]

Applying the memory function method to the knapsack instance considered
previously, only 11 of the 20 nontrivial entries are ever computed, as
shown below ("-" marks entries that are never needed):

                      Capacity j
i                 0    1    2    3    4    5
0                 0    0    0    0    0    0
1 (w1=2, v1=12)   0    0   12   12   12   12
2 (w2=1, v2=10)   0    -   12   22    -   22
3 (w3=3, v3=20)   0    -    -   22    -   32
4 (w4=2, v4=15)   0    -    -    -    -   37
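A minimal Python sketch of the memory-function scheme, with −1 playing
the role of the null symbol; names are ours:

def mf_knapsack(weights, values, W):
    """Top-down knapsack computing only the needed V[i, j] entries."""
    n = len(weights)
    V = [[0] * (W + 1)] + [[-1] * (W + 1) for _ in range(n)]
    for row in V:
        row[0] = 0                        # capacity 0 -> value 0

    def mf(i, j):
        if V[i][j] < 0:                   # not yet computed
            if j < weights[i - 1]:
                value = mf(i - 1, j)
            else:
                value = max(mf(i - 1, j),
                            values[i - 1] + mf(i - 1, j - weights[i - 1]))
            V[i][j] = value
        return V[i][j]

    return mf(n, W)

print(mf_knapsack([2, 1, 3, 2], [12, 10, 20, 15], 5))   # 37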

9. Greedy Technique
Greedy technique suggests constructing a solution through a sequence of
steps, each expanding a partially constructed solution obtained so far,
until a complete solution to the problem is reached.
It constructs a solution to an optimization problem piece by piece
through a sequence of choices that are:
Feasible, i.e. satisfying the constraints
Locally optimal (with respect to some neighborhood
definition)
Greedy (in terms of some measure), and irrevocable

For some problems, it yields a globally optimal solution for every
instance. For most problems it does not, but it can still be useful for
fast approximations.
Applications of the Greedy technique
Optimal solutions:
o Change making for normal coin denominations
o Minimum spanning tree
o Single source shortest path
o Simple scheduling problems
o Huffman codes
Approximations /Heuristics
o Traveling salesman problem
o Knapsack problem
o Other combinatorial optimizations problem

Minimum Spanning Tree (MST)


A Spanning Tree of a connected graph is its connected acyclic
subgraph that contains all the vertices of the graph
Minimum spanning tree of a weighted, connected graph G is a
spanning tree of G of the minimum total weight
Example: Consider the graph below

The minimum spanning tree for the above graph is as shown in the
original figure.

9.1 Prims Algorithm


It constructs a minimum spanning tree through a sequence of expanding
subtrees.
The initial subtree consists of a single vertex selected arbitrarily from
the set V of graphs vertices.
Start with tree T1 consisting of one (any) vertex and grow tree one
vertex at a time to produce Minimum Spanning Tree through a series
of expanding subtrees T1, T2, , Tn
On each iteration, construct Ti+1 from Ti by adding vertex not in Ti
that is closest to those already in Ti (this is a greedy step!)
Stop when all vertices are included in the tree
Algorithm: Prim(G)
VT ← {v0}
ET ← ∅
for i ← 1 to |V| − 1 do
    find a minimum-weight edge e = (v, u) among all the edges (v, u)
    such that v is in VT and u is in V − VT
    VT ← VT ∪ {u}
    ET ← ET ∪ {e}
return ET
Example: (The cost adjacency matrix of the six-vertex example graph and
the step-by-step contents of its distance and near arrays appear in the
original figure; only fragments are recoverable here.)
Starting from vertex 1, the algorithm keeps, for each vertex u not yet
in the tree, two labels: D(u), the weight of the smallest edge
connecting u to a vertex already in the tree, and N(u), the tree vertex
at the other end of that edge. On each iteration the vertex with the
smallest D label is moved into the tree, and the D and N labels of the
remaining vertices are updated.

Minimum Spanning Tree: as shown in the original figure.


Correctness of Prim's Algorithm:
To prove:
Each of the subtrees Ti, i = 0, 1, . . . , n − 1, generated by Prim's
algorithm is a part of some minimum spanning tree.
Proof:
Basis: A single vertex must be a part of any minimum spanning tree.
Assumption: Let Ti−1 be a part of some minimum spanning tree T.
Induction: We have to prove that Ti, generated from Ti−1, is also a
part of some minimum spanning tree. This is proved by contradiction as
follows:
Assume that no minimum spanning tree of the graph contains Ti.
Let ei = (v, u) be the minimum-weight edge from a vertex in Ti−1
to a vertex not in Ti−1, i.e. the edge added by the algorithm.
By the assumption, ei cannot belong to the minimum spanning tree
T containing Ti−1.
Therefore, if we add ei to T, a cycle must be formed, containing
another edge (v1, u1) connecting a vertex v1 in Ti−1 to a vertex
u1 not in Ti−1.
If we delete the edge (v1, u1) from this cycle, we obtain another
spanning tree of the entire graph whose weight is less than or
equal to the weight of T, since the weight of ei does not exceed
the weight of (v1, u1). Hence this spanning tree is a minimum
spanning tree that contains Ti, which contradicts the assumption
that no minimum spanning tree contains Ti.
Analysis:
O(|V|2) for graphs represented by weight matrix and array
implementation of priority queue.
O(|E|log|V|) for graphs represented by adj. lists and min-heap
implementation of priority queue.
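As a concrete illustration of the min-heap variant mentioned in the analysis, the following Python sketch implements Prim's algorithm over an adjacency-list graph (discarding stale heap entries instead of decreasing keys is an implementation choice of this sketch, not something prescribed by the source):

import heapq

def prim(graph, start):
    # graph: {vertex: [(weight, neighbor), ...]} for an undirected weighted graph
    visited = {start}
    heap = [(w, start, u) for w, u in graph[start]]
    heapq.heapify(heap)
    mst, total = [], 0
    while heap and len(visited) < len(graph):
        w, v, u = heapq.heappop(heap)       # cheapest edge leaving the tree
        if u in visited:
            continue                        # stale entry: u joined the tree earlier
        visited.add(u)
        mst.append((v, u, w))
        total += w
        for w2, x in graph[u]:
            if x not in visited:
                heapq.heappush(heap, (w2, u, x))
    return mst, total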

9.2 Kruskal's Algorithm


Sort the edges in nondecreasing order of their weights.
Start with an empty subgraph and keep adding edges from the sorted list to the current subgraph.
On each iteration, add the next edge from the sorted list only if it would not create a cycle; if it would, skip the edge.
Some of the subset operations necessary for Kruskal's algorithm are:
Makeset(x): creates a one-element set {x}
Find(x): returns the representative of the subset containing x
Union(x, y): constructs the union of the disjoint subsets Sx and Sy containing x and y, respectively, and adds it to the collection to replace Sx and Sy
Implementation of disjoint subsets:
a) Quick Find:
It uses an array Arr, indexed by the elements of the underlying set S.
The array's values indicate the representatives of the subsets containing those elements.
Each subset is implemented as a linked list whose header contains pointers to the first and last elements of the list, along with the number of elements in the list.
Makeset(x):
o Set the element's entry in the representative array to the element itself, i.e., Arr[x] = x.
o Θ(1)
Find(x):
o Retrieve x's representative from the representative array, i.e., return Arr[x].
o Θ(1)
Union(x, y):
o Implemented by replacing all entries of the array Arr holding x's representative with y's representative (or vice versa).
o O(n)
b) Quick Union:
Represents each subset by a rooted tree.
The nodes of the tree contain the subset's elements, with the root's element considered the subset's representative.
The tree's edges are directed from children to their parents.
Makeset(x):
o Requires the creation of a single-node tree.
o Θ(1)
Find(x):
o Performed by following the pointer chain from the node containing x to the tree's root, whose element is returned as the subset's representative.
o Θ(n) in the worst case
Union(x, y):
o Implemented by attaching the root of y's tree to the root of x's tree.
o Θ(1)
Algorithm: Kruskal(G)
    // the edges are sorted in nondecreasing order of their weights:
    // w(ei1) ≤ w(ei2) ≤ … ≤ w(ei|E|)
    ET ← ∅
    ecounter ← 0
    k ← 0
    while ecounter < |V| − 1 do
        k ← k + 1
        if ET ∪ {eik} is acyclic
            ET ← ET ∪ {eik}
            ecounter ← ecounter + 1
    return ET
Analysis:
The efficiency of Kruskal's algorithm is O(|E| log |E|) if efficient sorting and union-find algorithms are used.
Example: Consider the graph given below and assume the set operations are implemented using the quick find implementation. [the worked example was rendered as a figure in the original]
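A short Python sketch of Kruskal's algorithm follows, using the quick-union representation of disjoint subsets described above (the function names are mine, not the source's):

def kruskal(n, edges):
    # n: number of vertices labeled 0 .. n-1; edges: list of (weight, u, v)
    parent = list(range(n))                # quick-union forest: parent pointers

    def find(x):                           # follow the pointer chain to the root
        while parent[x] != x:
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges):          # nondecreasing order of edge weights
        ru, rv = find(u), find(v)
        if ru != rv:                       # adding (u, v) creates no cycle
            parent[ru] = rv                # union: attach one root to the other
            mst.append((u, v, w))
            total += w
            if len(mst) == n - 1:
                break
    return mst, total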

9.3 Dijkstra's Algorithm


Also called the single-source shortest-paths algorithm.
Single-Source Shortest-Paths Problem: Given a weighted connected (directed) graph G, find the shortest paths from a source vertex s to each of the other vertices.
Dijkstra's algorithm: similar to Prim's MST algorithm, but with a different way of computing numerical labels: among vertices not already in the tree, it finds the vertex u with the smallest sum
    dv + w(v, u)
where
    v is a vertex for which the shortest path has already been found on preceding iterations (such vertices form a tree rooted at s),
    dv is the length of the shortest path from the source s to v,
    w(v, u) is the length (weight) of the edge from v to u.
For a given vertex, called the source, in a weighted connected graph, it finds the shortest paths to all its other vertices.
It finds the shortest path from the source to the vertex nearest to it, then to a second nearest, and so on.
Before the ith iteration, the algorithm has already identified the shortest paths to i − 1 other vertices nearest to the source. These vertices, the source, and the edges of the shortest paths leading to them from the source form a subtree Ti of the given graph.
The next vertex nearest to the source can be found among the vertices adjacent to the vertices of Ti, called fringe vertices.

To identify the ith nearest vertex, the algorithm computes, for every fringe vertex u, the sum of the distance to the nearest tree vertex v and the weight of the edge (v, u), and then selects the vertex with the smallest such sum.

Algorithm: Dijkstra(G, s)
    Initialize(Q)                     // Q is a priority queue of fringe vertices
    for every vertex v in V do
        dv ← ∞; pv ← null
        Insert(Q, v, dv)
    ds ← 0; Decrease(Q, s, ds)
    VT ← ∅
    for i ← 0 to |V| − 1 do
        u* ← DeleteMin(Q)
        VT ← VT ∪ {u*}
        for every vertex u in V − VT that is adjacent to u* do
            if du* + w(u*, u) < du
                du ← du* + w(u*, u); pu ← u*
                Decrease(Q, u, du)

Example: Consider the following graph

Analysis:
O(|V|2) for graphs represented by weight matrix and array
implementation of priority queue.
O(|E|log|V|) for graphs represented by adj. lists and min-heap
implementation of priority queue.
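The following Python sketch corresponds to the min-heap implementation mentioned in the analysis (as in the Prim sketch earlier, discarding stale heap entries replaces an explicit Decrease operation; that substitution is mine, not the source's):

import heapq

def dijkstra(graph, s):
    # graph: {vertex: [(weight, neighbor), ...]}; returns shortest distances from s
    dist = {v: float('inf') for v in graph}
    dist[s] = 0
    heap = [(0, s)]
    finalized = set()                      # vertices whose shortest path is final
    while heap:
        d, u = heapq.heappop(heap)
        if u in finalized:
            continue                       # stale queue entry
        finalized.add(u)
        for w, v in graph[u]:
            if d + w < dist[v]:            # relax edge (u, v)
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist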

9.4 Huffman Tree


A Huffman tree is a binary tree that minimizes the weighted path length from the root to the leaves containing a set of predefined weights.
The most important application of Huffman trees is Huffman codes.
A Huffman code is an optimal prefix-free variable-length encoding scheme that assigns bit strings to characters based on their frequencies in a given text. This is accomplished by a greedy construction of a binary tree whose leaves represent the alphabet characters and whose edges are labeled with 0s and 1s.
Huffman encoding is a variable-length encoding technique and one of the most important file-compression techniques.
It generates prefix-free codes, i.e., in a prefix code no codeword is a prefix of another character's codeword.
A tree constructed by Huffman's algorithm is called a Huffman tree.

Huffman's algorithm:
Initialize n one-node trees with the alphabet characters and set the tree weights to their frequencies.

Repeat the following step n − 1 times: join the two binary trees with the smallest weights into one (as left and right subtrees) and make its weight equal to the sum of the weights of the two trees.
Mark edges leading to left and right subtrees with 0s and 1s, respectively.

Example:
Consider the five-character alphabet {A, B, C, D, _} with the following occurrence probabilities:
------------------------------------------------------------
Character:     A      B      C      D      _
Probability:   0.35   0.1    0.2    0.2    0.15
------------------------------------------------------------

The resulting codewords are as follows:
------------------------------------------------------------
Character:     A      B      C      D      _
Probability:   0.35   0.1    0.2    0.2    0.15
Codeword:      11     100    00     01     101
------------------------------------------------------------
With the occurrence probabilities given and the codeword lengths obtained, the expected number of bits per character in this code is
2 * 0.35 + 3 * 0.1 + 2 * 0.2 + 2 * 0.2 + 3 * 0.15 = 2.25

Analysis:
The list of characters can be implemented as a min-heap.
The main loop is executed exactly n − 1 times, and since each heap operation requires O(log n) time, the loop contributes O(n log n) to the running time. Thus the total running time of Huffman's algorithm on a set of n characters is O(n log n).
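A Python sketch of Huffman's algorithm using a min-heap is given below. Tie-breaking among equal weights is arbitrary, so the exact codewords may differ from the table above, although the codeword lengths, and hence the expected 2.25 bits per character, come out the same:

import heapq
from itertools import count

def huffman_codes(freq):
    # freq: {character: probability or frequency}; returns {character: codeword}
    tick = count()                         # tie-breaker so trees are never compared
    heap = [(p, next(tick), ch) for ch, p in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:                   # join the two lightest trees n - 1 times
        p1, _, left = heapq.heappop(heap)
        p2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, next(tick), (left, right)))
    codes = {}
    def walk(node, code):
        if isinstance(node, tuple):        # internal node: 0 to the left, 1 to the right
            walk(node[0], code + '0')
            walk(node[1], code + '1')
        else:
            codes[node] = code or '0'      # degenerate single-character alphabet
    walk(heap[0][2], '')
    return codes

print(huffman_codes({'A': 0.35, 'B': 0.1, 'C': 0.2, 'D': 0.2, '_': 0.15}))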


10. Limitations of Algorithm Power
10.1 Lower-Bound Arguments
It is an estimate of the minimum amount of work needed to solve a given problem. E.g.: the number of comparisons needed to find the largest element in a set of n numbers, or the number of comparisons needed to sort an array of size n.
Lower bound can be
o An exact count
o An efficiency class
Tight lower bound: A lower bound is said to be tight if there exists an
algorithm with the same efficiency as the lower bound.

Problem                                Lower bound      Tightness
------------------------------------------------------------------
Sorting (comparison-based)             Ω(n log n)       yes
Searching in a sorted array            Ω(log n)         yes
Element uniqueness                     Ω(n log n)       yes
n-digit integer multiplication         Ω(n)             unknown
Multiplication of n-by-n matrices      Ω(n²)            unknown

If there is a gap between the efficiency of the fastest known algorithm and the best lower bound known, there is room for possible improvement: either a faster algorithm matching the lower bound could exist or a better lower bound could be proved.
If the lower bound is already tight, we can hope for a constant-factor improvement at best.
Methods for establishing lower bounds:
1. Trivial Lower Bounds
2. Information-Theoretic Arguments
3. Adversary Arguments
4. Problem Reduction

Trivial Lower Bounds

It is based on counting the number of items in the problem's input that must be processed and the number of output items that need to be produced. E.g.: any algorithm for generating all permutations of n distinct items must be in Ω(n!) because the size of the output is n!.
Drawbacks:
o They are often too low to be useful. E.g.: the trivial bound for the traveling salesman problem is Ω(n²), because its input is n(n − 1)/2 intercity distances and its output is a list of n + 1 cities making up an optimal tour. But there is no known algorithm for this problem whose running time is a polynomial function of any degree.
o It may be difficult to determine which part of an input must be processed by any algorithm solving the problem in question. E.g.: searching for an element of a given value in a sorted array does not require processing all its elements.

Information-Theoretic Arguments
It is called an information-theoretic argument because of its connection to information theory.

It seeks to establish a lower bound based on the amount of information the algorithm has to produce. E.g.: consider the game of deducing a positive integer between 1 and n selected by somebody, by asking that person questions with yes/no answers.
Lower bound: ⌈log2 n⌉, the number of bits needed to specify a particular number among the n possibilities, since each question can be thought of as yielding at most one bit of information about the selected number.

This approach has proved to be quite useful for finding the so-called information-theoretic lower bounds for many problems involving comparisons.

Its underlying idea can be realized much more precisely through the mechanism of decision trees.

Adversary Arguments

It is based on following the logic of a malevolent (wishing evil to others) but honest adversary (opponent): the malevolence makes him push the algorithm down the most time-consuming path, while his honesty forces him to stay consistent with the choices already made.

A lower bound is then obtained by measuring the amount of work needed to shrink a set of potential inputs to a single input along the most time-consuming path.

E.g.: Merging two sorted lists of size n into a single sorted list of size 2n:
a1 < a2 < … < an and b1 < b2 < … < bn
o To derive the lower bound, the adversary will employ the following rule: reply true to the comparison ai < bj if and only if i < j. This implies that the elements of the merged list must come from the two lists alternately.
o The above rule will force any correct merging algorithm to produce the only combined list consistent with this rule:
b1 < a1 < b2 < a2 < … < bn < an
o To produce this combined list, any correct algorithm will have to explicitly compare the 2n − 1 adjacent pairs of its elements, i.e., b1 to a1, a1 to b2, and so on.
o Hence the lower bound on the number of key comparisons made by any comparison-based algorithm for merging two sorted lists is 2n − 1.

Problem Reduction

To show that problem P is at least as hard as another problem Q with


a known lower bound, we need to reduce Q to P.

Then a lower bound for Q will be a lower bound for P.

The table below lists several important problems that are often used
for this purpose.

Problem                                Lower bound      Tightness
------------------------------------------------------------------
Sorting (comparison-based)             Ω(n log n)       yes
Searching in a sorted array            Ω(log n)         yes
Element uniqueness                     Ω(n log n)       yes
n-digit integer multiplication         Ω(n)             unknown
Multiplication of n-by-n matrices      Ω(n²)            unknown

E.g.: Euclidean minimum spanning tree problem: given n points in the Cartesian plane, construct a tree of minimum total length whose vertices are the given points.
o To establish its lower bound, we use the element uniqueness problem, whose lower bound is known.
o Transform a set of n real numbers x1, x2, …, xn into a set of n points in the Cartesian plane by simply adding 0 as the points' y-coordinates: (x1, 0), (x2, 0), …, (xn, 0).
o Let T be a minimum spanning tree found for this set of points.
o Since T must contain a shortest edge, checking whether T contains a zero-length edge will answer the question about the uniqueness of the given numbers.
o This reduction implies that Ω(n log n) is a lower bound for the Euclidean minimum spanning tree problem, too.

Since final results about complexity of many problems are not known,
the reduction technique is often used to compare the relative
complexity of problems.
E.g.: The formulas
x · y = ((x + y)² − (x − y)²) / 4  and  x² = x · x
show that the problems of computing the product of two n-digit integers and squaring an n-digit integer belong to the same complexity class.

10.2 Decision Trees


Decision tree is a convenient model of algorithms involving
comparisons in which:
o Internal nodes represent comparisons
o Leaves represent outcomes (or input cases)
The decision tree of an algorithm for finding the minimum of three numbers is as shown below.
[decision tree figure: the root compares a < b; on "yes" the next comparison is a < c, on "no" it is b < c; each leaf names the minimum]

Note that the number of leaves will be at least as large as the number
of possible outcomes.
The algorithm's work on a particular input of size n can be traced by a path from the root to a leaf in its decision tree.
The number of comparisons made by the algorithm on any run is equal to the number of edges in that path; hence, the number of comparisons in the worst case is equal to the height of the algorithm's decision tree.
It can be proved that for any binary tree with l leaves and height h,
h ≥ ⌈log2 l⌉
which puts a lower bound on the heights of binary decision trees and hence on the worst-case number of comparisons made by any comparison-based algorithm.

Decision Trees for sorting algorithms


Most sorting algorithms are comparison-based. Therefore, by studying properties of decision trees for comparison-based sorting algorithms, important lower bounds on the time efficiencies of such algorithms can be derived.
An outcome of a sorting algorithm can be interpreted as finding a permutation of the element indices of an input list that puts the list's elements in ascending order. Hence, the number of possible outcomes for sorting an arbitrary n-element list is equal to n!.

Since the number of leaves is at least n!, the height of a binary decision tree for any comparison-based sorting algorithm, and hence the worst-case number of comparisons made by such an algorithm, cannot be less than ⌈log2 n!⌉:
Cworst(n) ≥ ⌈log2 n!⌉ ≈ n log2 n
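A quick numerical check of this bound (a throwaway Python snippet, not part of the source):

import math

for n in (4, 10, 100):
    exact = math.ceil(math.log2(math.factorial(n)))   # ceil(log2 n!)
    approx = n * math.log2(n)                         # n log2 n
    print(n, exact, round(approx, 1))
# e.g. for n = 10 any comparison-based sort needs at least 22 comparisons in the worst case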
The decision tree for the three-element selection sort is as shown below: [figure in the original]
o For the outcome a < c < b, obtained by sorting a list a, b, c, the permutation in question is 1, 3, 2.

The decision tree for the three-element insertion sort is as shown below:
[decision tree figure: the root compares a < b; subsequent comparisons against c lead to the six leaves a < b < c, a < c < b, c < a < b, b < a < c, b < c < a, c < b < a]

Decision trees can also be used for analyzing the average-case behavior of a comparison-based sorting algorithm.

Decision Trees for searching a sorted array

The principal algorithm for searching a sorted array is binary search.
Binary search deals with three-way comparisons, in which the search key K is compared with some element A[i] to see whether K < A[i], K = A[i], or K > A[i].
An algorithm for searching a sorted array by three-way comparisons can be represented by a ternary decision tree as shown in the figure below:
[ternary decision tree figure for binary search in an array of n = 4 elements]

For an array of n elements, all such decision trees will have 2n + 1 leaves (n for successful searches and n + 1 for unsuccessful ones).
The minimum height h of a ternary tree with l leaves is ⌈log3 l⌉. Hence the lower bound on the number of worst-case comparisons is:
Cworst(n) ≥ ⌈log3 (2n + 1)⌉
Since this lower bound is smaller than ⌈log2 (n + 1)⌉, the number of worst-case comparisons for binary search, a better lower bound can be obtained by considering a binary rather than a ternary decision tree, as shown below:

o Internal nodes in such a tree correspond to the same three-way


comparisons as before, but they also serve as terminal nodes for
successful searches.
o Leaves therefore represent only unsuccessful searches, and
there are n + 1 of them.
Cworst(n) ≥ ⌈log2 (n + 1)⌉

10.3 P, NP and NP-Complete Problems


An algorithm is said to solve a problem in polynomial time if its worst-case time efficiency belongs to O(p(n)), where p(n) is a polynomial of the problem's input size n.
Problems that can be solved in polynomial time are called tractable,
and problems that cannot be solved in polynomial time are called
intractable.

Problems can be classified into two categories:


o Optimization problems: They find a solution that maximizes or
minimizes some objective function.
o Decision problems: Problems with yes / no answers.
Decision problems are more convenient for formal investigation of
their complexity.
Decision problems that cannot be solved at all by any algorithm are called undecidable. E.g.: Alan Turing's halting problem: given a computer program and an input to it, determine whether the program will halt on that input or continue working indefinitely on it.
Proof by contradiction:
o Assume that A is an algorithm that solves the halting problem.
i.e., for any program P and input I,
A (P, I) = 1, if program P halts on input I;
0, if program P does not halt on input I.
o We can consider program P as an input to itself and use the
output of algorithm A for pair (P, P) to construct a program Q
as follows:
Q(P) = halts, if A(P, P) = 0, i.e., if program P does not halt on input P;
       does not halt, if A(P, P) = 1, i.e., if program P halts on input P.
o Then, on substituting Q for P, we obtain
Q(Q) = halts, if A(Q, Q) = 0, i.e., if program Q does not halt on input Q;
       does not halt, if A(Q, Q) = 1, i.e., if program Q halts on input Q.

o This is a contradiction because neither of the two outcomes for


program Q is possible, and it completes the proof.

Class P (Polynomial)
Class P is a class of decision problems that can be solved in
polynomial time by (deterministic) algorithms. E.g.: searching,
element uniqueness, graph connectivity etc.
Problems for which neither a polynomial-time algorithm has been found nor the impossibility of such an algorithm been proved include:
o Hamiltonian circuit
o Traveling salesman
o Knapsack
o Graph coloring, etc.
Nondeterministic algorithm
It is a two-stage procedure that takes as its input an instance I of a decision problem and does the following:
o Nondeterministic (guessing) stage: an arbitrary string S is generated that can be thought of as a candidate solution to the given instance I.
o Deterministic (verification) stage: a deterministic algorithm takes both I and S as its input and outputs yes if S represents a solution to instance I.
A nondeterministic algorithm is said to be nondeterministic
polynomial if the time efficiency of its verification stage is
polynomial.

Class NP (Nondeterministic Polynomial)


Class of decision problems whose proposed solutions can be verified
in polynomial time i.e., class of decision problems that can be solved
by nondeterministic polynomial algorithms.
It includes all the problems in P:
P ⊆ NP

Certain problems, like the halting problem, are among the rare examples of decision problems that are known not to be in NP. This leads to the most important open question of theoretical computer science: is P a proper subset of NP, or are these two classes, in fact, the same? P =? NP.
NP-complete problems
A decision problem D1 is said to be polynomially reducible to a
decision problem D2 if there exists a function t that transforms
instances of D1 to instances of D2 such that
o t maps all yes instances of D1 to yes instances of D2 and all no
instances of D1 to no instances of D2;
o t is computable by a polynomial-time algorithm.
A decision problem D is said to be NP-complete as shown in the
figure below if
o It belongs to class NP;
o Every problem in NP is polynomially reducible to D.
[figure: the class of NP problems, shown with an NP-complete problem inside it to which every problem in NP is polynomially reducible]
E.g.: CNF-satisfiability problem


o Proving that the CNF-satisfiability problem is NP-complete was accomplished independently by Stephen Cook in the United States and Leonid Levin in the Soviet Union.
o It requires finding whether a Boolean expression in conjunctive normal form (CNF) is satisfiable, i.e., whether there are values of its variables that make it true.
E.g.: (x1 ∨ ¬x2 ∨ ¬x3) & (¬x1 ∨ x2) & (¬x1 ∨ ¬x2 ∨ ¬x3)
Solution: if x1 = true, x2 = true and x3 = false, the entire expression is true.
Showing that a decision problem is NP-complete can be done in two
steps:
o Show that the problem in question is in NP;
o Show that every problem in NP is reducible to the problem in
question in polynomial time.
The property of transitivity of polynomial reduction is useful in
showing that every problem in NP is reducible to the problem in
question as shown in the figure below.
[figure: every problem in NP reduces to a known NP-complete problem, which in turn is reduced in polynomial time to the candidate for NP-completeness]

P = NP would imply that every problem in NP, including all NP-complete problems, could be solved in polynomial time.
If a polynomial-time algorithm for just one NP-complete problem is discovered, then every problem in NP can be solved in polynomial time, i.e., P = NP.
Most but not all researchers believe that P ≠ NP, i.e., P is a proper subset of NP. But Levin contended that we should expect the P = NP outcome.

11. Backtracking
It is a more intelligent variation of the exhaustive search technique.
The principal idea is to construct solutions one component at a time
and evaluate such partially constructed candidates as follows:
If a partially constructed solution can be developed further
without violating the problems constraints, it is done by taking
the first remaining legitimate option for the next component.
If there is no legitimate option for the next component, no
alternatives for any remaining component need to be
considered and the algorithm backtracks to replace the last
component of the partially constructed solution with its next
option.
Constructs a tree of choices being made, called the state-space tree, in which
Nodes represent partial solutions and
Edges represent choices in extending partial solutions

Root represents an initial state before the search for a solution


begins.
A node is said to be promising if it corresponds to a partially
constructed solution that may still lead to a complete solution;
otherwise it is called nonpromising.
Leaves represent either nonpromising dead ends or complete
solutions found by the algorithm.
Explore the state space tree using depth-first search.

11.1 4-Queen, 8-Queen and n-Queens problem


The n-queens problem is to place n queens on an n-by-n chessboard so
that no two queens attack each other by being in the same row or in
the same column or on the same diagonal.
For n = 1, the problem has a trivial solution and there is no solution
for n = 2 and n = 3.
4-Queens Problem
Place 4 queens on a 4-by-4 chess board so that no two queens attack
each other.
Since each of the four queens has to be placed in its own row, all we
need to do is to assign a column for each queen on the board presented
in Fig 11.1

Fig 11.1

The state-space tree for the 4-queens problem is as shown in Fig 11.2.
We start with the empty board and then place queen 1 in the
first column of row 1.
Then we place queen 2 in the third column of row 2, after trying unsuccessfully to place it in columns 1 and 2 of that row.
This proves to be a dead end because there is no acceptable
position for queen 3.
The algorithm backtracks and puts queen 2 in column 4 of row
2.
Then queen 3 is placed in the second column of row 3, which proves to be another dead end.
The algorithm then backtracks all the way to queen 1 and moves it to the second column.
Queen 2 is then placed in the fourth column of the second row,
queen 3 in the first column of the third row and queen 4 in the

third column of the fourth row, which is a solution to the


problem.

Fig 11.2

The solution to the 4-queens problem can be represented by a 4-tuple, in which each position represents the row in which a queen has been placed and the value at each position represents the column in which the queen is placed in that row. E.g.: the solution obtained above can be represented as (2, 4, 1, 3).
The search can be continued to find more solutions by resuming its operation at the leaf at which it stopped. The board's symmetry can also be used for this purpose.

Another solution to the 4-queens problem is (3, 1, 4, 2)
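The walkthrough above translates directly into a compact backtracking program. Here is a Python sketch (the representation of the board as a list of column indices is a standard choice; the function name is mine):

def solve_n_queens(n):
    # queens[r] = column of the queen placed in row r (0-based internally)
    solutions, queens = [], []

    def safe(col):
        r = len(queens)                    # the row we are trying to fill
        return all(c != col and abs(c - col) != r - i   # column / diagonal attacks
                   for i, c in enumerate(queens))

    def place(row):
        if row == n:
            solutions.append(tuple(q + 1 for q in queens))   # report 1-based columns
            return
        for col in range(n):
            if safe(col):
                queens.append(col)
                place(row + 1)
                queens.pop()               # dead end or solution recorded: backtrack

    place(0)
    return solutions

print(solve_n_queens(4))   # [(2, 4, 1, 3), (3, 1, 4, 2)]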

8-Queens Problem
Place 8 queens on an 8-by-8 chess board so that no two queens attack each other, as shown in Fig 11.3.
A solution requires that no two queens share the same row, column, or
diagonal.
An explicit solution for the n-queens problem (here n = 8) is:
Divide n by 12 and remember the remainder (n is 8 for the eight-queens puzzle).
Write a list of the even numbers from 2 to n in order.
If the remainder is 3 or 9, move 2 to the end of the list.
Append the odd numbers from 1 to n in order, but, if the remainder is 8, switch pairs (i.e., 3, 1, 7, 5, 11, 9, …).
If the remainder is 2, switch the places of 1 and 3, then move 5 to the end of the list.
If the remainder is 3 or 9, move 1 and 3 to the end of the list.
Place the first-column queen in the row with the first number in the list, place the second-column queen in the row with the second number in the list, etc.

Fig 11.3

A few more examples follow.


14 queens (remainder 2): 2, 4, 6, 8, 10, 12, 14, 3, 1, 7, 9, 11, 13, 5.
15 queens (remainder 3): 4, 6, 8, 10, 12, 14, 2, 5, 7, 9, 11, 13, 15, 1, 3.
20 queens (remainder 8): 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 3, 1, 7, 5,
11, 9, 15, 13, 19, 17.

11.2 Hamiltonian Circuit problem


Finding a path that visits all the graph's vertices exactly once before returning to the starting vertex.
Consider the problem of finding a Hamiltonian circuit in the graph of
the Fig 11.4

Fig 11.4

Assuming the starting vertex is a, make vertex a the root of the state-space tree.
The first component of the solution is the first intermediate vertex of the Hamiltonian cycle to be constructed, which is b in this case (using alphabetical order to break ties).
From b, the algorithm proceeds to c, then to d, then to e, and finally to
f, which proves to be a dead end.
So, the algorithm backtracks from f to e, then to d, and then to c,
which provides the first alternative for the algorithm to pursue.
Going from c to e eventually proves useless, and the algorithm has to
backtrack from e to c and then to b.
From there, it goes to the vertices f, e, c, and d, from which it can
legitimately return to a, yielding the Hamiltonian circuit a, b, f, e, c, d,
a.
The state-space tree is shown below in Fig 11.5
To find another solution, the above process can be continued by
backtracking from the leaf of the solution found.

Fig 11.5
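A Python sketch of the backtracking search just traced is given below. The edge set of Fig 11.4 is not reproduced in this text, so the adjacency sets in the example call are reconstructed from the walkthrough and should be treated as an assumption:

def hamiltonian_circuit(adj, start):
    # adj: {vertex: set of neighbors}; returns one circuit as a list, or None
    n = len(adj)
    path = [start]

    def extend():
        if len(path) == n:                 # all vertices used:
            return start in adj[path[-1]]  # can the circuit be closed?
        for v in sorted(adj[path[-1]]):    # alphabetical order breaks ties
            if v not in path:
                path.append(v)
                if extend():
                    return True
                path.pop()                 # dead end: backtrack
        return False

    return path + [start] if extend() else None

g = {'a': {'b', 'c', 'd'}, 'b': {'a', 'c', 'f'}, 'c': {'a', 'b', 'd', 'e'},
     'd': {'a', 'c', 'e'}, 'e': {'c', 'd', 'f'}, 'f': {'b', 'e'}}
print(hamiltonian_circuit(g, 'a'))   # ['a', 'b', 'f', 'e', 'c', 'd', 'a']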

11.3 Sum of Subset problem

Find a subset of a given set S = {s1, …, sn} of n positive integers whose sum is equal to a given positive integer d. E.g.: for S = {1, 2, 5, 6, 8} and d = 9, there are two solutions: {1, 2, 6} and {1, 8}.

To find a solution, it is convenient to sort the set's elements in increasing order. So we will assume that s1 ≤ s2 ≤ … ≤ sn.

The state-space tree is constructed as a binary tree.
The root represents the starting point, with no decisions about the given elements made as yet.
Its left and right children represent, respectively, inclusion and exclusion of s1 in the set being sought.
Similarly, going to the left from a node of the first level corresponds to inclusion of s2, while going to the right corresponds to its exclusion, and so on.
Thus, a path from the root to a node on the ith level of the tree indicates which of the first i numbers have been included in the subsets represented by that node.
We record the value of s, the sum of these numbers, in the node.
If s is equal to d, we have a solution to the problem.
If s is not equal to d, we can terminate the node as nonpromising if either of the two inequalities holds:
s + si+1 > d (the sum s is too large)
s + (si+1 + … + sn) < d (the sum s is too small)
Example:
Consider the state-space tree of the backtracking algorithm applied to the instance S = {3, 5, 6, 7} and d = 15 of the subset-sum problem.
The number inside a node is the sum of the elements already included in the subsets represented by the node.
Each leaf indicates the reason for its termination.
The solution is shown in Fig 11.6

Fig 11.6
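The following Python sketch implements this backtracking search with both pruning tests (the helper names are mine):

def subset_sum(s, d):
    # s: positive integers; d: target sum; returns all subsets summing to d
    s = sorted(s)
    suffix = [sum(s[i:]) for i in range(len(s) + 1)]   # suffix[i] = s[i] + ... + s[n-1]
    solutions = []

    def explore(i, chosen, total):
        if total == d:                     # a solution: record it
            solutions.append(list(chosen))
            return
        if i == len(s):
            return
        if total + s[i] <= d:              # left branch: include s[i] (sum not too large)
            chosen.append(s[i])
            explore(i + 1, chosen, total + s[i])
            chosen.pop()
        if total + suffix[i + 1] >= d:     # right branch only if the remaining
            explore(i + 1, chosen, total)  # elements can still reach d

    explore(0, [], 0)
    return solutions

print(subset_sum([3, 5, 6, 7], 15))    # [[3, 5, 7]]
print(subset_sum([1, 2, 5, 6, 8], 9))  # [[1, 2, 6], [1, 8]]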


12. Branch and Bound
An enhancement of backtracking.
Applicable to optimization problems.
For each node (partial solution) of a state-space tree, computes a
bound on the value of the objective function for all descendants of the
node (extensions of the partial solution).
Uses the bound for:
Ruling out certain nodes as nonpromising, to prune the tree, if a node's bound is not better than the best solution seen so far.
Guiding the search through the state space.
The search path at the current node in a state-space tree can be
terminated for any one of the following three reasons:

The value of the node's bound is not better than the value of the best solution seen so far.

The node represents no feasible solutions because the


constraints of the problem are already violated.

The subset of feasible solutions represented by the node


consists of a single point and hence we compare the value of the
objective function for this feasible solution with that of the best
solution seen so far and update the latter with the former if the
new solution is better.
Best-first branch-and-bound:
Among all the nonterminated leaves, called live nodes, in the current tree, generate all the children of the most promising node, instead of generating a single child of the last promising node as is done in backtracking.
The node with the best bound is considered the most promising node.

12.1 Assignment Problem


Assign n people to n jobs so that the total cost of the assignment is as small as possible.
An instance of the assignment problem can be specified by an n-by-n cost matrix, and hence the problem can be stated as that of selecting one element in each row of the cost matrix so that:
No two selected elements are in the same column.
The sum of the selected elements is the smallest possible.

Example:
Assign four people (a, b, c, d) to four jobs (1, 2, 3, 4).
The cost of assigning a person to a particular job is given by the cost matrix below: [the matrix was rendered as a figure in the original; the row minima used in the bounds below are 2, 3, 1 and 4]
A lower bound can be set by considering the fact that the cost of any solution cannot be smaller than the sum of the smallest elements in each of the matrix's rows, i.e., lb = 2 + 3 + 1 + 4 = 10.
Start with the root that corresponds to no elements selected from the
cost matrix with a lower bound, lb = 10.
Considering each one of the elements of the first row as being selected, the nodes of the first level of the tree are generated as follows:
If the first element in the first row (9) is selected, then the smallest selectable elements in the other three rows (row 2 - 3, row 3 - 1, row 4 - 4) lead to a lower bound of 9 + 3 + 1 + 4 = 17.
If the second element in the first row (2) is selected, then the smallest selectable elements in the other three rows (row 2 - 3, row 3 - 1, row 4 - 4) lead to a lower bound of 2 + 3 + 1 + 4 = 10.
If the third element in the first row (7) is selected, then 3 in the second row and 1 in the third row cannot be selected because a selection in the same column has already been made. Hence selecting the next smallest elements (row 2 - 4, row 3 - 5, row 4 - 4) leads to a lower bound of 7 + 4 + 5 + 4 = 20.
If the fourth element in the first row (8) is selected, then 4 in the fourth row cannot be selected because a selection in the same column has already been made. Hence selecting the next smallest elements (row 2 - 3, row 3 - 1, row 4 - 6) leads to a lower bound of 8 + 3 + 1 + 6 = 18.
The state-space tree with the first level of four live nodes (1, 2, 3, 4) is
as shown in the Fig 12.1

Fig 12.1

Among the four live nodes, node 2 is the most promising node,
because it has the smallest lower-bound value.
Following the best-first search strategy, branch out from node 2 to
generate the second level of the nodes as follows, assuming the
second element of the first row (2) as already being selected:

If the first element in the second row (6) is selected, then the smallest selectable elements in the other two rows (row 3 - 1, row 4 - 4) lead to a lower bound of 2 + 6 + 1 + 4 = 13.
The second element in the second row cannot be selected because an element in the same column has already been selected (2 in row 1).
If the third element in the second row (3) is selected, then 1 in the third row cannot be selected because a selection in the same column has already been made. Hence selecting the next smallest elements (row 3 - 5, row 4 - 4) leads to a lower bound of 2 + 3 + 5 + 4 = 14.
If the fourth element in the second row (7) is selected, then 4 in the fourth row cannot be selected because a selection in the same column has already been made. Hence selecting the next smallest elements (row 3 - 1, row 4 - 7) leads to a lower bound of 2 + 7 + 1 + 7 = 17.
The state-space tree with the first two levels consisting six live nodes
(1, 3, 4, 5, 6, 7) is as shown in the Fig 12.2

Fig 12.2

Among the six live nodes, node 5 is the most promising node.
Branching out from node 5 to generate the third level of the nodes as
follows, assuming the second element of the first row (2) and first
element of the second row as already being selected:
First element in the third row cannot be selected because an
element in the same column is already selected (6 in row2).
Second element in the third row cannot be selected because an
element in the same column is already selected (2 in row1).
If the third element in the third row (1) is selected, then
selecting 4 from row4 will lead to a lower bound of 2 + 6 + 1 +
4 = 13.
If the fourth element in the third row (8) is selected, then 4 in
the fourth row cannot be selected. 7 and 6 also cannot be
selected. Hence selecting 9 from row4 will lead to a lower
bound of 2 + 6 + 8 + 9 = 25.
The state-space tree with the third level of nodes is as shown in the
Fig 12.3

Fig 12.3

Inspecting the leaves of the state-space tree reveals that leaf 8, with a minimum cost of 13, represents the best selection.
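A best-first branch-and-bound for the assignment problem can be sketched in Python as follows. The cost matrix in the example call is reconstructed from the bounds computed in the walkthrough above (row minima 2, 3, 1, 4) and should be treated as an assumption:

import heapq
from itertools import count

def assignment_bb(cost):
    n = len(cost)
    tick = count()                         # tie-breaker for heap comparisons

    def lb(row, used, partial):
        # partial cost so far + smallest selectable element in each remaining row
        return partial + sum(min(cost[r][c] for c in range(n) if c not in used)
                             for r in range(row, n))

    heap = [(lb(0, frozenset(), 0), next(tick), 0, frozenset(), 0, ())]
    while heap:
        bound, _, row, used, partial, picks = heapq.heappop(heap)
        if row == n:
            return partial, picks          # first complete node popped is optimal
        for c in range(n):                 # branch: assign person `row` to column c
            if c not in used:
                p = partial + cost[row][c]
                u = used | {c}
                heapq.heappush(heap, (lb(row + 1, u, p), next(tick),
                                      row + 1, u, p, picks + (c + 1,)))

cost = [[9, 2, 7, 8],
        [6, 4, 3, 7],
        [5, 8, 1, 8],
        [7, 6, 9, 4]]
print(assignment_bb(cost))   # (13, (2, 1, 3, 4)): a->2, b->1, c->3, d->4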

12.2 Knapsack Problem


Given n items of known weights wi and values vi, i = 1, 2, …, n, and a knapsack of capacity W, find the most valuable subset of the items that fit in the knapsack.
The solution to the knapsack problem can be approached as follows:
Step 1: Order the items in decreasing order of relative values: v1/w1 ≥ … ≥ vn/wn.
Step 2: Select the items in this order, skipping those that don't fit into the knapsack.

The state-space tree can be structured as a binary tree constructed as follows:
Each node on the ith level of this tree, 0 ≤ i ≤ n, represents all the subsets of n items that include a particular selection made from the first i ordered items.
A particular selection is uniquely determined by a path from the root to that node: a branch going to the left indicates the inclusion of the next item, while a branch going to the right indicates its exclusion.
Record the total weight w and the total value v of this selection in the node, along with some upper bound ub on the value of any subset that can be obtained by adding zero or more items to this selection.
A simple way to compute the upper bound ub is to add to v, the total value of the items already selected, the product of the remaining capacity of the knapsack, W − w, and the best per-unit payoff among the remaining items, which is vi+1/wi+1:
ub = v + (W − w)(vi+1/wi+1)

Example:
Consider a knapsack with capacity 10.
item    weight    value     v/w
--------------------------------
1       4         Rs.40     10
2       7         Rs.42     6
3       5         Rs.25     5
4       3         Rs.12     4
The state-space tree is as shown in the Fig 12.4

Fig 12.4

At the root, no items have been selected. Hence,
w = 0, v = 0
ub = 0 + (10 − 0)(40 / 4) = 100.

The left child of the root, node 1, represents the subsets that include item 1, with:
w = 4, v = 40
ub = 40 + (10 − 4)(42 / 7) = 76.

The right child of the root, node 2, represents the subsets that do not include item 1, with:
w = 0, v = 0
ub = 0 + (10 − 0)(42 / 7) = 60.
Since node 1 has a larger upper bound than node 2, it is more promising, and we branch from node 1 first.
Node 3 represents the subsets that include item 2, but the total weight of every subset represented by node 3 exceeds the knapsack's capacity; hence node 3 can be terminated immediately.
Node 4 represents the subsets that do not include item 2, with:
w = 4, v = 40
ub = 40 + (10 − 4)(25 / 5) = 70.
Branch out from node 4, because it has a better upper bound
than node 2.
Node 5 represents the subsets that include item 3, with:
w = 9, v = 65
ub = 65 + (10 − 9)(12 / 3) = 69.
Node 6 represents the subsets that do not include item 3, with:
w = 4, v = 40
ub = 40 + (10 − 4)(12 / 3) = 64.
Branch out from node 5, because it has a better upper bound
than node 6.

Node 7 represents the subsets that include item 4, but the total weight of every subset represented by node 7 exceeds the knapsack's capacity; hence node 7 can be terminated immediately.
Node 8 represents the subsets that do not include item 4, with:
w = 9, v = 65
ub = 65 (no items remain, so the upper bound is simply the value v).
Among the live nodes 2, 6 and 8, nodes 2 and 6 have smaller
upper-bound values than the value of node 8. Hence both can be
terminated making the subset {1, 3} of node 8 with value 65 the
optimal solution to the problem.

12.3 Traveling Salesman Problem


Given n cities with known distances between each pair, find the
shortest tour that passes through all the cities exactly once before
returning to the starting city.
A lower bound on the length l of any tour can be computed as follows:
For each city i, 1 ≤ i ≤ n, find the sum si of the distances from city i to the two nearest cities.
Compute the sum s of these n numbers.
Divide the result by 2 and round up to the nearest integer:
lb = ⌈s / 2⌉
The lower bound for the graph shown in Fig 12.5 can be computed as follows:
lb = ⌈[(1 + 3) + (3 + 6) + (1 + 2) + (3 + 4) + (2 + 3)] / 2⌉ = 14.

Fig 12.5

For any subset of tours that must include particular edges of a given graph, the lower bound can be modified accordingly. E.g.: for all the Hamiltonian circuits of the graph that must include edge (a, d), the lower bound can be computed as follows:
lb = ⌈[(1 + 5) + (3 + 6) + (1 + 2) + (3 + 5) + (2 + 3)] / 2⌉ = 16.
Applying the branch-and-bound algorithm with the bounding function lb = ⌈s / 2⌉ to find the shortest Hamiltonian circuit for the given graph, we obtain the state-space tree as shown below:
To reduce the amount of potential work, we take advantage of the
following two observations:
We can consider only tours that start with a.
Since the graph is undirected, we can generate only tours in
which b is visited before c.
In addition, after visiting n − 1 cities, a tour has no choice but to visit the remaining unvisited city and return to the starting one, as shown in Fig 12.6.

Fig 12.6
The root node includes only the starting vertex a, with a lower bound of
lb = ⌈[(1 + 3) + (3 + 6) + (1 + 2) + (3 + 4) + (2 + 3)] / 2⌉ = 14.
Node 1 represents the inclusion of edge (a, b):
lb = ⌈[(1 + 3) + (3 + 6) + (1 + 2) + (3 + 4) + (2 + 3)] / 2⌉ = 14.
Node 2 represents the inclusion of edge (a, c). Since b is not visited before c, this node is terminated.
Node 3 represents the inclusion of edge (a, d):
lb = ⌈[(1 + 5) + (3 + 6) + (1 + 2) + (3 + 5) + (2 + 3)] / 2⌉ = 16.
Node 4 represents the inclusion of edge (a, e):
lb = ⌈[(1 + 8) + (3 + 6) + (1 + 2) + (3 + 4) + (2 + 8)] / 2⌉ = 19.

Among all the four live nodes of the root, node 1 has the best lower bound. Hence we branch from node 1.
Node 5 represents the inclusion of edge (b, c):
lb = ⌈[(1 + 3) + (3 + 6) + (1 + 6) + (3 + 4) + (2 + 3)] / 2⌉ = 16.
Node 6 represents the inclusion of edge (b, d):
lb = ⌈[(1 + 3) + (3 + 7) + (1 + 2) + (3 + 7) + (2 + 3)] / 2⌉ = 16.
Node 7 represents the inclusion of edge (b, e):
lb = ⌈[(1 + 3) + (3 + 9) + (1 + 2) + (3 + 4) + (2 + 9)] / 2⌉ = 19.
Since nodes 5 and 6 both have the same lower bound, we
branch out from each of them.
Node 8 represents the inclusion of the edges (c, d), (d, e) and
(e, a). Hence, the length of the tour,
l = 3 + 6 + 4 + 3 + 8 = 24.
Node 9 represents the inclusion of the edges (c, e), (e, d) and
(d, a). Hence, the length of the tour,
l = 3 + 6 + 2 + 3 + 5 = 19.
Node 10 represents the inclusion of the edges (d, c), (c, e) and
(e, a). Hence, the length of the tour,
l = 3 + 7 + 4 + 2 + 8 = 24.
Node 11 represents the inclusion of the edges (d, e), (e, c) and
(c, a). Hence, the length of the tour,
l = 3 + 7 + 3 + 2 + 1 = 16.
Node 11 represents an optimal tour, since its tour length is smaller than or equal to the bounds of the other live nodes (8, 9, 10, 3 and 4).
The optimal tour is a → b → d → e → c → a, with a tour length of 16.


13. Approximation Algorithms for NP-hard Problems
The optimization versions of combinatorial problems fall in the class
of NP-hard problems.
NP-hard problems are those that are at least as hard as NP-complete problems; hence, no polynomial-time algorithms are known for them.
Since such optimization problems are not known to admit polynomial-time algorithms, they are often solved approximately by fast algorithms.
Solutions obtained by an approximation algorithm are not necessarily
optimal but hopefully close to it.
Accuracy measures: to know how accurate an approximate solution is compared to the actual optimal solution, we quantify the accuracy of an approximate solution sa by the size of the relative error of this approximation and by the accuracy ratio, as follows:
For minimization problems:
re(sa) = [f(sa) − f(s*)] / f(s*) = f(sa) / f(s*) − 1
r(sa) = f(sa) / f(s*)
For maximization problems:
r(sa) = f(s*) / f(sa)
where f(sa) and f(s*) are the values of the objective function f for the approximate solution sa and the actual optimal solution s*.
Performance ratio (RA):
The lowest upper bound of r(sa) on all instances.

It serves as the principal metric indicating the quality of the approximation algorithm.
It is desirable to have approximation algorithms with RA as close to 1 as possible, because the closer RA is to 1, the better the approximate solution is.
A polynomial-time approximation algorithm is said to be a c-approximation algorithm if its performance ratio is at most c; that is, for any instance of the problem in question, f(sa) ≤ c·f(s*).

13.1 Traveling Salesman Problem


Given n cities with known distances between each pair, find the
shortest tour that passes through all the cities exactly once before
returning to the starting city.
Here, we consider two simple approximation algorithms for this
problem.
Nearest-neighbor algorithm
Twice-around-the-tree algorithm.
Nearest-Neighbor Algorithm for TSP
Step 1: Choose an arbitrary city as the start.
Step 2: Repeat the following operation until all the cities have been
visited: Go to the unvisited city nearest the one visited last.
Step 3: Return to the starting city.

Example:
Consider the graph given in the Fig 13.1 below:

Fig 13.1
With a as the starting vertex, the nearest-neighbor algorithm yields the tour sa: a → b → c → d → a of length 10.
The optimal solution is the tour s*: a → b → d → c → a of length 8.
The accuracy ratio of this approximation is
r(sa) = f(sa) / f(s*) = 10 / 8 = 1.25
meaning that tour sa is 25% longer than the optimal tour s*.
The algorithm can be forced to traverse a very long edge on the last leg of the tour; hence RA = ∞ for this algorithm. E.g.: the weight of edge (d, a) in the above graph can be made as large as we wish.
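A minimal Python sketch of the nearest-neighbor heuristic follows. The edge weights in the example call are an assumption, chosen to reproduce the stated tour lengths of 10 and 8:

def nearest_neighbor_tour(dist, start):
    # dist: {(u, v): weight} with both orientations present; returns (tour, length)
    cities = {u for u, _ in dist}
    tour, length = [start], 0
    while len(tour) < len(cities):
        last = tour[-1]
        nxt = min((c for c in cities if c not in tour),
                  key=lambda c: dist[last, c])         # the greedy step
        length += dist[last, nxt]
        tour.append(nxt)
    length += dist[tour[-1], start]                    # return to the start
    return tour + [start], length

d = {}
for u, v, w in [('a', 'b', 1), ('a', 'c', 3), ('a', 'd', 6),
                ('b', 'c', 2), ('b', 'd', 3), ('c', 'd', 1)]:
    d[u, v] = d[v, u] = w
print(nearest_neighbor_tour(d, 'a'))   # (['a', 'b', 'c', 'd', 'a'], 10)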
Twice-Around-the-Tree Algorithm
Stage 1: Construct a minimum spanning tree of the graph
corresponding to a given instance of the TSP.

Stage 2: Starting at an arbitrary vertex, perform a walk around the minimum spanning tree, recording the vertices passed by.
Stage 3: Scan the list of vertices obtained in Stage 2 and eliminate from it all repeated occurrences of the same vertex except the starting one at the end of the list. The vertices remaining on the list form a Hamiltonian circuit, which is the output of the algorithm.
Example:
Consider the graph in the Fig 13.2 given below

Fig 13.2
The minimum spanning tree of this graph is made up of edges (a, b),
(b, c), (b, d) and (d, e) as shown in the Fig 13.3 below

Fig 13.3
A twice-around-the-tree walk that starts and ends at a is
a, b, c, b, d, e, d, b, a
Eliminating the second b (a shortcut from c to d), and then the second
d and the third b (a shortcut from e to a) yields the Hamiltonian
circuit a, b, c, d, e, a.
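A Python sketch of the three stages is given below, assuming the input is a complete graph supplied as a dictionary of pairwise distances. Using Prim's algorithm for Stage 1 and a preorder walk of the MST, which performs Stages 2 and 3 at once, are implementation choices of this sketch:

def twice_around_the_tree(dist, vertices, start):
    # Stage 1: build a minimum spanning tree with Prim's algorithm
    in_tree, adj = {start}, {v: [] for v in vertices}
    while len(in_tree) < len(vertices):
        w, u, v = min((dist[x, y], x, y) for x in in_tree
                      for y in vertices if y not in in_tree)
        adj[u].append(v); adj[v].append(u)
        in_tree.add(v)
    # Stages 2 and 3: a preorder walk equals the full walk with repeats skipped
    tour, seen = [], set()
    def walk(v):
        seen.add(v)
        tour.append(v)
        for u in adj[v]:
            if u not in seen:
                walk(u)
    walk(start)
    tour.append(start)
    length = sum(dist[tour[i], tour[i + 1]] for i in range(len(tour) - 1))
    return tour, length

On the Fig 13.2 instance, whose MST consists of the edges (a, b), (b, c), (b, d) and (d, e), the preorder walk from a yields exactly the circuit a, b, c, d, e, a obtained above.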

13.2 Knapsack Problem

Given n items of known weights wi and values vi, i = 1, 2, …, n, and a knapsack of capacity W, find the most valuable subset of the items that fit in the knapsack.

A knapsack problem is said to be discrete if items must be selected in their entirety; if fractions of items may be taken, the problem is said to be continuous.

Greedy algorithms for the discrete knapsack problem:
1. Select the items in decreasing order of their weights.
   Drawback: heavier items may not be the most valuable in the set.
2. Select the items in decreasing order of their values.
   Drawback: there is no guarantee that the knapsack's capacity will be used efficiently.
3. Compute the value-to-weight ratios vi / wi, i = 1, 2, …, n, and select the items in decreasing order of these ratios:
   Step 1: Compute the value-to-weight ratios ri = vi / wi, i = 1, …, n, for the items given.
   Step 2: Sort the items in nonincreasing order of the ratios computed in Step 1.
   Step 3: Repeat the following operation until no item is left in the sorted list: if the current item on the list fits into the knapsack, place it in the knapsack; otherwise, proceed to the next item.

Example:
Consider a knapsack with capacity 10 and item information as
follows:
item    weight    value
------------------------
1       7         Rs.42
2       3         Rs.12
3       4         Rs.40
4       5         Rs.25
Computing the value-to-weight ratios and sorting the items in nonincreasing order of these ratios yields:
item    weight    value     v/w
--------------------------------
1       4         Rs.40     10
2       7         Rs.42     6
3       5         Rs.25     5
4       3         Rs.12     4
The greedy algorithm will select the first item of weight 4, skip the
next item of weight 7, select the next item of weight 5, and skip the
last item of weight 3.

The optimal subset includes items {1, 3} with value Rs.65.

It does not always yield an optimal solution.
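The three steps above can be sketched in Python as follows (item numbers in the result follow the sorted numbering, as in the second table):

def greedy_knapsack(items, capacity):
    # items: list of (weight, value) pairs; returns (chosen item numbers, total value)
    order = sorted(range(len(items)),
                   key=lambda i: items[i][1] / items[i][0], reverse=True)
    rank = {i: k + 1 for k, i in enumerate(order)}   # position in the sorted list
    chosen, value, room = [], 0, capacity
    for i in order:                        # nonincreasing value-to-weight ratio
        w, v = items[i]
        if w <= room:                      # take the item if it still fits
            chosen.append(rank[i])
            value += v
            room -= w
    return sorted(chosen), value

print(greedy_knapsack([(7, 42), (3, 12), (4, 40), (5, 25)], 10))
# ([1, 3], 65): the weight-4 and weight-5 items, total value Rs.65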

Greedy algorithm for the continuous knapsack problem

Step 1: Compute the value-to-weight ratios vi / wi, i = 1, …, n, for the items given.
Step 2: Sort the items in nonincreasing order of the ratios computed in Step 1.
Step 3: Repeat the following operation until the knapsack is filled to its full capacity or no item is left in the sorted list: if the current item on the list fits into the knapsack in its entirety, take it and proceed to the next item; otherwise, take its largest fraction to fill the knapsack to its full capacity and stop.
Example:
Consider a knapsack with capacity 10 and item information as
follows:
item    weight    value
------------------------
1       7         Rs.42
2       3         Rs.12
3       4         Rs.40
4       5         Rs.25

Computing the value-to-weight ratios and sorting the items in nonincreasing order of these ratios yields:
item    weight    value     v/w
--------------------------------
1       4         Rs.40     10
2       7         Rs.42     6
3       5         Rs.25     5
4       3         Rs.12     4
The algorithm will take the first item of weight 4 and then 6 / 7 of the
next item to fill the knapsack to its full capacity.
This algorithm always yields an optimal solution to the continuous
knapsack problem.
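A matching Python sketch for the continuous version, returning just the optimal value:

def greedy_continuous_knapsack(items, capacity):
    # items: list of (weight, value) pairs; fractions of an item may be taken
    order = sorted(items, key=lambda it: it[1] / it[0], reverse=True)
    value, room = 0.0, capacity
    for w, v in order:                     # nonincreasing value-to-weight ratio
        if room == 0:
            break
        take = min(w, room)                # the whole item, or the largest fraction
        value += v * (take / w)
        room -= take
    return value

print(greedy_continuous_knapsack([(7, 42), (3, 12), (4, 40), (5, 25)], 10))
# 76.0: the weight-4 item plus 6/7 of the weight-7 item (40 + 36)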
Approximation schemes for the discrete knapsack problem:
There exist polynomial-time approximation schemes, which are parametric families of algorithms that allow us to get approximations sa(k) with any predefined accuracy level:
f(s*) / f(sa(k)) ≤ 1 + 1/k for any instance of size n,
where k is an integer parameter in the range 0 ≤ k < n.
The first approximation was suggested by S. Sahni in 1975:
Generate all subsets of k items or less.
For each one that fits into the knapsack add the remaining items
as the greedy algorithm does.
The subset of the highest value obtained in this fashion is
returned as the algorithms output.
An example of an approximation scheme with k = 2 is given in Fig 13.4 below:

Fig 13.4
The algorithm yields {1, 3, 4}, which is the optimal solution for this instance.
The total number of subsets the algorithm generates before adding extra elements is the sum of the binomial coefficients C(n, j) for j = 0, 1, …, k.
For each of those subsets, it needs O(n) time to determine the subset's possible extension.
Thus, the algorithm's efficiency is in O(k n^(k+1)).
