
Notes On Design and Analysis of Algorithms

NP-complete problems
NP-hard and NP-complete problems, basic concepts, non-deterministic algorithms, NP-hard
and NP-complete, Cook’s Theorem, decision and optimization problems, polynomial
reduction

1. Introduction
Can all computational problems be solved by a computer? There are computational problems
that cannot be solved by algorithms even with unlimited time. For example, the Turing halting
problem (given a program and an input, decide whether the program will eventually halt when run
on that input, or will run forever). Alan Turing proved that a general algorithm to solve the halting
problem for all possible program-input pairs cannot exist. A key part of the proof is that the Turing
machine was used as the mathematical definition of a computer and program. There are two
categories of problems:
 Solvable
 Unsolvable
A solvable problem can be solved by designing a suitable efficient algorithm. The most common
resources required during computation are time (how many steps it takes to solve a problem) and
space (how much memory it takes to solve a problem). These resource requirements determine the
time complexity and space complexity of an algorithm. The time complexity of an algorithm grows
in one of two ways:
 Polynomial: the time requirement grows as a polynomial function O(n^O(1)) of the size of the
input/output, e.g. O(1), O(log n), O(n), O(n log n), O(n^2), O(n^3), O(n^k)
 Exponential: the time requirement grows as an exponential function of the size of the
input/output, e.g. O(2^n), O(n!), O(n^n)
This is a real abuse of terminology. A function of the form k^n is genuinely exponential. But now
some functions which are worse than polynomial but not quite exponential, such as O(n^log n), are
also (incorrectly) called exponential. And some functions which are worse than exponential, such
as the super-exponentials, e.g. O(n^n), will also (incorrectly) be called exponential. A better word
than 'exponential' would be 'super-polynomial'.

Algorithm design techniques (greedy algorithms, dynamic programming, divide-and-conquer,
backtracking, etc.) are mechanisms for solving problems. Algorithms have been developed for finding
shortest paths and minimum spanning trees in graphs, matchings in bipartite graphs, maximum
increasing subsequences, maximum flows in networks, and so on. The following table shows the
time complexity of such algorithms and the nature of their growth functions.
Divide and Conquer
 Binary search: O(lg n), n = no. of elements in the list (Polynomial)
 Merge sort: O(n lg n), n = no. of elements in the list (Polynomial)
 Quick sort: O(n lg n) on average, n = no. of elements in the list (Polynomial)
Greedy Method
 Kruskal's algorithm for MST: O(E lg E), E = no. of edges in the graph (Polynomial)
 Prim's algorithm for MST: O(E lg V), E = no. of edges, V = no. of vertices in the graph (Polynomial)
 Dijkstra's algorithm for single-source shortest paths: O(V^2), V = no. of vertices (Polynomial)
Dynamic Programming
 Multistage graphs: O(V + E), E = no. of edges, V = no. of vertices (Polynomial)
 Floyd-Warshall algorithm for all-pairs shortest paths: O(n^3), n = no. of vertices (Polynomial)
 Optimal BST: O(n^3), n = no. of keys (Polynomial)
 Travelling salesman problem (TSP): O(n^2 2^n), n = no. of cities (Exponential)
Backtracking
 N-Queens problem: O(n!), n = no. of queens (Exponential)

In all these problems we are searching for a solution (sequence, path, tree, matching, etc.) from
among an exponential population of possibilities. Indeed, n boys can be matched with n girls in
n! different ways, a graph with n vertices has n^(n-2) spanning trees, and a typical graph has an
exponential number of paths from s to t. All these problems could in principle be solved in
exponential time by checking through all candidate solutions (in the state space tree), one by one.
But an algorithm whose running time is 2^n, or worse, is all but useless in practice. The quest for
efficient algorithms is about finding clever ways to bypass this process of exhaustive search,
using clues from the input in order to dramatically narrow down the search space.
The objective or goal of researchers is to design more efficient algorithms to solve a particular
problem. If an algorithm to solve a problem already exists, then improve it (for example, design a
polynomial-time algorithm if the existing one takes exponential time, or a linear-time algorithm
if the existing one is quadratic).

There are two types of algorithms


1. Deterministic algorithm: is an algorithm which, given a particular input, will always
produce the same output, with the underlying machine always passing through the same
sequence of states. Deterministic algorithms are by far the most studied and familiar kind of
algorithm, as well as one of the most practical, since they can run on real machines
efficiently. Formally, a deterministic algorithm computes a mathematical function; a function
has a unique value for any input in its domain, and the algorithm is a process that produces
this particular value as output.
For example: search for an element x in A[1:n], where n >= 1; on a successful search return the
index i such that A[i] equals x, otherwise return 0.
A deterministic algorithm for this problem:

Algorithm LinSrch(A, x, n)
{
    for i := 1 to n do
        if (A[i] == x) then return i;
    return 0;
}

This algorithm has time complexity O(n).

2. Non-deterministic algorithm: is an algorithm run on a non-deterministic Turing machine.
Even for the same input, it can exhibit different behaviors on different runs, as opposed to a
deterministic algorithm. An algorithm that solves a problem in nondeterministic polynomial
time can run in polynomial time or exponential time depending on the choices it makes
during execution. Nondeterministic algorithms are often used to find an approximation to
a solution when the exact solution would be too costly to obtain using a deterministic one.
Such an algorithm guesses a solution in a constant amount of time (O(1)). Right now we do
not know how to implement such a guess, hence it is an abstraction. If, in the future, we
develop concrete logic for it, the non-deterministic algorithm becomes a deterministic
algorithm.

The notion of a non-deterministic algorithm combines the abstraction of a state space with
the expressivity of a procedural programming language. We will use non-deterministic
algorithms only at the level of pseudo-code. There are a couple of programming languages,
such as Prolog, that support non-determinism but don't have ordinary programming
language operators. Some of the terms related to non-deterministic algorithms are defined
below:
 choice(X) or guess(X): chooses a value (randomly, or one satisfying some condition) from
the set X. This step gives its output in O(1) time. The algorithm can "magically" make a
choice that leads to success.
 failure(): denotes an unsuccessful computation.
 success(): the computation is successful and the current thread terminates.

For example: search for an element x in A[1:n], where n >= 1; on a successful search return an
index j such that A[j] equals x, otherwise return 0.
A non-deterministic algorithm for this problem:

    j := choice(1, n);
    if (A[j] == x) then
    {
        write(j);
        success();
    }
    write(0); failure();

This algorithm has time complexity O(1).
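
To make the guess-and-verify idea concrete, here is a minimal Python sketch (the function name
nondet_search and the list-based input are illustrative assumptions, not part of the notes). It
simulates choice() with a random pick, so a single run may miss x even when it is present; a true
non-deterministic machine is assumed to always guess a correct index if one exists.

import random

def nondet_search(A, x):
    # Simulate the non-deterministic search: guess an index, then verify.
    # A real non-deterministic machine would "magically" guess a correct j
    # in O(1); here the guess is just a random choice.
    n = len(A)
    j = random.randrange(n)       # plays the role of choice(1, n)
    if A[j] == x:                 # verification step
        return j + 1              # success(): 1-based index as in the notes
    return 0                      # failure()

print(nondet_search([3, 9, 7, 1], 7))   # prints 3 on a lucky run, 0 otherwise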

A nondeterministic algorithm usually has two phases, plus output steps. The first phase is the
guessing phase, which generates an arbitrary candidate solution (a certificate) for the given
instance. The second phase is the verifying phase, which returns true or false for the guessed
candidate. Nondeterministic algorithms are used in solving problems which allow multiple
outcomes. Every outcome the nondeterministic algorithm reports as successful is valid, regardless
of the choices made by the algorithm during execution.

If a deterministic algorithm represents a single path from an input to an outcome, a
nondeterministic algorithm represents a single starting point branching into many paths, some of
which may arrive at the same output and some of which may arrive at unique outputs.

Deterministic Algorithm vs. Non-deterministic Algorithm:
 For a particular input, a deterministic algorithm always gives the same output; a
non-deterministic algorithm may give a different output on different executions.
 A deterministic algorithm can solve the problem in polynomial time; a non-deterministic
algorithm can't solve the problem in polynomial time.
 A deterministic algorithm can determine the next step of execution; a non-deterministic
algorithm cannot, because there is more than one path it can take.

2. Classification of Problems
We can classify solvable problems into two broad classes:
 Tractable Problem: a problem that is solvable by a polynomial-time algorithm. The upper
bound is polynomial.
 Intractable Problem: a problem that cannot be solved by a polynomial-time algorithm.
The lower bound is exponential.
One might think that problems can be neatly divided into these two classes. But we have ignored
'gaps' between lower and upper bounds. Incredibly, there are problems for which the state of our
knowledge is such that the gap spans this coarse division into tractable and intractable. So, in
fact, there are three broad classes of problems:
 Problems with known polynomial-time algorithms.
 Problems that are provably intractable (proven to have no polynomial-time algorithm).
 Problems with no known polynomial-time algorithm, but not yet proven to be intractable.

Class P is the set of problems that can be solved by a deterministic algorithm in polynomial time.
More specifically, they are problems that can be solved in time O(nk) for some constant k, where
n is the size of the input to the problem.
Class NP is the set of problems that can be solved by a non-deterministic algorithm in polynomial
time, or equivalently the set of problems whose solutions can be verified in polynomial time.

Clearly, P ⊆ NP (If a problem is solvable in polynomial time its solutions are verifiable in
polynomial time). Informally, NP is the set of decision problems which can be solved in
polynomial time via a “Lucky Algorithm”, a magical algorithm that always makes a right guess
among the given set of choices.
 For example, in the 8-Queens (or n-Queens) problem the complexity of finding solutions is
exponential, but if some solution is given, then it is possible to verify its correctness
in polynomial time (by performing column and diagonal tests).
 For example, in the Hamiltonian-cycle problem, given a directed graph G = (V, E), a
certificate would be a sequence (v1, v2, v3, . . . , v|V|) of |V| vertices. It is easy to check in
polynomial time that (vi, vi+1) ∈ E for i = 1, 2, 3, …, |V| − 1 and that (v|V|, v1) ∈ E as well
(a verification sketch follows this list).
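
As an illustration of certificate checking, the sketch below verifies a Hamiltonian-cycle
certificate in polynomial time; the representation (an edge set of directed pairs) and the function
name are assumptions made here for the example.

def verify_ham_cycle(vertices, edges, cert):
    # cert is a proposed ordering (v1, ..., v|V|) of the vertices
    if sorted(cert) != sorted(vertices):          # every vertex exactly once
        return False
    n = len(cert)
    # check (v_i, v_{i+1}) in E for all i, and (v_n, v_1) in E as well
    return all((cert[i], cert[(i + 1) % n]) in edges for i in range(n))

V = [1, 2, 3, 4]
E = {(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)}
print(verify_ham_cycle(V, E, (1, 2, 3, 4)))   # True
print(verify_ham_cycle(V, E, (1, 3, 2, 4)))   # False: (3, 2) is not an edge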

A decision problem L is NP-complete if:


1. L is in NP (Any given solution for NP-complete problems can be verified quickly, but
there is no efficient known solution).
2. Every problem in NP is reducible to L in polynomial time (Reduction is defined below).

Following are some NP-Complete problems, for which no polynomial time algorithm is known.
1. Determining whether a graph has a Hamiltonian cycle
2. Determining whether a Boolean formula is satisfiable, etc.

If any NP-complete problem can be solved in polynomial time, then every problem in NP has a
polynomial time algorithm.
There is another set of problems which (as far as we know) cannot be solved in polynomial time
and whose solutions cannot even be verified in polynomial time.
A problem A is an NP-Hard problem if all the problems in NP are polynomially reducible to A. A
problem that is NP-Hard does not have to be in NP (NP-Hard requires property 2 mentioned above,
but not property 1). Therefore, the NP-Complete set is a subset of the NP-Hard set.
The group of problems which are both in NP and NP-hard are known as NP-Complete problems.

P: a solution can be found and verified in polynomial time.
NP: a solution can be found in exponential time but can be verified in polynomial time.
NP-Complete: a solution can be found in exponential time and verified in polynomial time, and
every problem in NP reduces to it in polynomial time.
NP-Hard: a solution can be found in exponential time but cannot (in general) be verified in
polynomial time.

Problem Type     Solvable in polynomial time     Verifiable in polynomial time
P                Yes                             Yes
NP               Yes or No                       Yes
NP-Complete      No                              Yes
NP-Hard          No                              Yes or No

A search problem or decision problem is specified by an algorithm C (a checking algorithm) that
takes two inputs, an instance I and a proposed solution S, and runs in time polynomial in |I|. We
say S is a solution to I if and only if C(I, S) = TRUE. The set of all inputs of a decision problem
A for which the answer is TRUE is the language LA. The goal of a decision problem A is to
identify the language LA, that is, for each input I, to verify whether I ∈ LA or I ∉ LA. The set NP
is the set of all problems for which their language can be verified with a polynomial-time
algorithm. The set Co-NP is the set of all problems for which their complement language (I ∉ LA)
can be verified with a polynomial-time algorithm. If the answer to a problem in co-NP is No, then
there is a proof of this fact that can be checked in polynomial time.
Any optimization problem is also a search problem in the sense that we are searching for a
solution that has the property of being optimal. The catch is that the solution to a search problem
should be easy to recognize, or polynomial-time checkable.
There is a convenient relationship between optimization problems and decision problems. We
usually can cast a given optimization problem as a related decision problem by imposing a bound
on the value to be optimized. In the TSP, for example, instead of asking for the cheapest tour we
ask whether there is a tour, a cycle that passes through every vertex exactly once, of total cost ≤ b.
The relationship between an optimization problem and its related decision problem works in our
favor when we try to show that the optimization problem is “hard.” That is because the decision
problem is in a sense “easier,” or at least “no harder”.

3. Satisfiability problem
SATISFIABILITY, or SAT, is a problem of great practical importance, with applications
ranging from chip testing and computer design to image analysis and software engineering. It is
also a canonical hard problem.
(x ∨ y ∨ z) ∧ (x ∨ ¬y) ∧ (y ∨ ¬z) ∧ (z ∨ ¬x) ∧ (¬x ∨ ¬y ∨ ¬z)
This is a Boolean formula in conjunctive normal form (CNF). It is a collection of clauses (the
parenthesized groups), each consisting of the disjunction (logical OR, denoted ∨) of several literals,
where a literal is either a Boolean variable (such as x) or the negation of one (such as ¬x).

A satisfying truth assignment is an assignment of false or true to each variable so that every
clause contains a literal whose value is true.
The SAT problem is: given a Boolean formula in conjunctive normal form, either find a
satisfying truth assignment or else report that none exists.
We can always search through all truth assignments, one by one, but for formulas with n
variables, the number of possible assignments is exponential, 2^n.
SAT is a typical search problem. We are given an instance I (a Boolean formula in conjunctive
normal form), and we are asked to find a solution S (an assignment that satisfies each clause). If
no such solution exists, we must say so.
To formalize the notion of quick checking, we will say that there is a polynomial-time algorithm
that takes as input I and S and decides whether or not S is a solution of I. For SAT, this is easy, as
it just involves checking whether the assignment specified by S indeed satisfies every clause in
I.
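
A minimal Python sketch of this check (the clause encoding, with a literal written as a
(variable, is_positive) pair, is an assumption made here for illustration):

def satisfies(cnf, assignment):
    # every clause must contain at least one literal whose value is true
    return all(any(assignment[var] == positive for var, positive in clause)
               for clause in cnf)

# (x ∨ y ∨ z) ∧ (x ∨ ¬y) ∧ (y ∨ ¬z) ∧ (z ∨ ¬x) ∧ (¬x ∨ ¬y ∨ ¬z)
phi = [[('x', True), ('y', True), ('z', True)],
       [('x', True), ('y', False)],
       [('y', True), ('z', False)],
       [('z', True), ('x', False)],
       [('x', False), ('y', False), ('z', False)]]
print(satisfies(phi, {'x': True, 'y': True, 'z': False}))   # False: (z ∨ ¬x) fails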
There are two natural variants of SAT for which we do have good algorithms. If all clauses
contain at most one positive literal, then the Boolean formula is called a Horn formula, and a
satisfying truth assignment, if one exists, can be found by a greedy algorithm. Alternatively, if
all clauses have only two literals, then graph theory comes into play, and 2SAT can be solved in
linear time by finding the strongly connected components of a particular graph constructed from
the instance.
On the other hand, if we are just a little more permissive and allow clauses to contain three
literals, then the resulting problem, known as 3SAT, once again becomes hard to solve!
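
The greedy idea for Horn formulas can be sketched as follows: start with every variable false, set
a variable to true only when some implication forces it, and then check the purely negative
clauses. The encoding below (implications as (body, head) pairs, negative clauses as sets of
variables) is an assumption made for this sketch.

def horn_sat(implications, negative_clauses, variables):
    # implications: (body, head) pairs encoding (AND of body) -> head;
    # an empty body means the clause simply asserts head.
    # negative_clauses: sets of variables encoding (¬v1 ∨ ¬v2 ∨ ...).
    assignment = {v: False for v in variables}
    changed = True
    while changed:                        # set to true only what is forced
        changed = False
        for body, head in implications:
            if all(assignment[v] for v in body) and not assignment[head]:
                assignment[head] = True
                changed = True
    for clause in negative_clauses:       # each must keep one false variable
        if all(assignment[v] for v in clause):
            return None                   # unsatisfiable
    return assignment

# (⇒ x) ∧ (x ∧ y ⇒ z) ∧ (x ⇒ y) ∧ (¬x ∨ ¬z)
print(horn_sat([(set(), 'x'), ({'x', 'y'}, 'z'), ({'x'}, 'y')],
               [{'x', 'z'}], ['x', 'y', 'z']))   # None: the formula is unsatisfiable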
Example: CNF (Conjunctive Normal Form) satisfiability problem:
A Boolean formula contains variables whose values are 0 or 1; Boolean connectives such as
∧ (AND), ∨ (OR), and ¬ (NOT); and parentheses. A Boolean formula is satisfiable if there
is some assignment of the values 0 and 1 to its variables that causes it to evaluate to 1.
Informally, a Boolean formula is in k-conjunctive normal form, or k-CNF, if it is the AND of
clauses of ORs of exactly k variables or their negations.
For example, the Boolean formula (x1 ∨ ¬x2) ∧ (¬x1 ∨ x3) ∧ (¬x2 ∨ ¬x3) is in 2-CNF,
having three clauses over three variables. Each clause is a disjunction of two literals.
(It has the satisfying assignment x1 = 1, x2 = 0, x3 = 1.)

Every Boolean formula can be represented as a CNF Boolean formula. A CNF formula φ with n
variables {x1, . . . , xn} and k clauses has the form φ = C1 ∧ C2 ∧ … ∧ Ck.
This problem is called SAT. SAT is an NP-Complete problem:
 SAT is an NP-Hard problem since there exists a polynomial-time reduction from any
problem in NP to SAT.
 SAT is an NP problem since it is possible to verify in polynomial time whether a TRUE/FALSE
assignment satisfies the formula.

Example: The traveling salesman problem:


In the traveling salesman problem (TSP) we are given n vertices 1, …, n and all n(n - 1)/2
distances between them, as well as a budget b. We are asked to find a tour, a cycle that passes
through every vertex exactly once, of total cost ≤ b, or to report that no such tour exists. That
is, we seek a permutation T(1), …, T(n) of the vertices such that when they are toured in this
order, the total distance covered is at most b:
d(T(1), T(2)) + d(T(2), T(3)) + … + d(T(n-1), T(n)) + d(T(n), T(1)) ≤ b
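
Checking a proposed tour against the budget is straightforward; the sketch below assumes the
distances are given as a dictionary keyed by unordered city pairs (this representation and the
function name are illustrative).

def is_tour_within_budget(d, tour, b):
    # d: dict mapping frozenset({i, j}) -> distance; tour: permutation of the cities
    cities = {v for pair in d for v in pair}
    if sorted(tour) != sorted(cities):            # visits every city exactly once
        return False
    n = len(tour)
    cost = sum(d[frozenset({tour[i], tour[(i + 1) % n]})] for i in range(n))
    return cost <= b

d = {frozenset({1, 2}): 2, frozenset({2, 3}): 3, frozenset({3, 4}): 2,
     frozenset({1, 4}): 3, frozenset({1, 3}): 5, frozenset({2, 4}): 5}
print(is_tour_within_budget(d, (1, 2, 3, 4), 10))   # True: 2 + 3 + 2 + 3 = 10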

The TSP is usually stated as an optimization problem, in which the shortest possible tour is sought.
But we have defined the TSP as a search problem: given an instance, find a tour within the budget
(or report that none exists). Turning an optimization problem into a search problem does not
change its difficulty at all, because the two versions reduce to one another. Any algorithm
that solves the optimization TSP also readily solves the search problem. Conversely, an
algorithm for the search problem can also be used to solve the optimization problem.
Given a potential solution to the TSP, it is easy to check the properties "is a tour" (just check
that each vertex is visited exactly once) and "has total length ≤ b". But how could one check
the property "is optimal"?
Contrast this with the minimum spanning tree (MST) problem, for which we do have efficient
algorithms. To phrase it as a search problem, we are again given a distance matrix and a bound b,
and are asked to find a tree T with total weight Σ(i,j)∈T d(i, j) ≤ b. The TSP can be thought of as a
tough cousin of the MST problem, in which the tree is not allowed to branch and is therefore
a path. This extra restriction on the structure of the tree results in a much harder problem.

There are different kinds of search problems in graphs, for example:


 Hamiltonian cycle: we want a cycle that goes through all vertices, without repeating any
vertex.
 Euler’s cycle: we want a cycle that goes through all edges, without repeating any edge.
These problems are ominously reminiscent of the TSP.

4. Polynomial-time reduction
A polynomial-time reduction is a method for solving one problem using another. If a
hypothetical subroutine solving the second problem exists, then the first problem can be solved
by transforming or reducing it to inputs for the second problem and calling the subroutine one or
more times. If both the time required to transform the first problem to the second, and the
number of times the subroutine is called is polynomial, then the first problem is polynomial-time
reducible to the second.
 For example: the problem of solving linear equations in an indeterminate x reduces to the
problem of solving quadratic equations. Given an instance ax + b = 0, we transform it to
0x^2 + ax + b = 0, whose solution provides a solution to ax + b = 0.
 For example: let us find the median of the data set (3, 4, 1, 5, 2). We reduce the problem to
sorting: arrange the data in increasing order using some polynomial-time algorithm, giving
{1, 2, 3, 4, 5}, and verify this sorted list using an algorithm that certifies it is sorted. If the
answer is yes, the median is the middle element of the sorted list, here 3. The reduction
generates the sorted list in polynomial time, which is verified as a sorted list in polynomial
time; hence the median is found in polynomial time by reduction (a small sketch follows
this list).
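
A tiny Python rendering of this reduction, under the assumptions above (odd-length data set,
sorting as the already-solved second problem; the function name is illustrative):

def median_by_reduction(data):
    ordered = sorted(data)                # transform: reduce median-finding to sorting
    # verify the transformed instance is indeed sorted (polynomial-time check)
    assert all(ordered[i] <= ordered[i + 1] for i in range(len(ordered) - 1))
    return ordered[len(ordered) // 2]     # read off the middle element

print(median_by_reduction([3, 4, 1, 5, 2]))   # 3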
A polynomial-time reduction proves that the first problem is no more difficult than the second
one, because whenever an efficient algorithm exists for the second problem, one exists for the
first problem as well. The notion of showing that one problem is no harder or no easier than
another applies even when both problems are decision problems.
Let us consider a decision problem, say A, which we would like to solve in polynomial time. We
call the input to a particular problem an instance of that problem. Now suppose that there is a
different decision problem, say B that we already know how to solve in polynomial time. Finally,
suppose that we have a procedure that transforms any instance α of A into some instance β of B
with the following characteristics:
a. The transformation takes polynomial time.
b. The answers are the same. That is, the answer for α is “yes” if and only if the answer for β
is also “yes.”

We call such a procedure a polynomial-time reduction algorithm. It provides us a way to solve
problem A in polynomial time:
1. Given an instance α of problem A, use a polynomial-time reduction algorithm to transform
it to an instance β of problem B.
2. Run the polynomial-time decision algorithm for B on the instance β.
3. Use the answer for β as the answer for α.
As long as each of these steps takes polynomial time, all three together do also, and so we have a
way to decide on α in polynomial time. In other words, by “reducing” solving problem A to
solving problem B, we use the “easiness” of B to prove the “easiness” of A. If any subroutine for
decision problem B can also be used to solve decision problem A, then we say A reduces to B (A ≤P
B).
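
The three steps can be written as a small generic skeleton; the names decide_A, transform, and
decide_B are placeholders assumed here for whatever polynomial-time procedures are available, and
the toy example reuses the linear/quadratic reduction mentioned earlier (solvability over the reals).

def decide_A(alpha, transform, decide_B):
    # step 1: map an instance of A to an instance of B in polynomial time
    beta = transform(alpha)
    # step 2: run the polynomial-time decision algorithm for B
    answer = decide_B(beta)
    # step 3: the answer for beta is also the answer for alpha
    return answer

# Toy example: A = "is ax + b = 0 solvable?", B = "is cx^2 + dx + e = 0 solvable?"
print(decide_A((2, 6),
               transform=lambda ab: (0, ab[0], ab[1]),
               decide_B=lambda cde: (cde[1] ** 2 - 4 * cde[0] * cde[2] >= 0)
               if cde[0] != 0 else (cde[1] != 0 or cde[2] == 0)))   # True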

A language L1 is polynomial-time reducible to a language L2, written L1 ≤P L2, if there exists a
polynomial-time computable function f : {0, 1}∗ → {0, 1}∗ such that for all x ∈ {0, 1}∗, x ∈ L1
if and only if f(x) ∈ L2.
The function f is called the reduction function, and a polynomial-time algorithm F that computes
f is called a reduction algorithm.

5. NP-complete problems
NP-complete problems arise in diverse domains: Boolean logic, graphs, arithmetic, network
design, sets and partitions, storage and retrieval, sequencing and scheduling, mathematical
programming, algebra and number theory, games and puzzles, automata and language theory,
program optimization, biology, chemistry, physics, and more.
A decision problem L is NP-complete if:
a. L is in NP
b. Every problem in NP is reducible to L in polynomial time means L is NP-hard.

No polynomial-time algorithm has yet been discovered for an NP-complete problem, nor has
anyone yet been able to prove that no polynomial-time algorithm can exist for any one of them.
The world is full of search problems, some of which can be solved efficiently, while others seem
to be very hard. This is depicted in the following table.
Hard Problem (NP-Complete) Easy Problem (in P)
3SAT 2SAT, HORN SAT
Travelling Salesperson Problem (TSP) Minimum Spanning Tree (MST)
Longest Path Shortest Path
Knapsack Unary Knapsack
Hamiltonian Cycle Euler Path
Integer Linear Programming Linear Programming
Balanced Cut Minimum Cut
Reduction translates one search problem into another. The problems on the left side of the table
are all, in some sense, exactly the same problem, except that they are stated in different
languages.
 Shortest vs. longest simple paths: Even in a graph with negative edge weights, we can find
shortest paths from a single source in a directed graph G = (V, E) in O(VE) time. Finding a
longest simple path between two vertices is difficult, however. Merely determining whether
a graph contains a simple path with at least a given number of edges is NP-complete.
 Euler tour vs. Hamiltonian cycle: An Euler tour of a connected, directed graph G = (V, E)
is a cycle that traverses each edge of G exactly once, although it is allowed to visit each
vertex more than once. We can determine whether a graph has an Euler tour in only O(E)
time. A Hamiltonian cycle of a directed graph G = (V, E) is a simple cycle that contains each
vertex in V. Determining whether a directed graph has a Hamiltonian cycle is NP-complete.
 2-CNF satisfiability vs. 3-CNF satisfiability: A Boolean formula is in k-conjunctive
normal form, or k-CNF, if it is the AND of clauses of ORs of exactly k variables or their
negations. For example, (x1 ∨ ¬x2) ∧ (¬x1 ∨ x3) ∧ (¬x2 ∨ ¬x3) is in 2-CNF, having
three clauses over three variables; each clause is a disjunction of two literals, and it has the
satisfying assignment x1 = 1, x2 = 0, x3 = 1. Although we can determine in polynomial
time whether a 2-CNF formula is satisfiable, determining whether a 3-CNF formula is
satisfiable is NP-complete.

6. Showing problems to be NP-complete


When we demonstrate that a problem is NP-complete, we are making a statement about how
hard it is (or at least how hard we think it is), rather than about how easy it is. We are not trying
to prove the existence of an efficient algorithm, but instead that no efficient algorithm is likely to
exist.
We rely on three key concepts in showing a problem to be NP-complete:
 NP-completeness applies directly not to optimization problems, however, but to decision
problems, in which the answer is simply “yes” or “no”. Although NP-complete problems
are confined to the realm of decision problems, we can take advantage of a convenient
relationship between optimization problems and decision problems. We usually can cast a
given optimization problem as a related decision problem by imposing a bound on the
value to be optimized.
 Polynomial reduction
 A first NP-complete problem: Because the technique of reduction relies on having a
problem already known to be NP-complete in order to prove a different problem NP-
complete, we need a "first" NP-complete problem. Normally SAT is considered the first
NP-complete problem.

In order to show that a problem L is NP-complete, do the following:
 Prove that L is in NP:
If a problem is in NP, then, given a 'certificate' (a solution) to the problem and an
instance of the problem, we will be able to verify (check whether the given solution is
correct or not) the certificate in polynomial time. Alternatively,
we can write a non-deterministic algorithm for L having polynomial time complexity.
 Prove that L is NP-Hard:
To prove that L is NP-hard, we take some problem which has already been proven to be
NP-hard and show that it can be reduced to problem L in polynomial time.

We shall use the reduction methodology to provide NP-completeness proofs for a variety of
problems drawn from graph theory and set partitioning.

I. The clique problem


A clique in an undirected graph G = (V, E) is a subset V′ ⊆ V of vertices, each pair of which is
connected by an edge in E. In other words, a clique is a complete subgraph of G. The size of a
clique is the number of vertices it contains. The clique problem is the optimization problem of
finding a clique of maximum size in a graph. As a decision problem, we ask simply whether a
clique of a given size k exists in the graph.
CLIQUE = {<G, k>: G is a graph with a clique of size k}.

[Figure: an example graph with one maximum clique, the triangle {1, 2, 5}, and four more
maximal cliques, the pairs {2, 3}, {3, 4}, {4, 5}, and {4, 6}.]

A naive algorithm for determining whether a graph G = (V, E) with |V| vertices has a clique of
size k is to list all k-subsets of V, and check each one to see whether it forms a clique. The
algorithm runs in super-polynomial time.
To prove that the clique problem is NP-complete
 Show that CLIQUE ∈ NP: for a given graph G = (V, E), we use the set V′ ⊆ V of vertices
in the clique as a certificate for G. Checking whether V′ is a clique can be accomplished
in polynomial time by checking whether, for each pair u, v ∈ V′, the edge (u, v) belongs
to E (a verification sketch follows this list).
 Prove that 3-CNF-SAT ≤P CLIQUE, which shows that the clique problem is NP-hard.
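
A sketch of the certificate check from the first bullet; the edge set below is one graph consistent
with the figure described earlier (the representation and names are assumptions for the example).

from itertools import combinations

def is_clique(edges, candidate):
    # every pair of vertices in the candidate set must be joined by an edge
    return all(frozenset({u, v}) in edges for u, v in combinations(candidate, 2))

E = {frozenset(p) for p in [(1, 2), (2, 5), (1, 5), (2, 3), (3, 4), (4, 5), (4, 6)]}
print(is_clique(E, {1, 2, 5}))   # True: the triangle from the figure
print(is_clique(E, {1, 2, 3}))   # False: edge (1, 3) is missing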
The reduction algorithm begins with an instance of 3-CNF-SAT. Let φ = C1 ∧ C2 ∧ · · · ∧ Ck be a
Boolean formula in 3-CNF with k clauses. For r = 1, 2. . . k, each clause Cr has exactly three
distinct literals lr1, lr2 and lr3. We shall construct a graph G such that φ is satisfiable if and only if
G has a clique of size k.
The graph G = (V, E) is constructed as follows. For each clause Cr = (lr1 ∨ lr2 ∨ lr3) in φ, we place
a triple of vertices vr1, vr2, and vr3 into V. We put an edge between two vertices vri and vsj if both
of the following hold:
 vri and vsj are in different triples, that is, r ≠ s, and
 their corresponding literals are consistent, that is, lri is not the negation of lsj
This graph can easily be computed from φ in polynomial time. As an example of this
construction, if we have
φ = (x1 ∨ ¬x2 ∨ ¬x3) ∧ (¬x1 ∨ x2 ∨ x3) ∧ (x1 ∨ x2 ∨ x3),
then G is the corresponding nine-vertex graph, with one triple of vertices per clause (a sketch of
the construction follows).
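
A sketch of this construction in Python; the clause and literal encodings are assumptions made
here, and vertices are tagged with their clause index so that literals in the same triple are never
joined.

from itertools import combinations

def cnf3_to_clique(clauses):
    # one vertex per literal occurrence: (clause index, position, literal)
    vertices = [(r, i, lit) for r, clause in enumerate(clauses)
                for i, lit in enumerate(clause)]
    edges = set()
    for (r, i, (v1, p1)), (s, j, (v2, p2)) in combinations(vertices, 2):
        # join literals from different triples that are not negations of each other
        if r != s and not (v1 == v2 and p1 != p2):
            edges.add(frozenset({(r, i, (v1, p1)), (s, j, (v2, p2))}))
    return vertices, edges

# phi = (x1 ∨ ¬x2 ∨ ¬x3) ∧ (¬x1 ∨ x2 ∨ x3) ∧ (x1 ∨ x2 ∨ x3)
phi = [[('x1', True), ('x2', False), ('x3', False)],
       [('x1', False), ('x2', True), ('x3', True)],
       [('x1', True), ('x2', True), ('x3', True)]]
V, E = cnf3_to_clique(phi)
print(len(V))   # 9 vertices; phi is satisfiable iff this graph has a clique of size 3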

We must show that this transformation of φ into G is a reduction. First, suppose that φ has a
satisfying assignment. Then each clause Cr contains at least one literal lri that is assigned 1, and
each such literal corresponds to a vertex vri. Picking one such "true" literal from each clause
yields a set V′ of k vertices. We claim that V′ is a clique.

[Figure: the formula φ is fed to the reduction algorithm, which outputs a graph <G>; an
algorithm that checks for a clique then answers Yes/No, which translates to True/False for φ.]
Conversely, suppose that G has a clique V′ of size k. No edges in G connect vertices in the same
triple, and so V′ contains exactly one vertex per triple. We can assign 1 to each literal lri such
that vri ∈ V′ without fear of assigning 1 to both a literal and its complement, since G contains no
edges between inconsistent literals. Each clause is satisfied, and so φ is satisfied.
In the example, a satisfying assignment of φ has x2 = 0 and x3 = 1 (or x2 = 1 and x3 = 0). A
corresponding clique of size k = 3 consists of the vertices corresponding to ¬x2 from the first
clause, x3 from the second clause, and x3 from the third clause. Because the clique contains no
vertices corresponding to either x1 or ¬x1, we can set x1 to either 0 or 1 in this satisfying
assignment.
We have reduced an arbitrary instance of 3-CNF-SAT to an instance of CLIQUE with a
particular structure. It might seem that we have shown only that CLIQUE is NP-hard in graphs in
which the vertices are restricted to occur in triples and in which there are no edges between
vertices in the same triple.

II. The vertex-cover problem


Formally, a vertex cover V′ of an undirected graph G=(V,E) is a subset of V such that (u, v) ∈ E
⇒ u ∈ V′ ∨ v ∈ V′, that is to say it is a set of vertices V′ where every edge has at least one endpoint
in the vertex cover V′. Such a set is said to cover the edges of G. The following figure shows two
examples of vertex covers, with some vertex cover V′ marked in red.

A minimum vertex cover is a vertex cover of smallest possible size. The vertex cover number k
is the size of a minimum vertex cover, i.e. k = |V′|. The following figure shows examples of
minimum vertex covers in the previous graphs.

The vertex-cover problem is to find a vertex cover of minimum size in a given graph. We wish to
determine whether a graph has a vertex cover of a given size k.
VERTEX-COVER = {<G, k>: graph G has a vertex cover of size k}.
To prove that this problem is NP-complete
 First show that VERTEX-COVER ∈ NP. Suppose we are given a graph G = (V, E) and
an integer k. The certificate we choose is the vertex cover V′ ⊆ V itself. The verification
algorithm affirms that |V′| = k, and then it checks, for each edge (u, v) ∈ E, that u ∈ V′ or
v ∈ V′. This verification can be performed straightforwardly in polynomial time (a
verification sketch follows this list).

 Prove that the vertex-cover problem is NP-hard by showing that CLIQUE ≤P VERTEX-
COVER.
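
A sketch of the verification algorithm from the first bullet (the edge-list representation and the
names are assumptions for the example):

def is_vertex_cover(edges, cover, k):
    # |cover| must equal k and every edge must have an endpoint in the cover
    return len(cover) == k and all(u in cover or v in cover for u, v in edges)

E = [(1, 2), (2, 3), (3, 4), (4, 1)]       # a 4-cycle
print(is_vertex_cover(E, {1, 3}, 2))       # True
print(is_vertex_cover(E, {1, 2}, 2))       # False: edge (3, 4) is uncovered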
The reduction is based on the notion of the "complement" of a graph. Given an undirected graph
G = (V, E), we define the complement of G as G′ = (V, E′), where E′ = {(u, v) : u, v ∈ V, u ≠ v,
and (u, v) ∉ E}. In other words, G′ is the graph containing exactly those edges that are not in G.

[Figure: an undirected graph G = (V, E) with clique V′ = {u, v, x, y}, and the complement graph
G′ produced by the reduction algorithm, which has vertex cover V − V′ = {w, z}.]

The reduction algorithm takes as input an instance <G, k> of the clique problem. It computes the
complement G′, which is easily done in polynomial time. The output of the reduction algorithm
is the instance <G′, |V| − k> of the vertex-cover problem.

[Figure: the instance <G, k> is fed to the reduction algorithm, which outputs <G′, |V| − k>; an
algorithm that checks for a vertex cover then answers Yes/No, which answers whether G has a
clique of size k.]
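
The reduction itself is easy to express in code; the sketch below builds the complement edge set
and the target cover size. The example is an illustrative six-vertex graph, loosely following the
figure described above, in which {u, v, x, y} is a clique (the representation and names are
assumptions made here).

from itertools import combinations

def clique_to_vertex_cover(vertices, edges, k):
    # complement the graph and ask for a vertex cover of size |V| - k
    complement = {frozenset({u, v}) for u, v in combinations(vertices, 2)
                  if frozenset({u, v}) not in edges}
    return complement, len(vertices) - k

V = ['u', 'v', 'w', 'x', 'y', 'z']
E = {frozenset(p) for p in [('u', 'v'), ('u', 'x'), ('u', 'y'), ('v', 'x'),
                            ('v', 'y'), ('x', 'y'), ('v', 'w'), ('w', 'x'),
                            ('w', 'z'), ('y', 'z')]}
E_complement, cover_size = clique_to_vertex_cover(V, E, 4)
print(cover_size)                  # 2: in the complement, {w, z} covers every edge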
Suppose that G has a clique V′ ⊆ V with |V′| = k. We claim that V − V′ is a vertex cover in G′.
Let (u, v) be any edge in E′. Then, (u, v) ∉ E, which implies that at least one of u or v does not
belong to V′, since every pair of vertices in V′ is connected by an edge of E. Equivalently, at
least one of u or v is in V − V′, which means that edge (u, v) is covered by V − V′. Since (u, v)
was chosen arbitrarily from E′, every edge of E′ is covered by a vertex in V − V′. Hence, the set
V − V′, which has size |V| − k, forms a vertex cover for G′.
Conversely, suppose that G′ has a vertex cover V′ ⊆ V, where |V′| = |V| − k. Then, for all u, v ∈
V, if (u, v) ∈ E′, then u ∈ V′ or v ∈ V′ or both. The contrapositive of this implication is that for
all u, v ∈ V, if u ∉ V′ and v ∉ V′, then (u, v) ∈ E. In other words, V − V′ is a clique, and it has
size |V| − |V′| = k.
Since the vertex-cover problem is thus NP-hard and it is also in NP, it is an NP-complete
problem.

III. The Hamiltonian-cycle problem


Hamiltonian cycle is a cycle that goes through all vertices, without repeating any vertex.
To prove that the Hamiltonian cycle problem is NP-complete
 First show that HAM-CYCLE belongs to NP. Given a graph G = (V, E), our certificate is
the sequence of |V| vertices that makes up the Hamiltonian cycle. The verification
algorithm checks that this sequence contains each vertex in V exactly once and that, with
the first vertex repeated at the end, it forms a cycle in G. That is, it checks that there is an
edge between each pair of consecutive vertices and between the first and last vertices.
This verification can be performed in polynomial time.
 Prove that VERTEX-COVER ≤P HAM-CYCLE, which shows that HAM-CYCLE is NP-
hard. Given an undirected graph G = (V, E) and an integer k, we construct an
undirected graph G′ = (V′, E′) that has a Hamiltonian cycle if and only if G has a vertex
cover of size k.

IV. The traveling-salesman problem


In the traveling-salesman problem, which is closely related to the Hamiltonian cycle problem, a
salesman must visit n cities. Modeling the problem as a complete graph with n vertices, we can
say that the salesman wishes to make a tour, or Hamiltonian cycle, visiting each city exactly once
and finishing at the city he starts from. There is an integer cost c(i, j ) to travel from city i to city
j , and the salesman wishes to make the tour whose total cost is minimum, where the total cost is
the sum of the individual costs along the edges of the tour.

[Figure: an example complete graph in which a minimum-cost tour is <u, w, v, x, u>, with
cost 7.]

The formal language for the corresponding decision problem is


TSP = {<G, c, k> : G = (V, E) is a complete graph, c is a function from V × V → Z, k ∈ Z, and
G has a traveling-salesman tour with cost at most k}.
To prove that the traveling-salesman problem is NP-complete
 First show that TSP belongs to NP. Given an instance of the problem, we use as a
certificate the sequence of n vertices in the tour. The verification algorithm checks that
this sequence contains each vertex exactly once, sums up the edge costs, and checks
whether the sum is at most k. This process can certainly be done in polynomial time.
 Prove that TSP is NP-hard by showing that HAM-CYCLE ≤P TSP. Let G = (V, E) be an
instance of HAM-CYCLE. We construct an instance of TSP as follows.

We form the complete graph G′ = (V, E′), where E′ = {(i, j) : i, j ∈ V and i ≠ j}, and we define
the cost function c by
c(i, j) = 0 if (i, j) ∈ E,
c(i, j) = 1 if (i, j) ∉ E.
(Note that because G is undirected, it has no self-loops, and so c(v, v) = 1 for all vertices v ∈ V.)
The instance of TSP is then <G′, c, 0>, which is easily formed in polynomial time.
We now show that graph G has a Hamiltonian cycle if and only if graph G′ has a tour of cost at
most 0. Suppose that graph G has a Hamiltonian cycle h. Each edge in h belongs to E and thus
has cost 0 in G′. Thus, h is a tour in G′ with cost 0.

Conversely, suppose that graph G′ has a tour h′ of cost at most 0. Since the costs of the edges in
E′ are 0 and 1, the cost of tour h′ is exactly 0 and each edge on the tour must have cost 0.
Therefore, h′ contains only edges in E. We conclude that h′ is a Hamiltonian cycle in graph G.
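
A sketch of this reduction; the graph representation (undirected edges as frozensets) and the
names are assumptions made here for illustration.

from itertools import combinations

def ham_cycle_to_tsp(vertices, edges):
    # complete graph with cost 0 on edges of G and cost 1 on non-edges;
    # G has a Hamiltonian cycle iff this TSP instance has a tour of cost <= 0
    cost = {frozenset({i, j}): (0 if frozenset({i, j}) in edges else 1)
            for i, j in combinations(vertices, 2)}
    return cost, 0                                  # the instance <G', c, 0>

V = [1, 2, 3, 4]
E = {frozenset(p) for p in [(1, 2), (2, 3), (3, 4), (4, 1)]}
cost, budget = ham_cycle_to_tsp(V, E)
tour = (1, 2, 3, 4)                                 # a Hamiltonian cycle of G
tour_cost = sum(cost[frozenset({tour[i], tour[(i + 1) % 4]})] for i in range(4))
print(tour_cost <= budget)                          # True: the cycle has cost 0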

7. Cook-Levin Theorem
If SAT is in P, then every problem in NP is also in P; i.e., if SAT is in P, then P = NP.
Intuitively, SAT must be one of the most difficult problems in NP. We call SAT an NP-complete
problem (one of the most difficult problems in NP).
The proofs given by Cook and Levin are a bit complicated, because intuitively they need to show
that no problem in NP can be more difficult than SAT. However, since Cook and Levin, many
other problems in NP have been shown to be NP-complete.

The class of “NP-complete” problems have the surprising property that if any NP-complete
problem can be solved in polynomial time, then every problem in NP has a polynomial-time
solution, that is, P = NP. Despite years of study, though, no polynomial-time algorithm has ever
been discovered for any NP-complete problem.
In computational complexity theory, the Cook–Levin theorem, also known as Cook's theorem,
states that the Boolean satisfiability problem is NP-complete. That is, any problem in NP can be
reduced in polynomial time by a deterministic Turing machine to the problem of determining
whether a Boolean formula is satisfiable. An important consequence of this theorem is that if
there exists a deterministic polynomial time algorithm for solving Boolean satisfiability, then
every NP problem can be solved by a deterministic polynomial time algorithm.

All proofs ultimately follow by reduction from the NP-completeness of CIRCUIT-SAT.

Exercise:
1. Show that P is closed under union, concatenation, and complement.
2. Show that NP is closed under union and concatenation.
3. Prove that Hamiltonian Path is NP-Complete.
4. Construct a nondeterministic algorithm computing the MST for an input graph (pseudocode).
For phase 2 assume that the minimum cost is known in advance. What is the complexity of
this algorithm?

5. Construct a nondeterministic algorithm computing the shortest path from a node u to a node v
in an input graph (pseudocode). For phase 2 assume that the minimum cost is known in
advance. What is the complexity of this algorithm?
6. Assuming P ≠ NP, which of the following statements are true or false? Why?
A. NP-complete = NP
B. NP-complete ∩ P = ∅
C. NP-hard = NP
D. P = NP-complete
7. The subgraph-isomorphism problem takes two undirected graphs G1 and G2, and it asks
whether G1 is isomorphic to a subgraph of G2. Show that the subgraph isomorphism problem
is NP-complete.
8. The longest-simple-cycle problem is the problem of determining a simple cycle (no repeated vertices)
of maximum length in a graph. Formulate a related decision problem, and show that the decision
problem is NP-complete.
9. A subset of the nodes of a graph G is a dominating set if every other node of G is adjacent to some
node in the subset. Let DOMINATING-SET = {⟨G, k⟩| G has a dominating set with k nodes}. Show
that it is NP-complete by giving a reduction from VERTEX-COVER.

NOTES
The status of NP-complete problems is another unresolved story: NP-complete problems are
problems whose status is unknown. No polynomial-time algorithm has yet been discovered for any
NP-complete problem, nor has anybody yet been able to prove that no polynomial-time algorithm
exists for any of them. The interesting part is that if any one of the NP-complete problems can be
solved in polynomial time, then all of them can be solved in polynomial time.

In 1798, the British philosopher T. Robert Malthus published an essay in which he predicted that
the exponential growth (he called it “geometric growth”) of the human population would soon
deplete linearly growing resources.
In 1965, computer chip pioneer Gordon E. Moore noticed that transistor density in chips had
doubled every year in the early 1960s, and he predicted that this trend would continue. This
prediction, moderated to double every 18 months and extended to computer speed, is known as
Moore's law. It has held remarkably well for 40 years.
It would appear that Moore's law provides a disincentive for developing polynomial algorithms.
After all, if an algorithm is exponential, why not wait it out until Moore's law makes it feasible?
But in reality, the exact opposite happens: Moore's law is a huge incentive for developing
efficient algorithms, because such algorithms are needed in order to take advantage of the
exponential increase in computer speed.

Every optimization problem has its own search space. The number of decision variables in the
problem decides the dimension of the search space. The shape of the search space is decided by
type and values of decision variables. The search space is bounded by constraints on the problem
and bounds of decision variables.
There may or may not exist a solution for a problem. If a solution exists, then it can be found in
polynomial or exponential time. The obtained solution may or may not be verifiable (for
correctness) in polynomial time.

By Dr. M. M. Raghuwanshi, YCCE,Nagpur
