
Sorting Techniques
Bubble Sort: Strategy and Steps

Pass 1
1. Compare the 1st element with the 2nd element. If the 1st element > 2nd element, exchange them.
2. Compare the 2nd element with the 3rd element. If the 2nd element > 3rd element, exchange them.
3. Compare the 3rd element with the 4th element. If the 3rd element > 4th element, exchange them.
...
n-1. Compare the (n-1)th element with the nth element. If the (n-1)th element > nth element, exchange them.
In the first pass, n-1 comparisons are made, as a result of which the largest element bubbles out to the end.
Pass 2
Repeat Pass 1 with one less comparison; that is, n-2 comparisons are carried out in this pass. The next largest element bubbles out in this step.
Pass 3
Repeat Pass 1 with two fewer comparisons; that is, n-3 comparisons are made in this pass.
Pass n-1
In the last pass, only one comparison is made: compare the 1st element with the 2nd element.
Bubble-Sort Algorithm (Array: A, Size: n)
Step 1. Input list A
Step 2. Repeat Steps 3 & 4 for I = 1 to n-1
Step 3.   Set J = 1
Step 4.   Repeat while J <= n-I
            i.  If A[J] > A[J+1] then
                  Exchange A[J] and A[J+1]
                (End if)
            ii. Set J = J + 1
          (End of inner loop)
        (End of outer loop)
Step 5. Exit
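As a concrete illustration, here is a minimal runnable Python sketch of the same algorithm (the function name and the in-place style are our own choices, not part of the original notes):

def bubble_sort(a):
    """Sort the list a in place using bubble sort."""
    n = len(a)
    for i in range(1, n):            # passes 1 .. n-1
        for j in range(0, n - i):    # pass i makes n-i comparisons
            if a[j] > a[j + 1]:      # adjacent pair out of order: exchange
                a[j], a[j + 1] = a[j + 1], a[j]

data = [5, 1, 4, 2, 8]
bubble_sort(data)
print(data)    # [1, 2, 4, 5, 8]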
Time Complexity of Bubble Sort Algorithm
The running time/cost of the Bubble Sort algorithm depends on the number of iterations carried out in arranging the elements of the array. Assume each iteration costs c = O(1).
Pass      Number of iterations
1         n-1
2         n-2
3         n-3
...       ...
n-2       2
n-1       1
Cost = total number of iterations
     = (n-1) + (n-2) + (n-3) + ... + 2 + 1
     = n(n-1)/2
     = (n² - n)/2
     = n²/2 - n/2
Total cost = cost per iteration × total number of iterations
           = O(1) × (n²/2 - n/2)
f(n) = O(n²)
Best Case: The input list is already in sorted order. In this case, the comparisons in the inner loop of the bubble sort algorithm are still made, but the condition for exchanging items remains false every time. The running time is therefore reduced only by a constant factor, which we neglect; hence the cost in the best case = O(n²).
Worst Case: The bubble sort algorithm runs in its worst case if the input list is in reverse-sorted order. In this case, the condition for exchanging elements in the inner loop remains true every time. So the running time/cost = O(n²).
Average Case: The bubble sort algorithm runs in its average case on an unsorted (random) input list. The condition inside the inner loop is true for some iterations and false for others. When the condition is false, no exchange cost is incurred, so the cost is slightly lower than in the worst case, but the running time is still O(n²).
Quick Sort Algorithm / Partition-Exchange Sort: Quick sort, or partition-exchange sort, is a sorting algorithm developed by Tony Hoare that, on average, makes O(n log n) comparisons to sort n items. Quick sort is one of the best sorting algorithms because it is remarkably efficient on average, with an average cost of O(n log n). It also has the advantage of sorting in place, and it works well even in a virtual-memory environment.
Strategy: Quicksort is a divide-and-conquer sorting algorithm in which the division of the list is carried out dynamically. The three steps of the Quicksort algorithm are as follows:
Divide: Take an element as the pivot and logically divide the array into two sub arrays such that each element in the left sub array is less than the pivot and each element in the right sub array is greater than the pivot.
Conquer: Repeat the process recursively for the left and right sub arrays.
Combine: Since the sub arrays are sorted in place, no cost is needed to combine them.
Steps
1. Select an element A[q] as the pivot from the array A[p…r]. The leftmost or rightmost element of the array is usually used as the pivot.
2. Scan the array from right to left and flush to the right all keys that are ≥ pivot.
3. Scan the array from left to right and flush to the left all keys that are ≤ pivot.
4. Arrange the elements around the pivot so that the array splits into two sub arrays A[p…q-1] and A[q+1…r] such that
   A[p…q-1] ≤ pivot ≤ A[q+1…r]
5. Recursively repeat the steps for each of the sub arrays A[p…q-1] and A[q+1…r].
Algorithm: Quicksort (Array: A, p, r)
1. If p ≥ r then return
2. q = Partition(A, p, r)
3. Quicksort(A, p, q-1)
4. Quicksort(A, q+1, r)

Partition(A, p, r)
    Piv = A[p]
    j = p
    For i = p+1 to r do
        If A[i] ≤ Piv then
            j = j + 1
            Exchange A[i] and A[j]
        (End if)
    (End of loop)
    Exchange A[p] and A[j]
    Return j
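The same algorithm as a minimal runnable Python sketch (the function names follow the pseudocode above):

def partition(a, p, r):
    """Partition a[p..r] around the pivot a[p]; return the pivot's final index."""
    piv = a[p]
    j = p
    for i in range(p + 1, r + 1):
        if a[i] <= piv:          # a[i] belongs in the left (<= pivot) region
            j += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[j] = a[j], a[p]      # place the pivot between the two regions
    return j

def quicksort(a, p, r):
    if p >= r:
        return
    q = partition(a, p, r)
    quicksort(a, p, q - 1)
    quicksort(a, q + 1, r)

data = [9, 3, 7, 1, 8, 2]
quicksort(data, 0, len(data) - 1)
print(data)    # [1, 2, 3, 7, 8, 9]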
Analysis of Quick Sort Algorithm
Worst Case
The quick sort algorithm runs in its worst case if the original list is already sorted or reverse sorted. In this case, the partitioning routine produces one sub list of n-1 elements, the next call to the partition routine produces a sub list of n-2 elements, and so on.
[Recursion tree for the worst case: each partitioning step splits off a single element, producing sub lists of sizes n, n-1, n-2, ..., 1 down the levels of the tree.]
The cost of the algorithm is expressed by the recurrence
T(n) = T(n-1) + O(n)
or T(n) = T(n-1) + n
To find the cost, we solve this recurrence using the iteration method (bounding the cost term at each level by n):
T(n) = T(n-1) + n
     ≤ [T(n-2) + n] + n  = T(n-2) + 2n
     ≤ [T(n-3) + n] + 2n = T(n-3) + 3n
     ≤ [T(n-4) + n] + 3n = T(n-4) + 4n
     ...
     ≤ T(n-(n-1)) + (n-1)n
     = T(1) + n² - n
     = n² - n + 1 = O(n²)
Hence the Quick sort algorithm runs in O(n²) in the worst case.
Best Case
The Quick sort algorithm runs in its best case if the partition routine always produces a balanced partition, i.e., each sub list is of size n/2.
The cost is expressed by the recurrence T(n) = 2T(n/2) + n.
Using the Master method: here a = 2, b = 2, f(n) = n.
Writing f(n) in the form O(n^c log^k n) = n and comparing, we get c = 1, k = 0.
Now log_b a = log_2 2 = 1, so c = log_b a.
This is the 2nd case of the Master theorem, which states that if f(n) = O(n^c log^k n) for some constant k ≥ 0, where c = log_b a, then
T(n) = O(n^c log^(k+1) n)
So T(n) = O(n^1 log^(0+1) n) = O(n log n). Hence in the best case, the quick sort algorithm runs in O(n log n).
Average Case
The Quick sort algorithm runs in its average case when it works on a random input array. In this case, we expect some of the splits to be reasonably well balanced and some to be fairly unbalanced.
If the recurrence in this case is represented by a tree, the good and bad splits are distributed randomly throughout the tree, with good and bad splits at alternating levels.
[Recursion tree fragment: a bad split of n into sub lists of sizes 1 and n-1, followed by a good split of n-1 into two sub lists of size (n-1)/2 each.]
Their combined cost = 1 + (n-1) + (n-1)/2
                    = n + (n-1)/2
                    = (2n + n - 1)/2
                    = (3n - 1)/2
                    = O(n)
Thus a bad split followed by a good split costs O(n), the same order as a single good split, so the average-case running time remains O(n log n).
Implementation Issues
Choice of pivot: In very early versions of quicksort, the leftmost or rightmost element of the partition was often chosen as the pivot element. Unfortunately, this causes worst-case behavior on already sorted arrays, which is a rather common use case.
Solution: The problem can be solved by choosing a random index for the pivot, choosing the middle index of the partition (especially for longer partitions), or choosing the median of the first, middle, and last elements of the partition as the pivot.
This "median of three" rule counters the case of sorted (or reverse-sorted) input and gives a better estimate of the optimal pivot (the true median) than selecting any single element, when no information about the ordering of the input is known.
Many repeated elements: Even with a partitioning algorithm that chooses good pivot values, quicksort exhibits poor performance on inputs that contain many repeated elements.
Solution: To solve this problem, an alternative linear-time partition routine can be used that separates the values into three groups: values less than the pivot, values equal to the pivot, and values greater than the pivot (Bentley and McIlroy call this a "fat partition").
The values equal to the pivot are already sorted, so only the less-than and greater-than partitions need to be recursively sorted, as a result of which the algorithm runs in its average case. For repeated elements, the pseudocode quicksort algorithm becomes:
function quicksort(A, lo, hi)
    if lo < hi
        p = pivot(A, lo, hi)
        left, right = partition(A, p, lo, hi)   // note: multiple return values
        quicksort(A, lo, left)
        quicksort(A, right, hi)

The best case for the algorithm now occurs when all elements are equal. In the case of all
equal elements, the modified quicksort will perform at most two recursive calls on empty
subarrays and thus finish in linear time.
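A minimal runnable Python sketch of such a three-way ("fat") partition quicksort; the middle-element pivot choice and the names are illustrative assumptions, not prescribed by the notes:

def quicksort3(a, lo, hi):
    """Quicksort with a three-way partition: efficient on many repeated keys."""
    if lo >= hi:
        return
    piv = a[(lo + hi) // 2]
    left, i, right = lo, lo, hi    # invariant: a[lo..left-1] < piv, a[right+1..hi] > piv
    while i <= right:
        if a[i] < piv:
            a[i], a[left] = a[left], a[i]
            left += 1
            i += 1
        elif a[i] > piv:
            a[i], a[right] = a[right], a[i]
            right -= 1             # the swapped-in element is still unexamined
        else:
            i += 1                 # equal to the pivot: leave it in the middle
    quicksort3(a, lo, left - 1)    # recurse only on the < and > groups
    quicksort3(a, right + 1, hi)

data = [4, 2, 4, 4, 1, 4, 3]
quicksort3(data, 0, len(data) - 1)
print(data)    # [1, 2, 3, 4, 4, 4, 4]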

Randomized Quick Sort Algorithm: A non-deterministic algorithm is an algorithm that exhibits different behavior on different runs for the same input. Quick sort becomes a randomized algorithm when the pivot is chosen at random at each partitioning step; different runs on the same input may then follow different sequences of partitions, and the algorithm achieves its average-case O(n log n) behavior regardless of the input ordering.
About Quick Sort
Quick sort is not a stable sort. Quicksort can be implemented with an in-place partitioning algorithm, so the entire sort can be done with only O(log n) additional space used by the stack during the recursion; its space complexity is therefore O(log n).
Minimum Spanning Tree
Spanning tree: A spanning tree of a connected undirected graph G = (V, E) is a connected sub-graph that contains every vertex of G but has no cycles.
Minimum Spanning Tree: A minimum spanning tree is a spanning tree of a connected, undirected weighted graph whose weight is less than or equal to the weight of every other spanning tree.
Example: Consider a weighted graph and its minimum spanning tree. [Figures omitted in this copy.]
Weighted Graph
A weighted graph is a graph in which each edge has a weight (some real number).
Weight of a Graph
The sum of the weights of all edges.
Prim's Algorithm
It is a greedy algorithm: starting from an arbitrary vertex, it repeatedly adds the minimum-weight edge connecting the tree built so far to a vertex outside it.
Time Complexity of Prim's Algorithm
The time complexity of Prim's algorithm depends on the data structures used for the graph and for ordering the edges by weight, which can be done using a priority queue. The following table shows the typical choices:

Minimum edge weight data structure    Worst-case running time
Adjacency matrix                      O(|V|²)
Binary heap and adjacency list        O((|V| + |E|) log |V|) = O(|E| log |V|)
Fibonacci heap and adjacency list     O(|E| + |V| log |V|)

(Here the binary heap stores the edges of the input graph, ordered by their weight.)
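A short illustrative Python sketch of Prim's algorithm using a binary heap (Python's heapq); the adjacency-list format, vertex -> list of (weight, neighbor) pairs, is an assumption made for this example:

import heapq

def prim(graph, start):
    """Return (total weight, edge list) of a minimum spanning tree.
    graph: dict mapping each vertex to a list of (weight, neighbor) pairs."""
    visited = {start}
    heap = [(w, start, v) for w, v in graph[start]]   # candidate edges out of the tree
    heapq.heapify(heap)
    total, edges = 0, []
    while heap and len(visited) < len(graph):
        w, u, v = heapq.heappop(heap)                 # lightest edge leaving the tree
        if v in visited:
            continue
        visited.add(v)
        total += w
        edges.append((u, v, w))
        for w2, x in graph[v]:
            if x not in visited:
                heapq.heappush(heap, (w2, v, x))
    return total, edges

g = {  # undirected graph: each edge listed in both directions
    'a': [(1, 'b'), (4, 'c')],
    'b': [(1, 'a'), (2, 'c'), (5, 'd')],
    'c': [(4, 'a'), (2, 'b'), (3, 'd')],
    'd': [(5, 'b'), (3, 'c')],
}
print(prim(g, 'a'))    # (6, [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 3)])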
Greedy Algorithm: An optimization problem is one in which many solutions to the problem exist and we want to find the optimal (best) solution among them.
Like dynamic programming, the greedy approach is used to solve optimization problems. A greedy algorithm makes the locally optimal choice at each step in the hope of reaching a globally optimal solution; while making a choice, it does not consider future consequences. Greedy algorithms do not always yield optimal solutions, but for many problems they do. A greedy algorithm works in phases: at each phase it picks what looks best at the moment, without regard for future consequences, trusting that a local optimum at each step will yield a globally optimal solution.
There is no general way to tell whether a problem is suitable for the greedy method, but problems having the following two ingredients usually are.
The Greedy-Choice Property: A globally optimal solution can be arrived at by making a locally optimal (greedy) choice.
This property is where greedy algorithms differ from dynamic programming: the greedy strategy usually progresses in a top-down fashion, making one greedy choice after another and iteratively reducing each given problem instance to a smaller one.
Optimal Substructure: This property is the key ingredient for the applicability of the greedy method to a problem. A problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solutions to sub problems. This property is exploited by both greedy algorithms and dynamic programming.
0-1 Knapsack Problem: A thief robbing a store finds n items. The ith item is worth vi dollars and weighs wi pounds. He wants to take as valuable a load as possible, but he can carry at most H pounds in his knapsack. Which items should he take?
If he must take each item as a whole or leave it, then it is called the 0-1 knapsack problem, where 0 means leaving the item and 1 means taking the item as a whole.
Fractional Knapsack Problem
The setup is the same as for the 0-1 knapsack, but the thief can take fractions of items rather than having to make a binary (0-1) choice for each item. Although the problems are similar, the fractional knapsack problem is solvable by a greedy strategy, whereas the 0-1 problem is not.
Solution: To solve the fractional problem, we first compute the value per pound, vi/wi, for each item. Obeying the greedy strategy, the thief begins by taking as much as possible of the item with the greatest value per pound. If the supply of that item is exhausted and he can still carry more, he takes the next most valuable item per pound, and so forth until he can carry no more.
Sorting the items by value per pound, the greedy algorithm runs in O(n log n). (A code sketch appears after the worked example below.)
Q. Show by example that the greedy strategy does not work for the 0-1 knapsack problem.
Consider three items such that item1 weighs 10 pounds and is worth $60, item2 weighs 20 pounds and is worth $100, and item3 weighs 30 pounds and is worth $120. Suppose the knapsack capacity = 50 pounds.
Value per pound = vi/wi:
Item1: 60/10 = $6 per pound
Item2: 100/20 = $5 per pound
Item3: 120/30 = $4 per pound
Obeying the greedy strategy, the thief begins by taking as much as possible of the item with the greatest value per pound, then moves on to the next most valuable item per pound, until he can carry no more.
The greedy strategy takes item1 first because it has the greatest value per pound, then item2; but after taking items 1 and 2 there is no space left in the knapsack for item3. So greedy carries a total weight of 30 pounds worth $160.
However, the optimal solution is to leave item1 and take items 2 and 3, worth $220. Every solution that includes item1 is suboptimal, so the greedy strategy fails to give the optimal solution in this case.
For the fractional knapsack, greedy first takes all of item1 (the greatest value per pound, $6), then all of item2, and then a 20-pound fraction of item3. Thus greedy carries 50 pounds in the knapsack with the maximum possible value of $240.
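A minimal Python sketch of the greedy fractional-knapsack strategy, run on the three-item instance above (the function name is illustrative):

def fractional_knapsack(items, capacity):
    """items: list of (value, weight). Returns the maximum value carried."""
    # Greedy: consider items in decreasing order of value per pound.
    items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    total = 0.0
    for value, weight in items:
        if capacity <= 0:
            break
        take = min(weight, capacity)    # whole item, or the fraction that fits
        total += value * take / weight
        capacity -= take
    return total

items = [(60, 10), (100, 20), (120, 30)]   # (value $, weight lb)
print(fractional_knapsack(items, 50))      # 240.0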
Graph Theory
The tree data structure represents one-to-many relationships. In real life we deal with problems that are described by many-to-many relationships. Such problems can be described by the graph data structure.
• A graph is a non-linear data structure made up of a set of nodes and lines, where the nodes are called vertices or points and the lines are called edges or arcs.
• Each edge e is identified by a unique unordered pair [u, v], i.e., e = [u, v], where u and v denote the endpoints of edge e. u and v are also called adjacent nodes or neighbors.
• The degree of a node u, deg(u), is the number of edges containing u.
• If deg(u) = 0, i.e., u does not belong to any edge, then u is called an isolated node.
Types of Graph
Undirected Graph: A graph whose edges have no direction is called an undirected graph. This kind of graph is also called an undigraph.
Directed Graph: A graph G is said to be a directed graph if each edge in G is assigned a direction. This graph is also called a digraph. Each edge e in a directed graph is identified by an ordered pair (u, v).
The outdegree of a node u in G, outdeg(u), is the number of edges beginning at u. Similarly, the indegree of u, indeg(u), is the number of edges ending at u.
A node u is called a source if it has a positive outdegree but zero indegree. Similarly, a node u is called a sink if it has a zero outdegree but a positive indegree.
Complete Graph: A connected graph G is said to be complete if every node u in G is adjacent to every other node v in G. A complete graph with n nodes has n(n-1)/2 edges.
Tree Graph: A connected graph T without any cycles is called a tree graph, a free tree, or simply a tree. There is a unique simple path between any two nodes u and v in a tree graph. A finite tree with m nodes has m-1 edges.
Weighted Graph: A graph G is said to be weighted if each edge e is assigned a (positive numeric) weight. Each edge in a weighted graph is represented by e = [u, v, w]. The weight of a path in the graph is the sum of the weights of the edges along the path.
Regular Graph: A graph is said to be regular if each node of the graph has the same degree; in a directed graph, each node has equal indegree and outdegree.
Isomorphic Graphs: Two graphs are said to be isomorphic if they have the same behavior in terms of graph properties. The conditions for isomorphism are: the two graphs must have the same number of nodes; they must have the same number of edges; and all corresponding nodes of the two graphs must have the same indegree and outdegree.
Representation of Graph in Memory
Adjacency-Matrix Representation
A directed graph having n nodes is represented by an n x n matrix A such that
aij = 1 if there is an edge from vi to vj
aij = 0 otherwise
Suppose we have a directed graph with four vertices. [Figure omitted in this copy.]
Undirected Graph
An undirected graph G having n nodes is represented by an n x n adjacency matrix A such that
aij = 1 if there is an edge between vi and vj
aij = 0 otherwise
The adjacency matrix of an undirected graph G is a symmetric matrix, i.e., aij = aji for every i and j.
Suppose we have an undirected graph with four vertices. [Figure omitted in this copy.]

A matrix can be represented in memory by a 2-dimensional array. A weighted graph may be represented by using the edge weight as the matrix entry.
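For illustration, here is a small Python sketch that builds an adjacency matrix for a hypothetical 4-vertex directed graph (the edge list is invented for the example):

# Hypothetical directed graph on vertices 0..3 with edges (u, v).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 0)]
n = 4

# Build the n x n adjacency matrix: a[i][j] = 1 iff there is an edge i -> j.
a = [[0] * n for _ in range(n)]
for u, v in edges:
    a[u][v] = 1
    # For an undirected graph we would also set a[v][u] = 1 (symmetric matrix).
    # For a weighted graph, store the edge weight instead of 1.

for row in a:
    print(row)
# [0, 1, 1, 0]
# [0, 0, 1, 0]
# [0, 0, 0, 1]
# [1, 0, 0, 0]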
