Implementation of Sorting Algorithms

IMPLEMENTATION OF
SORTING ALGORITHMS
BY
ERINFOLAMI DANIEL OLUWATOSIN

(074617)
BEING A PROJECT SUBMITTED TO THE

DEPARTMENT OF MATHEMATICAL SCIENCE
(COMPUTER SCIENCE OPTION)
FACULTY OF SCIENCE,
EKITI STATE UNIVERSITY, ADO EKITI, NIGERIA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE AWARD OF
BACHELOR OF SCIENCE (B.SC) DEGREE IN COMPUTER SCIENCE
MARCH, 2012
i
CERTIFICATION
This is to certify that this work was carried out by ERINFOLAMI, Daniel
Oluwatosin with matriculation number 074617 of the Department of
Mathematical Science (Computer Science), Faculty of Science, Ekiti State
University, Ado-Ekiti,
____________________ _____________________
Mr. A.A. Obayomi DATE
PROJECT SUPERVISOR
____________________ _____________________
Dr. R.A. Adeleke DATE
HEAD OF DEPARTMENT
____________________ _____________________
External Examiner DATE
ii
DEDICATION
This project is dedicated to Almighty God for his guidance, protection and
provision throughout my course of study in Ekiti State University and also
to my family and friends for the encouragement, moral and financial support
given to me.
iii
ACKNOWLEDGEMENTS
A Big thanks to Almighty God the maker of Heaven and Earth who
has always been my Guide, my mentor, my Provider who has been
protecting me up till now, I owe him everything.
To my able Supervisor Mr. A.A Obayomi who is always being there
for me like a father to his son, who took time in going through this project
manuscript and making constructive criticisms, useful advice and
corrections for the success of this project, I say a big thank you.
I also appreciate the H.O.D of the department Dr R.A Adeleke, my
level coordinator, Mr. Ekong, and other lecturers in the department.
To my parents, Pastor & Mrs. Erinfolami who has always being there
for me, spiritually, financially and morally, you gave me the best gift of life
by sponsoring my education; I love you so much.
There are many to whom I owe a debt of gratitude for their valuable
suggestions, love and wonderful support all the way, More importantly Mr.
Abimbola Akande and a special crystal, Olutolami Falaiye. Also to my
siblings, I say thank you for your love and support, May God reward you
bountifully. I love you so much, thanks for being there. I cannot forget my
iv
friends; Akinwale Emmanuel Michael Falade, Samuel, Cecilia, and the
members of CAC Victory Assembly, EKSU, who all have been so loving and
have been such good people to me, it’s been a privilege meeting you all.
v
TABLE OF CONTENTS
CONTENTS PAGES
TITLE PAGE i
CERTIFICATION ii
DEDICATION iii
ACKNOWLEDGEMENTS iv
TABLE OF CONTENTS vi
ABSTRACT xi
CHAPTER ONE: INTRODUCTION
1.1 Introduction 1
1.2 Aims and Objectives of The Project 5
1.3 Methodology 6
1.4 Scope and Limitations of the Project 6
1.5 Structure of The Project 7
vi
CHAPTER TWO: LITERATURE REVIEW
2.1 History of Sorting 9
2.2 Terminologies 10
2.3 Concepts 13
2.4 Classification 18
2.5 Stability 19
2.6 Popular Sorting Algorithms 20
2.6.1 Bubble Sort 20
2.6.2 Cocktail Sort 22
2.6.3 Comb Sort 22
2.6.4 Gnome Sort 23
2.6.5 Heap Sort 24
2.6.6 Insertion Sort 25
2.6.7 JSort 27
2.6.8 Jump Sort 27
2.6.9 Merge Sort 28
2.6.10 Quick Sort 29
2.6.11 Quick Sort 3 33
vii
2.6.12 Selection Sort 34
2.6.13 Shaker Sort 35
2.6.14 Shell Sort 35
2.6.15 Smooth Sort 38
2.6.16 Snake Sort 38
2.7 Comparison 39
2.7.1 Comparison Sorts 40
2.7.2 Non-Comparison Sorts 43
2.7.3 Impractical Sorts 45
CHAPTER THREE: SYSTEM ANALYSIS AND DESIGN
3.1 Introduction 47
3.2 The Ideal Sorting Algorithm 47
3.3 Characteristics of a Good System Design 48
3.4 Development Tools 49
3.5 Overview of the Proposed Program 49
3.6 Program Architecture 50
3.7 The User Interface 50
viii
CHAPTER FOUR: SYSTEM IMPLEMENTATION
4.1 System Requirement 52
4.2 Installation Description 53
4.3 Program Implementation 54
4.3.1 Tab 1: Basic Sorting 54
4.3.2 Tab 2: Sort File 56
4.3.3 Tab 3: Sort by Key Field 58
CHAPTER FIVE: CONCLUSION AND RECOMMENDATION
5.1 Conclusion 60
5.2 Recommendation 60
REFERENCES 61
APPENDICES 63
Appendix A: List of Sorting Algorithms 63
Appendix B: Pseudocode Implementation of some Sorting Algorithms 66
ix
Appendix C: Sorting Algorithms in Visual Basic 69
Appendix D: Program Source Code 77
x
ABSTRACT
This project presents and implements a collection of algorithms for sorting.
Descriptions are brief with just enough theory. The first section introduces
basic sorting concepts and classifications of sorting algorithms. The next
section presents several sorting algorithms. Source code for the most
popular algorithms, in Visual Basic, is included.
The implementation, in Visual Basic, is a Windows Forms application.
At first the sorting algorithms will be applied to an array of integers.
Although this is a good way to introduce the algorithms, it is not very
common in practical situations to sort an array of simple numbers. Usually,
the items being sorted are more complex, typically objects or structs, so that
each item is itself made up of a number of elements.
Hence, the application will be able (include procedures) implement a
predefined set of more complex tasks.
The bibliography appearing at the end of this project lists 38 sorting
algorithms and various books and articles published in recent years.
xi
xii
CHAPTER ONE
1.1 INTRODUCTION
According to Wikipedia,
Sorting is any process of arranging items in some sequence and/or in
different sets, and accordingly, it has two common, yet distinct meanings:
1. Ordering: arranging items of the same kind, class, nature, etc. in some
ordered sequence,
2. Categorizing: grouping and labeling items with similar properties
together (by sorts).
Although applying sorting algorithms to arrays of integers this is a good
way to introduce the algorithms, it is not very common in practical
situations to sort an array of simple numbers. Usually, the items being
sorted are more complex, typically objects or structs, so that each item is
itself made up of a number of elements. We will refer to the items that are
being sorted as records and to the elements that make up the items as fields
of the records. Sorting n-tuples (depending on context also called e.g.
records consisting of fields) can be done based on one or more of its
1
components. For example, if each record is a name and address, then the
fields might be first name, last name, street address, city, state, and zip code.
When an array of records is to be sorted, the sorted order typically depends
on only one of the fields of the record. This field is said to be the key field for
the sort (or sort key), and the values of this field are called keys. In other
words, when we sort an array of records, we compare keys, and we arrange
the records so that their keys are in ascending (or descending) order.
A new sort key can be created from two or more sort keys by
lexicographical order. The first is then called the primary sort key, the
second the secondary sort key, etc.
For example, addresses could be sorted using the city as primary sort key,
and the street as secondary sort key.
If the sort key values are totally ordered, the sort key defines a weak order of
the items: items with the same sort key are equivalent with respect to
sorting. See also stable sorting. If different items have different sort key
values then this defines a unique order of the items.
2
A standard order is often called ascending (corresponding to the fact that
the standard order of numbers is ascending, i.e. A to Z, 0 to 9), the reverse
order descending (Z to A, 9 to 0). For dates/times ascending means that
earlier values precede later ones e.g. 1/1/2000 will sort ahead of 1/1/2001.
A SORTING ALGORITHM is an algorithm that puts elements of a list in a
certain order. The most used orders are numerical and lexicographical order.
Sorting is a commonly encountered programming task in computing, hence,
efficient sorting is important for optimizing the use of other algorithms
(such as search and merge algorithms) that require sorted lists to work
correctly; it is often useful for canonicalizing data and for producing human-
readable output.
More formally, the output must satisfy two conditions:
1. The output is in non-decreasing order (each element is no smaller than
the previous element according to the desired total order)
2. The output is a permutation, or reordering, of the output.
3
SORTING EXAMPLES:
 List containing exam scores sorted from Lowest to Highest or from
Highest to Lowest.
 List containing words that were misspelled and be listed in
alphabetical order.
 List of student records and sorted by student number or
alphabetically by first or last name.
WHY DO WE SORT?
 Searching for an element in an array will be more efficient. (Example:
looking up for information like phone number).
 It’s always nice to see data in a sorted display. (Example: spreadsheet
or database application).
4
1.2 AIMS AND OBJECTIVES OF THE PROJECT
This project attempts to provide an overview of Lists, Arrays and searching
of such data structures.
It also provides an understanding of the basics and application of Sorting.
This project also aims to;
 Show how each algorithm operates
 Show that there is no best sorting algorithm
 Show the advantages and disadvantages of each algorithm
 Show that the worse-case asymptotic behavior is not always the best
deciding factor in choosing an algorithm
 Show that the initial condition (input order and key distribution)
affects performance as much as the algorithm choice.
This project also aimed at developing a simple application that
demonstrates simple and practical applications of sorting. The application
will be able to:
 Sort the contents of a text file,
 Sort a list and remove duplicates,
5
 Randomize an array, and
 Sort objects by different key fields by building a comparer class
(in Visual Basic .NET)
1.3 METHODOLOGY
This project work has benefitted majorly from information gathered from
various articles, publications and presentations. These materials are results
of high-quality research from both the Academics background with lasting
significance in the theory, design, implementation, analysis and applications
of sorting.
To develop the application the following will be used
Programming Language: Microsoft Visual Basic .NET
.NET Framework: Version 4
1.4 SCOPE AND LIMITATION OF THE STUDY
Sorting is a vast topic; the application will demonstrate basic
implementation of sorting algorithms.
6
External sorting, radix sorting, string sorting, and linked list sorting – all
wonderful and interesting topics – are deliberately omitted to limit the
scope of the project.
1.5 STRUCTURE OF THE PROJECT
i. Chapter One (Introduction): the background of the study, aims of the
study, scopes of the study and the methodology used.
ii. Chapter Two (Literature Review): analyses core algorithm concepts,
popular sorting algorithms and compares them. The chapter also
explores available written literatures.
iii. Chapter Three (System Design and Analysis): discusses how the
proposed application will be implemented by giving an overview of the
proposed application, describing the working operations, analysis of it
and the design.
iv. Chapter Four (System Implementation): takes a broad look at the
working operations of the application, the chapter exposes it to real life
scenarios to confirm performance. It starts by taking a look at the
7
hardware and software requirement for installation and the process of
installation itself.
v. Chapter Five (Conclusion and Recommendation): discusses the future
of sorting and sorting algorithms and based on that, recommendations
will be suggested.
8
CHAPTER TWO
LITERATURE REVIEW
2.1 History of Sorting
Sorting is one of the most important operations performed by computers. In
the days of magnetic tape storage before modern databases, database
updating was done by sorting transactions and merging them with a master
file. It's still important for presentation of data extracted from databases:
most people prefer to get reports sorted into some relevant order before
flipping through pages of data.
In computer science, sorting is one of the most extensively researched
subjects because of the need to speed up the operation on thousands or
millions of records during a search operation.
The main purpose of sorting information is to optimize its usefulness for
specific tasks. In general, there are two ways of grouping information: by
category e.g. a shopping catalogue where items are compiled together under
headings such as 'home', 'sport & leisure', 'women's clothes' etc. (nominal
9
scale) and by the intensity of some property, such as price, e.g. from the
cheapest to most expensive (ordinal scale.)
2.2 Terminologies
a) Algorithm:
An algorithm is an effective method expressed as a finite list of well-defined
instructions for calculating a function. Starting from an initial state and
initial input (perhaps empty), the instructions describe a computation that,
when executed, will proceed through a finite number of well-defined
successive states, eventually producing "output" and terminating at a final
ending state.
b) Lists:
A list or sequence is an abstract data type that implements an ordered
collection of values, where the same value may occur more than once. An
instance of a list is a computer representation of the mathematical concept
of a finite sequence. Each instance of a value in the list is usually called an
10
item, entry, or element of the list; if the same value occurs multiple times,
each occurrence is considered a distinct item.
c) Array:
An array data structure or simply array is a data structure consisting of a
collection of elements (values or variables), each identified by at least one
array index or key. An array is stored so that the position of each element
can be computed from its index tuple by a mathematical formula.
Arrays are among the oldest and most important data structures, and are
used by almost every program. They are also used to implement many other
data structures, such as lists and strings.
When data objects are stored in an array, individual objects are selected by
an index that is usually a non-negative scalar integer. Indices are also called
subscripts. An index maps the array value to a stored object.
There are three ways in which the elements of an array can be indexed:
 0 (zero-based indexing): The first element of the array is indexed by
subscript of 0.
11
 1 (one-based indexing): The first element of the array is indexed by
subscript of 1.
 n (n-based indexing): The base index of an array can be freely chosen.
Arrays can have multiple dimensions, thus it is not uncommon to access an
array using multiple indices. The number of indices needed to specify an
element is called the dimension, dimensionality, or rank of the array.
d) Search Algorithm:
A search algorithm, broadly speaking, is an algorithm for finding an item
with specified properties among a collection of items. The items may be
stored individually as records in a database; or may be elements of a search
space defined by a mathematical formula or procedure, such as the roots of
an equation with integer variables.
e) N-Tuple:
An n-tuple, sometimes simply called a “tuple” when the number n is known
implicitly, is another word for a list, i.e., an ordered set of n elements. It can
be interpreted as a vector, or more specifically, an n-vector.
12
2.3 Concepts
a) Big O Notation:
In Mathematics, Big O Notation is used to describe the limiting behavior of
a function when the argument tends towards a particular value or infinity,
usually in terms of simpler functions. Big O notation characterizes functions
according to their growth rate.
b) Timing Estimates
We can place an upper-bound on the execution time of algorithms using O
(big-oh) notation. An algorithm that runs in O(n2) time indicates that
execution time increases with the square of the dataset size. For example, if
we increase dataset size by a factor of ten, execution time will increase by a
factor of 100. A more precise explanation of big-oh follows.
Assume that execution time is some function t(n), where n is the dataset
size. The statement
t(n) = O(g(n))
implies that there exists positive constants c and n0 such that
13
t(n) <= c·g(n)
for all n greater than or equal to n0. This is illustrated graphically in the
following figure.
Figure 2-1: Big-Oh
Big-oh notation does not describe the exact time that an algorithm takes,
but only indicates an asymptotic upper bound on execution time within a
constant factor. If an algorithm takes O(n2) time, then execution time grows
no worse than the square of the size of the list.
14
7/6 2
n lg n n n lg n n
1 0 1 0 1
16 4 25 64 256
256 8 645 2,048 65,536
4,096 12 16,384 49,152 16,777,216
65,536 16 416,128 1,048,565 4,294,967,296
1,048,576 20 10,568,983 20,971,520 1,099,511,627,776
16,777,216 24 268,435,456 402,653,183 281,474,976,710,656
Table 2-1: Growth Rates
Table 1-1 illustrates growth rates for various functions. A growth rate of
O(lg n) occurs for algorithms similar to the binary search. The lg (logarithm,
base 2) function increases by one when n is doubled. Recall that we can
search twice as many items with one more comparison in the binary search.
Thus the binary search is a O(lg n) algorithm.
If the values in Table 1-1 represented microseconds, then a O(n1.25)
algorithm may take 10 microseconds to process 1,048,476 items, a O(lg n)
algorithm 20 seconds, and a O(n2) algorithm up to 12 days! In the following
chapters a timing estimate for each algorithm, using big-O notation, will be
included. For a more formal derivation of these formulas you may wish to
consult the references.
15
c) Divide and Conquer Algorithms:
Divide and Conquer (D&C) is an important algorithm design paradigm
based on multi-branched recursion. A divide and conquer algorithm works
by recursively breaking down a problem into two or more sub-problems of
the same (or related) type, until these problems become simple enough to be
solved directly. The solutions to the sub-problems are then combined to
give a solution to the original problem.
This technique is the basis of efficient algorithms for all kinds of problems
such as sorting (e.g., Quicksort, merge sort.)
d) Data Structures:
A data structure is a particular way of storing and organizing data in a
computer so that it can be used efficiently. Different kinds of data structures
are suited to different kinds of applications. For example, compiler
implementations usually use hash tables to look up identifiers. Data
structures are used in almost every program or software system. Data
structures provide a means to manage huge amounts of data efficiently such
as large databases and internet indexing services.
16
e) Best, Worst and Average Case:
Best, Worst and Average Cases of a given algorithm express what the
resource usage is at least, at most and on average, respectively. Usually the
resource being considered is running time, but it could also be memory or
other resources. In real-time computing, the worst-case execution time is
often of particular concern since it is important to know how much time
might be needed in the worst case to guarantee that the algorithm will
always finish on time.
f) Space-Time Tradeoffs (or time-memory tradeoff):
This is a situation where the memory use can be reduced at the cost of
slower program execution (and, conversely, the computation time can be
reduced at the cost of increased memory use). As the relative costs of CPU
cycles, RAM space, and hard drive space change – hard drive space has for
some time been getting cheaper at a much faster rate than other components
of computers – the appropriate choices of space-time tradeoffs have changed
radically. Often, by exploiting a space-time tradeoff, a program or algorithm
can be made to run much faster.
17
2.4 CLASSIFICATION
Sorting algorithms used in computer science are often classified by:
a) Recursion. Some algorithms are either recursive or non-recursive, while
others may be both (e.g., merge sort).
b) Whether or not they are a comparison sort. A comparison sort examines
the data only by comparing two elements with a comparison operator.
c) Memory usage. Strictly, a sorting algorithm is said to be an in-place
sorting algorithm if the amount of extra space required by the algorithm
is O(1); sometimes O(log(n)) additional memory is considered "in place".
That is, the amount of extra space is bounded by a constant, independent
of the size of the array.
d) Computational complexity of swaps (for "in place" algorithms)
e) Computational complexity (worst, average and best behavior) of
element comparisons in terms of the size of the list (n). For typical
sorting algorithms good behavior is O(n log n) and bad behavior is
O(n2). Ideal behavior for a sort is O(n), but this is not possible in the
average case. Comparison-based sorting algorithms, which evaluate the
18
elements of the list via an abstract key comparison operation, need at
least O(n log n) comparisons for most inputs.
f) Stability: stable sorting algorithms maintain the relative order of records
with equal keys
g) General method: insertion, exchange, selection, merging, etc.. Exchange
sorts include bubble sort and quicksort. Selection sorts include shaker
sort and heapsort.
h) Adaptability: Whether or not the presortedness of the input affects the
running time. Algorithms that take this into account are known to be
adaptive.
2.5 STABILITY
A sorting algorithm is said to be stable if items that have the same key
remain in the same relative order after the sorting algorithm is applied as
they were in before the sort.
Insertion Sort is a stable sorting algorithm because it never changes the
ordering of items with the same key. In Selection Sort, on the other hand, an
19
item can jump from one part of the array to another, moving past other
items with the same key. Selection Sort is not stable. Similarly, Heap Sort is
not a stable sorting algorithm.
Stability is not always needed in a sorting algorithm, but it is desirable in
some cases. Suppose, for example, we are sorting records that contain first
name and last name fields. Suppose that we sort the array using the first
name as the key, and then we sort using the last name as key. If we use a
stable sorting algorithm, then records with the same last name will still be
in alphabetical order by first name. If the algorithm is not stable, then
people with the same last name can end up in an arbitrary order. [3]
2.6 POPULAR SORTING ALGORITHMS
2.6.1. Bubble Sort
Bubble sort is a simple sorting algorithm. It works by repeatedly stepping
through the list to be sorted, comparing two items at a time and swapping
them if they are in the wrong order. The pass through the list is repeated
until no swaps are needed, which means the list is sorted.
20
Figure 2-2: A Visual Representation of How Bubble Sort Works
The algorithm gets its name from the way smaller elements "bubble" to the
top (i.e. the beginning) of the list via the swaps. One way to optimize
bubble sort (implemented here) is to note that, after each pass, the largest
element will always move down to the bottom. Thus it suffices to sort the
remaining n - 1 elements each subsequent pass.
Although simple, this algorithm is highly inefficient and is rarely used
except in education.
21
2.6.2. Cocktail Sort
Cocktail sort is a variation of bubble sort that sorts in both directions each
pass through the list.
One optimization (implemented here) is to add an if-statement that checks
whether there has been a swap after the first pass each iteration. If there
hasn't been a swap the list is sorted and the algorithm can stop.
2.6.3. Comb Sort
Comb sort was invented by Stephen Lacey and Richard Box, who first
described it to Byte Magazine in 1991. It improves on bubble sort and rivals
in speed more complex algorithms like quick sort. The idea is to eliminate
turtles, or small values near the end of the list, since in a bubble sort these
slow the sorting down tremendously. (Rabbits, large values around the
beginning of the list, do not pose a problem in bubble sort.)
In bubble sort, when any two elements are compared, they always have a
gap (distance from each other) of one. The basic idea of comb sort is that the
gap can be much more than one.
22
The gap starts out as the length of the list being sorted divided by the shrink
factor, and the list is sorted with that value (rounded down to an integer if
needed) for the gap. Then the gap is divided by the shrink factor again, the
list is sorted with this new gap, and the process repeats until the gap is one.
At this point, comb sort reverts to a true bubble sort, using a gap of one
until the list is fully sorted. In this final stage of the sort most turtles have
already been dealt with, so a bubble sort will be efficient.
The shrink factor has a great effect on the efficiency of comb sort. In the
original article, the authors suggested 1.3 after trying some random lists and
finding it to be generally the most effective. A value too small slows the
algorithm down because more comparisons must be made, whereas a value
too large may not kill enough turtles to be practical.
2.6.4. Gnome Sort
Gnome sort is a sorting algorithm which is similar to insertion sort except
that moving an element to its proper place is accomplished by a series of
swaps, as in bubble sort. It is conceptually simple, requiring no nested
loops. In practice the algorithm has been reported to generally run as fast as
23
Insertion sort, although this depends on the details of the architecture and
the implementation
2.6.5. Heap Sort
Heap sort is a much more efficient version of selection sort. Invented by
John William Joseph Williams in 1964, it works efficiently by using a data
structure called a heap.
A heap is a specialized tree-based data structure that satisfies the heap
property: if B is a child node of A, then key(A) >= key(B). This implies that
the element with the greatest key is always in the root node. All elements to
be sorted are inserted into a heap, and the heap organizes the elements
added to it in such a way that the largest value can be quickly extracted.
Once the data list has been made into a heap, the root node is guaranteed to
be the largest element. It is removed and placed at the end of the list, then
the heap is rearranged so the largest element remaining moves to the root.
Using the heap, finding the next largest element takes much less time than
scanning every remaining element, which gives heap sort much better
24
performance than selection sort. Similar to selection sort, the initial
conditions have little or no effect on the amount of time required.
Heap sort is one of the best general-purpose sorting algorithms. Although
somewhat slower in practice on most machines than a good implementation
of quick sort, it has a better worst-case runtime.
2.6.6. Insertion Sort
One of the simplest methods to sort an array is an insertion sort. Insertion
sort is a simple comparison sort in which the sorted array (or list) is built
one entry at a time. It is much less efficient on large lists than more
advanced algorithms such as quicksort, heap sort, or merge sort, but it's very
efficient on small (5-50 key) lists, as well as lists that are mostly sorted to
begin with.
25
Figure 2-3: Insertion Sort
In abstract terms, every iteration of an insertion sort removes an element
from the input data, inserting it at the correct position in the already sorted
list, until no elements are left in the input. The choice of which element to
remove from the input is arbitrary and can be made using almost any choice
algorithm.
26
2.6.7. JSort
JSort is a hybrid of heap sort and insertion sort developed by Jason
Morrison. It works by running two heap passes to roughly order the array,
and then finishes with an insertion sort.
The first heap pass converts the array to a heap, moving the smallest item to
the top. The second heap pass works in reverse, moving the largest element
to the bottom. These two passes combine to roughly order the array, though
much work is still left to the final insertion sort.
For small lists, JSort is extremely efficient, but due to its design it does not
scale well.
2.6.8. Jump Sort
Similar to shell sort and comb sort, jump sort employs a gap value that
decreases in each successive pass that allows out-of-place elements to be
moved very far initially. The underlying framework is based on bubble sort
instead of the more efficient insertion sort, but due to the initial ordering in
the early passes it ends up being very efficient in practice, though still
somewhat slower than either shell sort or comb sort.

27
2.6.9. Merge Sort
Similar to quick sort, merge sort is a recursive algorithm based on a divide-
and-conquer strategy. First, the sequence to be sorted is split into two
halves. Each half is then sorted independently, and the two sorted halves are
merged to a sorted sequence. Merge sort takes advantage of the ease of
merging together already sorted lists.
In many implementations, merge sort calls out to an external algorithm --
usually insertion sort -- when it reaches a level of around 10-20 elements.
This is not necessary in Visual Basic; in fact such an implementation appears
to slow merge sort's overall performance in practice.
Instead, the implementation here uses the purest form of merge sort, where
the list is recursively divided into halves until it reaches a list size of two, at
which point those two elements are sorted.
Unfortunately, extra memory is required to combine two sorted lists
together. The extra memory in this implementation is in the form of a full
copy of the initial array. There are various optimizations that can be made to
improve on this, but that is left as an exercise for the reader.
28
Invented in 1945 by John von Neumann, merge sort is far and away the
fastest stable algorithm.
2.6.10. Quick Sort
Quick sort was originally invented in 1960 by Charles Antony Richard
Hoare. It is a divide and conquer algorithm which relies on a partition
operation.
To partition an array, a pivot element is first randomly selected, and then
compared against every other element. All smaller elements are moved
before the pivot, and all larger elements are moved after. The lesser and
greater sublists are then recursively processed until the entire list is sorted.
This can be done efficiently in linear time and in-place.
Quick sort turns out to be the fastest sorting algorithm in practice.
However, in the (very rare) worst case quick sort is as slow as bubble sort.
There are good sorting algorithms with a better worst case, e.g. heap sort
and merge sort, but on the average they are slower than quick sort by a
consistent margin.
29
int function Partition (Array A, int Lb, int Ub);
begin
select a pivot from A[Lb]...A[Ub];
reorder A[Lb]...A[Ub] such that:
all values to the left of the pivot are <= pivot
all values to the right of the pivot are >= pivot
return pivot position;
end;
procedure QuickSort (Array A, int Lb, int Ub);

begin
if Lb < Ub then
M = Partition (A, Lb, Ub);
QuickSort (A, Lb, M - 1);
QuickSort (A, M, Ub);
end;
Figure 2-4: Quicksort Algorithm
The implementation here uses Niklaus Wirth's variant for selecting the
pivot value, which is simply using the middle value. This works particularly
well for already sorted lists.
In Figure 2-5(a), the pivot selected is 3. Indices are run starting at both ends
of the array. One index starts on the left and selects an element that is larger
than the pivot, while another index starts on the right and selects an
element that is smaller than the pivot. In this case, numbers 4 and 1 are
selected. These elements are then exchanged, as is shown in Figure 2-5(b).
This process repeats until all elements to the left of the pivot <= the pivot,
30
and all elements to the right of the pivot are >= the pivot. Quick Sort
recursively sorts the two sub arrays, resulting in the array shown in Figure
2-5(c).
Figure 2-5: Quicksort Example 1
Its speed and modest space usage makes quick sort one of the most popular
sorting algorithms, available in many standard libraries.
Full example of quicksort on a random set of numbers (Figure 2-6)
The shaded element is the pivot. It is always chosen as the last element of
the partition. However, always choosing the last element in the partition as
the pivot in this way results in poor performance ( ) on already sorted
31
Figure 2-5: Quicksort Example 2
lists, or lists of identical elements. Since sub-lists of sorted / identical
elements crop up a lot towards the end of a sorting procedure on a large set,
versions of the quicksort algorithm which choose the pivot as the middle
element run much more quickly than the algorithm described in this
diagram on large sets of numbers.
32
2.6.11. Quick Sort 3
The critical operation in the standard quick sort is choosing a pivot: the
element around which the list is partitioned. The simplest pivot selection
algorithm is to take the first or the last element of the list as the pivot,
causing poor behavior for the case of sorted or nearly-sorted input. Niklaus
Wirth's variant uses the middle element to prevent these occurrences,
degenerating to O(n²) for contrived sequences.
The median-of-3 pivot selection algorithm takes the median of the first,
middle, and last elements of the list; however, even though this performs
well on many real-world inputs, it is still possible to contrive a median-of-3
killer list that will cause dramatic slowdown of a quicksort based on this
pivot selection technique. Such inputs could potentially be exploited by an
aggressor, for example by sending such a list to an Internet server for sorting
as a denial of service attack.
The quicksort3 implementation here uses a median-of-3 technique, but
instead of using the first, last and middle elements, three elements are
chosen at random. This has the advantage of being immune to intentional
33
attacks, though there is still a possibility (however remote) of realizing the
worst case scenario.
2.6.12. Selection Sort
Selection sort is a simple sorting algorithm that mimics the way humans
instinctively sort. It works by first scanning the entire list to find the
smallest element, swapping it into the first position. It then finds the next
smallest element, swapping that into the second position, and so on until
the list is sorted.
Selection sort is unique compared to almost any other algorithm in that its
running time is not affected by the prior ordering of the list; it always
performs the same number of operations because of its simple structure.
Selection sort also requires only n swaps, which can be very attractive if
writes are the most expensive operation.
Unless writes are very expensive, selection sort will usually be
outperformed by the more complicated algorithms, though it will always
outperform a basic bubble sort. Heap sort is an efficient variation of
selection sort that is both very fast and also scales well.
34
2.6.13. Shaker Sort
Shaker sort is a gap-based bubble sort with a twist. Most gap sorts -- shell
sort, comb sort, et al. -- begin with a large gap and gradually shrink it down
to one. By the time the gap reaches one, the list should be mostly ordered so
the final pass should be efficient.
Like other gap sorts, shaker sort begins with a large gap which gradually
shrinks. However, once the gap reaches one, the gap gets expanded again
before shrinking back toward one. The expanding and contracting gap sizes
constitute the "shaking" part of shaker sort. Each additional expansion is
smaller and smaller until it eventually resolves to one, when no further
expansion is done. At this point the list is almost certain to be nearly sorted,
so the final bubble sort pass is very efficient.
2.6.14. Shell Sort
Shell sort is a variation of insertion sort that was invented by (and takes its
name from) Donald Shell, who published the algorithm in 1959.
Shell sort improves on insertion sort by comparing elements separated by a
gap of several positions. This lets an element take "bigger steps" toward its
35
expected position. Multiple passes over the data are taken using insertion
sort with smaller and smaller gap sizes. The last step of Shell sort is with a
gap size of one -- meaning it is a standard insertion sort -- guaranteeing that
the final list is sorted. By then, the list will be almost sorted already, so the
final pass is efficient.
The gap sequence is an integral part of the shellsort algorithm. Any
increment sequence will work, so long as the last element is 1. Donald Shell
originally suggested a gap sequence starting at half the size of the list,
dividing by half, every iteration until it reached one. While offering
significant improvement over a standard insertion sort, it was later found
that steps of three improve performance even further.
In the implementation here, the initial gap size is calculated by an iterative
formula x=3x+1, where x starts at 0 and grows until the gap is larger than the
list size. Each insertion sort loop begins by dividing the gap by three, thus
ensuring a very large starting gap.
36
Figure 2-7: Shell Sort
A key feature of shell sort is that the elements remain k-sorted even as the
gap diminishes. For instance, if a list was 5-sorted and then 3-sorted, the list
is now not only 3-sorted, but both 5- and 3-sorted. If this were not true, the
algorithm would undo work that it had done in previous iterations, and
would not achieve such a low running time.
Although shell sort is inefficient for large data sets, it is one of the fastest
algorithms for sorting small numbers of elements (sets with less than 1000
or so elements). Another advantage of this algorithm is that it requires
relatively small amounts of memory. It is worth noting that shell sort
37
enjoyed a brief period when it was the fastest sorting algorithm known, only
to be eclipsed by quick sort one year later.
2.6.15. Smooth Sort
Smooth sort is a variation of heap sort developed by Edsger Dijkstra in 1981.
Smooth sort has a similar average case to heap sort but a much better best
case, with a smooth transition between the two. This is where the name
comes from. The advantage of smooth sort is that it's faster if the input is
already sorted to some degree.
Due to its complexity, smooth sort is rarely used.
2.6.16. Snake Sort
An original algorithm, snake sort was invented in 2007. It is as fast as or
faster than any other algorithm, including quick sort, and it scales up as well
as quick sort excluding memory constraints. It is in the merge sort family,
and is unstable, out-of-place, offline, and non-recursive. I call it snake sort
due to its similarities to fantasy football snake drafts.
38
The idea is simple: A random ordering will result in very small contiguous
ordered blocks in either direction. Snake sort begins by identifying all those
blocks, and then merges them together. Each merge pass will halve the
remaining number of blocks, so it very quickly resolves to a sorted state.
It uses quite a bit of memory: a full copy of the original array, plus an index
array (to remember the block cutoffs) half the size of the original array.
The most interesting feature of snake sort is that the more ordered the array
is initially, the faster it runs. Because it is in the unique position of knowing
when the initial order is descending, it is optimally efficient at transposing
such a list to an ordered state.
2.7 COMPARISON
In this table, n is the number of records to be sorted. The columns "Average"
and "Worst" give the time complexity in each case, under the assumption
that the length of each key is constant, and that therefore all comparisons,
swaps, and other needed operations can proceed in constant time. "Memory"
denotes the amount of auxiliary storage needed beyond that used by the list
39
itself, under the same assumption. These are all comparison sorts. The run
time and the memory of algorithms could be measured using various
notations like theta, omega, Big-O, small-o, etc. The memory and the run
times below are applicable for all the 5 notations.
2.7.1 COMPARISON SORTS
Name Best Average Worst Memory Stable Method Other notes
When using a self-

Binary tree
Yes Insertion balancing binary
sort
search tree
Randomly permute
Bogosort No Luck the array and check
if sorted.
Bubble sort Yes Exchanging Tiny code size
Cocktail sort Yes Exchanging
Comb sort — — No Exchanging Small code size

In-place with
theoretically
Cycle sort — No Insertion
optimal number of
writes
Gnome sort Yes Exchanging Tiny code size
Heapsort No Selection
Implemented in
Standard Template
Library (STL); can
In-place
Yes Merging be implemented as
Merge sort
a stable sort based
on stable in-place
merging
40
O(n + d), where d is

Insertion sort Yes Insertion the number
of inversions
Partition-
Used
ing &
Introsort No in SGI STL implem
Select-
entations
ion
Library sort — Yes Insertion
Depends;
worst
Merge sort Yes Merging
case
is
Finds all
Insertion the longest
Patience
— — No & increasing
sorting
Selection subsequences withi
n O(n log n)
Quicksort is usually
done in place with
O(log(n)) stack
space. Most
implementations
Partition- are unstable, as
Quicksort Depends
ing stable in-place
partitioning is more
complex. Naïve vari
ants use an O(n)
space array to
store the partition.
Stable with O(n) extra

Selection sort No Selection space, for example
using lists.
41
Depends
on gap
sequence;
Shell sort or No Insertion
best known
is
Remarkably
Slow sort — — — No Selection inefficient sorting
algorithm
An adaptive sort -
comparisons
Smoothsort No Selection when the data is
already sorted, and
0 swaps.
Strand sort Yes Selection
comparisons
Insertion
when the data is
Timsort Yes &
already sorted or
Merging
reverse sorted.
Tournament
— Selection
sort
Table 2-2: Comparison Sorts
The following table describes integer sorting algorithms and other sorting
algorithms that are not comparison sorts. As such, they are not limited by a
lower bound. Complexities below are in terms of n, the number of items to

42
be sorted, k, the size of each key, and d, the digit size used by the
implementation. Many of them are based on the assumption that the key
size is large enough that all entries have unique key values, and hence that n
<< 2k, where << means "much less than."
2.7.2 Non-Comparison Sorts
k
Name Best Average Worst Memory Stable n << 2 Notes
Pigeonhole sort — Yes Yes
Bucket Assumes uniform

sort (uniform — Yes No distribution of elements from
keys) the domain in the array.
Bucket r is the range of numbers to

sort (integer — Yes Yes be sorted. If r = then
keys) Avg RT =
r is the range of numbers to

Counting sort — Yes Yes be sorted. If r = then
Avg RT =
43
2.7.2 Non-Comparison Sorts
k
Name Best Average Worst Memory Stable n << 2 Notes
LSD Radix Sort — Yes No
Stable version uses an

MSD Radix
— Yes No external array of size n to
Sort
hold all of the bins
MSD Radix In-Place. k / d recursion

— No No d
Sort levels, 2 for count array
Asymptotics are based on

k
the assumption that n << 2 ,
Spreadsort — No No
but the algorithm does not
require this.
Table 2-3: Non-Comparison Sorts
The following table describes some sorting algorithms that are impractical
for real-life use due to extremely poor performance or a requirement for
specialized hardware.
44
2.7.3 Impractical Sorts
Name Best Average Worst Memory Stable Comparison Other notes
Bead
— N/A N/A — N/A No Requires specialized hardware
sort
Simple
pancake — No Yes Count is number of flips.
sort
This A linear-time, analog

algorithm for sorting a sequence
Spaghetti
of items, requiring O(n) stack
(Poll) Yes Polling
space, and the sort is stable. This
sort
requires parallel
processors. Spaghetti sort
Sorting Requires a custom circuit of

— Yes No
networks size
Table 2-3: Impractical Sorts
45
Algorithms not yet compared above include:
 Odd-even sort
 Flash sort
 Burst sort
 Postman sort
 Stooge sort
 Sample sort
 Bitonic sorter
46
CHAPTER THREE
SYSTEM ANALYSIS AND DESIGN
This chapter discusses how the proposed application will be implemented by giving an
overview of the proposed application, describing the working operations, analysis of it and
the design.
3.0. INTRODUCTION
A complete functional understanding of the sorting program is presented. In
addition, the various algorithms are graphically visualized the running
process of each algorithm.
3.1. THE IDEAL SORTING ALGORITHM
Would have the following properties
 Stable: Equal keys aren’t reordered.
 Operates in-place, requiring O(1) extra space.
47
 Worst-case O(n∙lg(n)) key comparisons.
 Worst-case O(n) swaps.
 Adaptive: Speeds up to O(n) when data is nearly sorted or when there
are few unique keys.
There is no algorithm that has all these properties, and so the choice of
sorting algorithm depends on the application.
3.2. CHARACTERISTICS OF A GOOD SYSTEM DESIGN
I. A good system must meet user’s requirements and also be cost
effective.
II. A good system must be user friendly and the user interface must be
easy to understand and very simple.
III. A good system design must be conscious of the growth in the
institution for which such system is designed and must be flexible
enough to accommodate such growth.
IV. A good system design must be maintainable with a very simple and
self-explanatory documentation.
48
V. A good system design must be reliable, secured and put into
consideration features that would protect it from fraudulent attempt
or abuse.
3.3. DEVELOPMENT TOOLS
The Program is written in Visual Basic .NET (Visual Studio 2010, .NET
Framework 4.)
Visual Basic .NET (VB.NET) is an object-oriented computer programming
language that can be viewed as an evolution of the classic Visual Basic (VB),
which is implemented on the .NET Framework.
3.4. OVERVIEW OF THE PROPOSED PROGRAM
The proposed program is meant to show practical situations and
implementation of sorting algorithms. Each unit of the program will be able
to solve one task or the other a using sorting algorithm.
The program does not need installation or any security features.
49
3.5. PROGRAM ARCHITECTURE
The program has one module and two classes.
The first class contains the various subroutines used in the program while
the module contains the actual sorting algorithm (insert sort) used in the
program.
The second class is nested in the first class because it’s a Private class.
Input for each text box is verified to be of integer type.
3.6. THE USER INTERFACE
There are 3 tabs in the application. Each tab solves a particular problem.
Tab 1: Basic Sorting: It allows the user to input a set of numbers and sorts it
and also removes duplicates. This process can also be automated in which
the computer randomly generates the set of numbers.
Tab 2: Sort Text File: This example, takes a text file, splits the contents,
sorts the resulting string alphabetically and rewrites it back to the text file.
50
Tab 3: Sort by Key Fields: This example sorts different objects by different
key fields.
51
CHAPTER FOUR
SYSTEM IMPLEMENTATION
This chapter takes a broad look at the working operations of the application, the chapter
exposes it to real life scenarios to confirm the performance, It starts by taking a look at the
hardware and software requirement for running.
4.1 SYSTEM REQUIREMENT
The program is a very light program that requires very little processing
power and memory space to run. For a successful running of the program,
the following minimum system specification is required among other things.
(a) Minimum Hardware requirements
Pentium III or higher with memory not lesser than 128MB
(b) Software requirements
I. Operating System:
o Microsoft Windows 2000, Xp, Vista, 7 or any other higher
version.
52
o Linux or other variants like Ubuntu
II. .NET Framework 4.0
4.2 INSTALLATION DESCRIPTION
The program is very light (about 100kb) and requires no installation. The
user simply clicks and runs it from the storage media on which it is saved.
4.3 PROGRAM IMPLEMENTATION
4.3.1 Tab 1 – Basic Sorting
It allows the user to input a set of numbers and sorts it and also removes
duplicates. This process can also be automated in which the computer
randomly generates the set of numbers.
1. To input the set of numbers manually,
a) The user clicks on the text box and inputs a number
b) The user clicks the add button
53
Figure 4-1: Tab 1: Basic Sorting
c) The program checks if the input is valid. A valid entry is a non-empty
text box containing numeric characters. If the entry contains
alphanumeric characters, then we use the val() function to
convert it to the appropriate type. The val() function Returns the
numbers contained in a string as a numeric value of appropriate type.
54
d) The number is then added to Listbox1
e) The text button is automatically focused again
The user will go over these steps till he has input all the numbers he
wishes to sort.
2. After the inputs, the user clicks the Sort button and the add button is
disabled to prevent input during the running of the program. The
program then copies all the items in listbox1 to an array and sorts the
array using the Insert sort algorithm. The resulting sorted array is then
copies to ListBox2
3. The user can remove duplicate items from the new sorted array by
clicking the “Remove Duplicates” button. A simple algorithm scans
ListBox2 and removes every duplicate therein.
4. Alternatively, the program can automatically generate a set of numbers
into ListBox1 and sort them into ListBox2. To do this, the user
clicks the Button labeled “Automate Add And Sort”
5. An input box is displayed prompting the user to input the range of the
dataset. If the range is higher than 6,400, the user is alerted that datasets
55
higher than that figure results in exponentially longer run time and
hence more computer memory and CPU resources.
4.3.2 Tab 2: Sort File
This example, takes a text file, splits the contents, sorts the resulting string
alphabetically and rewrites it back to the text file.
Figure 4-2: Tab 2: Sort Text File
56
1. The file to be analyzed must be a text (.txt) file. Hence if the user
wants to sort the contents of a Word (.docx) or a Pfd (.pdf) file, the
user must copy the contents of the file onto a blank Notepad.
2. The user copies the file address into the text box supplied.
3. The student then selects the delimiter that separates each item in the
text file.
4. Once the “Open + Split” Button is clicked, the program first checks if
the file address specified in the text box exists. If it does, it then
checks if the text file is notEmpty.
5. If the text file path does not exist or is empty, the user is notified and
the control is transferred back to the text box.
6. Once the input is valid (i.e. the file exists and is not empty), the
program copies the content of the text file and splits it using the
delimiter specified earlier by the user.
7. The output of the split, a string array, is then copied into the
ListBox.
8. The user clicks the “Sort” button to sort the contents of the
ListBox.
57
9. Once the ListBox is sorted, the user clicks on “Rewrite” button to
copy the contents back into the text file.
10. The Len box displays the number of characters in the text file while
the Num box displays the number of words in the text file.
4.3.3 Tab 3: Sort by Key Field
This simple example sorts a list of 10 names either by Name or by ID.
The example simply shows that objects with several fields can be sorted by
different sort keys.
The user selects the radio button by Name or by ID and the program sorts
the list box according to the selected choice.
58
Figure 4-3: Tab 3: Sort by Key Field
59
CHAPTER FIVE
CONCLUSION AND RECOMMENDATION
5.1 CONCLUSION
This project has been successfully implemented and concluded.
The choice of sorting algorithm ultimately depends on the program.
Suggestions and constructive criticism is highly welcomed for the
improvement of the program.
5.2 RECOMMENDATION
This application and procedures therein can be further expanded and used
for developing a password storage and encryption system.
60
REFERENCES
Black, Paul E. (13 November 2008). "array". Dictionary of Algorithms and

Data Structures. National Institute of Standards and Technology. Retrieved
2010-08-22
Bjoern Andres; Ullrich Koethe; Thorben Kroeger; Hamprecht (2010).

"Runtime-Flexible Multi-dimensional Arrays and Views for C++98 and
C++0x". arXiv:1008.2909 [cs.DS].
Charles E., Rivest, Ronald L., Stein, Clifford (2009) [1990]. Introduction to
Algorithms (3rd ed.). MIT Press and McGraw-Hill. ISBN 0-262-03384-4
David R. Richardson (2002), The Book on Data Structures. iUniverse, 112

pages. ISBN 0-595-24039-9, 9780595240395
Donald Knuth. The Art of Computer Programming, Volume 3: Sorting and

Searching, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89685-0
Garcia, Ronald; Lumsdaine, Andrew (2005). "MultiArray: a C++ library for

generic programming with arrays". Software: Practice and Experience 35
(2): 159–188. doi:10.1002/spe.630. ISSN 0038-0644.Cormen, Thomas H.;
Leiserson
Insertion Sort & Shellsort, Andy Le, CS146 – Dr. Sin Min Lee, Spring 2004,
ISSort-AndyLe.ppt, Ret. March 01, 2012
61
LaMarca, A.; Ladner, R. E. (1997). "The influence of caches on the
performance of sorting". Proc. 8th Ann. ACM-SIAM Symp. on Discrete
Algorithms (SODA97): 370–379
Sedgewick, Robert (2003), "6.10 Key-Indexed Counting", Algorithms in Java,

Parts 1-4: Fundamentals, Data Structures, Sorting, and Searching (3rd ed.),
Addison-Wesley, pp. 312–314
Sorting and Searching Algorithms, Thomas Niemann, epaperpress.com
T. Veldhuizen. Arrays in Blitz++. In Proc. of the 2nd Int. Conf. on Scientific

Computing in Object-Oriented Parallel Environments (ISCOPE), LNCS
1505, pages 223-220. Springer, 1998
http://www.sorting-algorithms.com
WIKIPEDIA - http://en.wikipedia.org/wiki/Sorting_algorithm
http://en.wikipedia.org/wiki/Sorting
62
APPENDIX A
List of Sorting Algorithms

Exchange Sorts
1. Bubble sort: for each pair of indices, swap the items if out of order
2. Cocktail sort
3. Comb sort
4. Gnome sort
5. Odd-even sort
6. Quicksort: divide list into two, with all items on the first list coming
before all items on the second list; then sort the two lists. Often the
method of choice
Humorous or ineffective
7. Bogosort
8. Stooge sort
Hybrid
9. Introsort: begin with quicksort and switch to heapsort when the
recursion depth exceeds a certain level
10. JSort
11. Spread sort
12. Timsort: adaptive algorithm derived from merge sort and insertion
sort. Used in Python 2.3 and up, and Java SE 7.
13. UnShuffle sort
Insertion sorts
14. Cycle sort: in-place with theoretically optimal number of writes
15. Insertion sort: determine where the current item belongs in the list of
sorted ones, and insert it there
16. Library sort
63
17. Patience sorting
18. Shell sort: an attempt to improve insertion sort
19. Tree sort (binary tree sort): build binary tree, then traverse it to
create sorted list
Merge sorts
20. Merge sort: sort the first and second half of the list separately, then
merge the sorted lists
21. Polyphase merge sort
22. Strand sort
Distribution (Non-comparison) sorts

23. American flag sort
24. Bead sort
25. Bucket sort
26. Burstsort: build a compact, cache efficient burst trie and then traverse
it to create sorted output
27. Counting sort
28. Flashsort
29. Pigeonhole sort
30. Proxmap sort: variant of Bucket sort which takes advantage of
hierarchical structure
31. Radix sort: sorts strings letter by letter
Selection sorts
32. Cartesian Tree sort
33. Cycle sort
34. Heap sort: convert the list into a heap, keep removing the largest
element from the heap and adding it to the end of the list
35. Selection sort: pick the smallest of the remaining elements, add it to
the end of the sorted list
64
36. Smooth sort
37. Tournament sort
Concurrent sorts
38. Batch odd-even mergesort
39. Bitonic sorter
40. Pairwise sorting network
Other
41. Pancake sorting
42. Topological sort
Unknown class
43. Distribution sort
44. Jump sort
45. Samplesort
46. Shaker sort
47. Snake sort
65
APPENDIX B
Pseudocode Implementation of some Sorting Algorithms
(a) Bubble Sort
procedure bubbleSort( A : list of sortable items )

n = length(A)
repeat
swapped = false
for i = 1 to n-1 inclusive do
if A[i-1] > A[i] then
swap(A[i-1], A[i])
swapped = true
end if
end for
n = n - 1
until not swapped
end procedure
(b) Bucket Sort
function bucketSort(array, n) is
buckets ← new array of n empty lists
for i = 0 to (length(array)-1) do
insert array[i] into buckets[msbits(array[i], k)]
for i = 0 to n - 1 do
nextSort(buckets[i])
return the concatenation of buckets[0], ..., buckets[n-1]
66
(c) Counting Sort
''' allocate an array Count[0..k] ; initialize each array cell to zero

; THEN '''
for each input item x:
Count[key(x)] = Count[key(x)] + 1
total = 0
for i = 0, 1, ... k:
c = Count[i]
Count[i] = total
total = total + c
''' allocate an output array Output[0..n-1] ; THEN '''

for each input item x:
store x in Output[Count[key(x)]]
Count[key(x)] = Count[key(x)] + 1
return Output
67
(d) Merge Sort
function merge_sort(list m)
// if list size is 1, consider it sorted and return it
if length(m) <= 1
return m
// else list size is > 1, so split the list into two sublists
var list left, right
var integer middle = length(m) / 2
for each x in m up to middle
add x to left
for each x in m after or equal middle
add x to right
// recursively call merge_sort() to further split each sublist
// until sublist size is 1
left = merge_sort(left)
right = merge_sort(right)
// merge the sublists returned from prior calls to merge_sort()
// and return the resulting merged sublist
return merge(left, right)
68
APPENDIX C
Some Sorting Algorithms in Visual Basic
(a) Insert Sort

(b) QSort
(c) Quick Sort
(d) Shell Sort
= = = = = = = = = = = = = = = = = = = = = = = =
(A) insert sort
= = = = = = = = = = = = = = = = = = = = = = = =
Public Sub InsertSort(ByRef A() As Variant, _

ByVal Lb As Long, ByVal Ub As
Long)
Dim t As Variant
Dim i As Long
Dim j As Long
'sort A[Lb..Ub]
For i = Lb + 1 To Ub
t = A(i)
'shift elements down until insertion point

'found
For j = i - 1 To Lb Step -1
If A(j) <= t Then Exit For
A(j + 1) = A(j)
Next j
' insert
A(j + 1) = t
Next i
End Sub
69
= = = = = = = = = = = = = = = = = = = = = = = =
(B) QSort
= = = = = = = = = = = = = = = = = = = = = = = =
Public Sub QSort(ByRef A() As Variant, _

Long)
Dim lbStack(32) As Long

Dim ubStack(32) As Long
Dim sp As Long ' stack pointer
Dim lbx As Long ' current lower-
bound
Dim ubx As Long ' current upper-
bound
Dim m As Long
Dim p As Long ' index to pivot
Dim i As Long
Dim j As Long
Dim t As Variant ' temp used for exchanges
lbStack(0) = Lb
ubStack(0) = Ub
sp = 0
Do While sp >= 0
lbx = lbStack(sp)
ubx = ubStack(sp)
Do While (lbx < ubx)
' select pivot and exchange with 1st

' element
p = lbx + (ubx - lbx) \ 2
70
' exchange lbx, p
t = A(lbx)
A(lbx) = A(p)
A(p) = t
' partition into two segments

i = lbx + 1
j = ubx
Do
Do While i < j
If A(lbx) <= A(i) Then Exit Do
i = i + 1
Loop
Do While j >= i
If A(j) <= A(lbx) Then Exit Do
j = j - 1
Loop
If i >= j Then Exit Do
' exchange i, j
t = A(i)
A(i) = A(j)
A(j) = t
j = j - 1
i = i + 1
Loop
' pivot belongs in A[j]

' exchange lbx, j
t = A(lbx)
A(lbx) = A(j)
A(j) = t
71
m = j
' keep processing smallest segment, and

' stack largest
If m - lbx <= ubx - m Then
If m + 1 < ubx Then
lbStack(sp) = m + 1
ubStack(sp) = ubx
sp = sp + 1
End If
ubx = m - 1
Else
If m - 1 > lbx Then
lbStack(sp) = lbx
ubStack(sp) = m - 1
sp = sp + 1
End If
lbx = m + 1
End If
Loop
sp = sp - 1
Loop
End Sub
= = = = = = = = = = = = = = = = = = = = = = = =
(C) Quick Sort
= = = = = = = = = = = = = = = = = = = = = = = =
Public Sub QuickSort(ByRef A() As Variant, _

Long)
Dim m As Long
' sort array A(lb..ub)
Do While Lb < Ub
72
' quickly sort short lists
If (Ub - Lb <= 12) Then
Call InsertSort(A, Lb, Ub)
Exit Sub
End If
' partition into two segments

m = Partition(A, Lb, Ub)
' sort the smallest partition to minimize

' stack requirements
If m - Lb <= Ub - m Then
Call QuickSort(A, Lb, m - 1)
Lb = m + 1
Else
Call QuickSort(A, m + 1, Ub)
Ub = m - 1
End If
Loop
End Sub
---------
Private Function Partition(ByRef A() As Variant,_

ByVal Lb As Long, ByVal Ub As Long) _
As Long
Dim t As Variant
Dim pivot As Variant
Dim i As Long
Dim j As Long
Dim p As Long
' partition array[lb..ub]
' select pivot and exchange with 1st element

p = Lb + (Ub - Lb) \ 2
73
pivot = A(p)
A(p) = A(Lb)
' sort Lb+1 .. Ub based on pivot

i = Lb
j = Ub + 1
Do
Do
j = j - 1
While j > i And A(j) > pivot
Do
i = i + 1
While i < j And A(i) < pivot
If i >= j Then Exit Do
' swap A(i), A(j)

t = A(i)
A(i) = A(j)
A(j) = t
Loop
' pivot belongs in A(j)

A(Lb) = A(j)
A(j) = pivot
Partition = j
End Function
= = = = = = = = = = = = = = = = = = = = = = = =
(D) Shell Sort
= = = = = = = = = = = = = = = = = = = = = = = =
Public Sub ShellSort(ByRef A() As Variant, ByVal Lb

As Long, ByVal Ub As Long)
74
Dim n As Long
Dim h As Long
Dim i As Long
Dim j As Long
Dim s As Long
Dim t As Variant
Dim seq(28) As Long
' sort array[lb..ub]
s = ShellSeq(seq, Ub - Lb + 1)
Do While s >= 0
' sort by insertion in increments of h

h = seq(s)
For i = Lb + h To Ub
t = A(i)
For j = i - h To Lb Step -h
A(j + h) = A(j)
Next j
A(j + h) = t
Next i
s = s - 1
Loop
End Sub
Private Function ShellSeq(ByRef h() As Long, ByVal

n As Long)
' establish increment sequence (see Knuth, Vol

3)
Dim p1 As Long
Dim p2 As Long
75
Dim p3 As Long
Dim s As Long
p1 = 1
p2 = 1
p3 = 1
s = -1
Do
s = s + 1
If s Mod 2 Then
h(s) = 8 * p1 - 6 * p2 + 1
Else
h(s) = 9 * p1 - 9 * p3 + 1
p2 = 2 * p2
p3 = 2 * p3
End If
p1 = 2 * p1
Loop While 3 * h(s) < n
If s > 0 Then s = s - 1
ShellSeq = s
End Function
76
APPENDIX D
PROGRAM SOURCE CODE
MICROSOFT VISUAL STUDIO 2010 ULTIMATE
.NET FRAMEWORK 4 BETA 2
Imports System.IO
Imports Microsoft.VisualBasic.FileSystem
Imports System.Windows.Forms.KeyEventArgs
Public Class insertsortform

Inherits System.Windows.Forms.Form
Public Sub sort(ByVal aa As Windows.Forms.ListBox, _

ByVal bb As Windows.Forms.ListBox)
Dim elem As Integer = aa.Items.Count - 1

Dim array1(elem) As Object
Dim nu As Integer
For nu = 0 To elem
array1(nu) = aa.Items.Item(nu)
Next
array1 = all.InsertSort(array1, LBound(array1), UBound(array1))
If bb.Items.Count > 0 Then

bb.Items.Clear()
End If
For nu = 0 To elem
bb.Items.Add(array1(nu))
Next
''
End Sub
#Region "Form Load"
Private m_Items(9) As CustomerInfo
Private Sub insertsortform_Load(ByVal sender As System.Object, _

ByVal e As System.EventArgs) Handles
MyBase.Load
77
m_Items(0) = New CustomerInfo("Archer", 4)
m_Items(1) = New CustomerInfo("Beck", 7)
m_Items(2) = New CustomerInfo("Cantu", 2)
m_Items(3) = New CustomerInfo("Deevers", 1)
m_Items(4) = New CustomerInfo("Edwards", 9)
m_Items(5) = New CustomerInfo("Finnagan", 3)
m_Items(6) = New CustomerInfo("Guy", 5)
m_Items(7) = New CustomerInfo("Hennesey", 8)
m_Items(8) = New CustomerInfo("Jacquinth", 6)
m_Items(9) = New CustomerInfo("Irving", 10)
'' Uncomment to test duplicate objects.
'm_Items(6) = New CustomerInfo("Finnagan", 3)
'm_Items(7) = New CustomerInfo("Edwards", 9)
'm_Items(8) = New CustomerInfo("Finnagan", 3)
'm_Items(9) = New CustomerInfo("Deevers", 1)
' Start sorting by name.

radByName.Checked = True
txt.Focus()
End Sub
' Holds data.

Private Class CustomerInfo
Public Name As String
Public ID As Integer ' We sort on this value.
Public Sub New(ByVal new_name As String, ByVal new_id As Integer)

Name = new_name
ID = new_id
End Sub
Public Overrides Function ToString() As String

Return ID.ToString() & ". " & Name
End Function
End Class
#End Region
#Region "Tab1 Basics"
Private Sub txt_GotFocus(ByVal sender As Object, _

ByVal e As System.EventArgs) Handles txt.GotFocus
If addBtn.Enabled = False Then

ListBox2.Items.Clear()
addBtn.Enabled = True
End If
78
End Sub
Private Sub txt_KeyDown(ByVal sender As Object, _

ByVal e As System.Windows.Forms.KeyEventArgs)
If e.KeyCode = Windows.Forms.Keys.Return Or e.KeyValue() = 13 Then
add()
End If
End Sub
Private Sub Label5_Click(ByVal sender As System.Object, _

Label5.Click
End Sub
Private Sub addBtn_Click(ByVal sender As System.Object, _

addBtn.Click
add()
End Sub
Sub add()
If addBtn.Enabled = True Then
Dim item As Single = Val(txt.Text)
If item <= 0 Then
txt.Clear()
Exit Sub
End If
ListBox1.Items.Add(item)
txt.Clear()
End If
txt.Focus()
End Sub
Private Sub Button5_Click(ByVal sender As System.Object, _

ByVal e As System.EventArgs) Handles Button5.Click
With ListBox2.Items
For i = 1 To 7
Dim oOK As Integer = .Count - 1
Dim c1 As Integer = 0
While oOK > c1
If .Item(c1) = .Item(c1 + 1) Then
79
.RemoveAt(c1)
oOK -= 1
End If
c1 += 1
End While
Next
End With
End Sub
Private Sub sortBtn_Click(ByVal sender As System.Object, _

ByVal e As System.EventArgs) Handles sortBtn.Click
sort(ListBox1, ListBox2)
addBtn.Enabled = False
End Sub
' to Remove individual items from listbox1

Private Sub ListBox1_KeyDown(ByVal sender As Object, ByVal e As
System.Windows.Forms.KeyEventArgs)
If e.KeyCode = Windows.Forms.Keys.Delete Then
ListBox1.Items.RemoveAt(ListBox1.SelectedIndex)
End If
End Sub
Private Sub clearBtn_Click(ByVal sender As System.Object, ByVal e As

System.EventArgs) Handles clearBtn.Click
End Sub
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As

System.EventArgs) Handles Button1.Click
Dim auto As Integer = Val(InputBox("Number of items to iterate

through. (1 - 262144)" & vbNewLine & "Dataset greater than 131072 may slow
down your PC (depending on your processor)", "Automate", 0))
If auto < 1 Or auto > 262144 Then

MsgBox("Input should be a positive Integer between 1 and 262144",
MsgBoxStyle.Critical, "Error!")
Exit Sub
End If
Dim tim = Timer
80
For autoGen = 1 To auto
ListBox1.Items.Add(Int((1000 * Rnd()) + 1))
Next
MsgBox("Dataset of " & auto & " items was sorted in " & (Timer - tim)
& " seconds!", MsgBoxStyle.Information, "TIME")
End Sub
#End Region
#Region "Tab3 Sort Text File"
Private Sub openSplitText_Click(ByVal sender As System.Object, _

ByVal e As System.EventArgs) Handles openSplitText.Click
Dim content As String
Try
content = FileIO.FileSystem.ReadAllText(dir.Text)
Catch
MsgBox("check your file dir")
dir.Focus()
Exit Sub
End Try
' while the text file is not empty,

Dim lent As Integer = content.Length
len.Text = lent
If lent < 1 Then

MsgBox("Empty File")
dir.Focus()
Exit Sub
End If
Dim delimit As String = delim.Text

Dim split As String()
Select Case delimit

Case "<any>"
split = content.Split(New [Char]() {" "c, ","c, ":"c, "/"c,
"*"c, CChar(vbTab), CChar(vbNewLine)})
Case "New Line"
split = content.Split(New [Char]() {CChar(vbNewLine)})
Case "Tab"
split = content.Split(New [Char]() {CChar(vbTab)})
Case "Space"
81
split = content.Split(New [Char]() {" "c})
Case "Comma"
split = content.Split(New [Char]() {","c})
Case "Underscore"
split = content.Split(New [Char]() {"_"c})
Case "@"
split = content.Split(New [Char]() {"@"c})
Case "/"
split = content.Split(New [Char]() {"/"c})
Case "\"
split = content.Split(New [Char]() {"\"c})
Case "#"
split = content.Split(New [Char]() {"#"c})
Case "$"
split = content.Split(New [Char]() {"$"c})
Case ":"
split = content.Split(New [Char]() {":"c})
Case "&"
split = content.Split(New [Char]() {"&"c})
Case "*"
split = content.Split(New [Char]() {"*"c})
Case "-"
split = content.Split(New [Char]() {"-"c})
Case "+"
split = content.Split(New [Char]() {"+"c})
Case "="
split = content.Split(New [Char]() {"="c})
Case Else
split = content.Split(New [Char]() {" "c, ","c, ":"c, "/"c,
CChar(vbTab), CChar(vbNewLine)})
End Select
Dim numb As Integer = split.GetLength(0)

num.Text = numb
For Each s As String In split

If s.Trim() <> "" Then
ListBox3.Items.Add(s)
End If
Next
End Sub
Private Sub sortText_Click(ByVal sender As System.Object, _

ByVal e As System.EventArgs) Handles sortText.Click
82
End Sub
Private Sub reWriteText_Click(ByVal sender As System.Object, _

reWriteText.Click
Dim a As String
Dim content As String
content = FileIO.FileSystem.ReadAllText(dir.Text)
FileIO.FileSystem.WriteAllText(dir.Text, "", False)
Try
For i = 0 To ListBox3.Items.Count - 1
a = ListBox3.Items.Item(i) & vbNewLine
FileIO.FileSystem.WriteAllText(dir.Text, a, True)
Next
Catch ex As Exception
FileIO.FileSystem.WriteAllText(dir.Text, content, False)
MsgBox("an error occured")
End Try
End Sub
#End Region
#Region "Tab5 Key Fields"
' Compares CustomerInfo objects for sorting.

Private Class CustomerInfoComparer
Implements System.Collections.IComparer
' Determines whether we compare by name or ID.

Public Enum CompareTypes
ByName
ById
End Enum
Public CompareType As CompareTypes = CompareTypes.ById
Public Sub New(ByVal compare_type As CompareTypes)

CompareType = compare_type
End Sub
' Compare two CustomerInfo objects.

Public Function Compare(ByVal x As Object, ByVal y As Object) _
As Integer Implements System.Collections.IComparer.Compare
Dim x_cust As CustomerInfo = DirectCast(x, CustomerInfo)
Dim y_cust As CustomerInfo = DirectCast(y, CustomerInfo)
83
Select Case CompareType
Case CompareTypes.ById
Return x_cust.ID.CompareTo(y_cust.ID)
Case CompareTypes.ByName
Return x_cust.Name.CompareTo(y_cust.Name)
End Select
End Function
End Class
' Sort by name.

Private Sub radByName_CheckedChanged(ByVal sender As System.Object, _
radByName.CheckedChanged
If radByName.Checked Then
SortBy(CustomerInfoComparer.CompareTypes.ByName)
End Sub
' Sort by ID.

Private Sub radById_CheckedChanged(ByVal sender As System.Object, _
radById.CheckedChanged
If radById.Checked Then SortBy(CustomerInfoComparer.CompareTypes.ById)
End Sub
' Sort by name or ID.

Private Sub SortBy(ByVal compare_type As
CustomerInfoComparer.CompareTypes)
' Sort.
Dim cust_comparer As New CustomerInfoComparer(compare_type)
Array.Sort(m_Items, cust_comparer)
' Display the items.

lstItems.Items.Clear()
For i As Integer = 0 To m_Items.Length - 1
lstItems.Items.Add(m_Items(i).ToString())
Next i
End Sub
#End Region
End Class
Module all
' insert sort

Public Function InsertSort(ByRef A() As Object, ByVal Lb As Long, _
ByVal Ub As Long)
84
Dim t As Object
Dim i As Long
Dim j As Long
' sort A[Lb..Ub]

For i = Lb + 1 To Ub
t = A(i)
' shift elements down until insertion point found

For j = i - 1 To Lb Step -1
A(j + 1) = A(j)
Next j
' insert
A(j + 1) = t
Next i
InsertSort = A
End Function
End Module
85

Implementation of Sorting Algorithms

Hochgeladen von

Dokumentinformationen

Originalbeschreibung:

Copyright

Verfügbare Formate

Dieses Dokument teilen

Dokument teilen oder einbetten

Freigabeoptionen

Stufen Sie dieses Dokument als nützlich ein?

Sind diese Inhalte unangemessen?

Copyright:

Verfügbare Formate

Implementation of Sorting Algorithms

Hochgeladen von

Copyright:

Verfügbare Formate

IMPLEMENTATION OF

ERINFOLAMI DANIEL OLUWATOSIN

BEING A PROJECT SUBMITTED TO THE

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE AWARD OF

BACHELOR OF SCIENCE (B.SC) DEGREE IN COMPUTER SCIENCE

Oluwatosin with matriculation number 074617 of the Department of

Mathematical Science (Computer Science), Faculty of Science, Ekiti State

provision throughout my course of study in Ekiti State University and also

has always been my Guide, my mentor, my Provider who has been

protecting me up till now, I owe him everything.

To my able Supervisor Mr. A.A Obayomi who is always being there

manuscript and making constructive criticisms, useful advice and

I also appreciate the H.O.D of the department Dr R.A Adeleke, my

level coordinator, Mr. Ekong, and other lecturers in the department.

by sponsoring my education; I love you so much.

Abimbola Akande and a special crystal, Olutolami Falaiye. Also to my

CHAPTER ONE: INTRODUCTION

1.2 Aims and Objectives of The Project 5

1.4 Scope and Limitations of the Project 6

1.5 Structure of The Project 7

2.1 History of Sorting 9

2.6 Popular Sorting Algorithms 20

2.6.1 Bubble Sort 20

2.6.2 Cocktail Sort 22

2.6.3 Comb Sort 22

2.6.4 Gnome Sort 23

2.6.5 Heap Sort 24

2.6.6 Insertion Sort 25

2.6.8 Jump Sort 27

2.6.9 Merge Sort 28

2.6.10 Quick Sort 29

2.6.11 Quick Sort 3 33

2.6.13 Shaker Sort 35

2.6.14 Shell Sort 35

2.6.15 Smooth Sort 38

2.6.16 Snake Sort 38

2.7.1 Comparison Sorts 40

2.7.2 Non-Comparison Sorts 43

2.7.3 Impractical Sorts 45

CHAPTER THREE: SYSTEM ANALYSIS AND DESIGN

3.2 The Ideal Sorting Algorithm 47

3.3 Characteristics of a Good System Design 48

3.4 Development Tools 49

3.5 Overview of the Proposed Program 49

3.6 Program Architecture 50

3.7 The User Interface 50

4.1 System Requirement 52

4.2 Installation Description 53

4.3 Program Implementation 54

4.3.1 Tab 1: Basic Sorting 54

4.3.2 Tab 2: Sort File 56

4.3.3 Tab 3: Sort by Key Field 58

CHAPTER FIVE: CONCLUSION AND RECOMMENDATION

Appendix A: List of Sorting Algorithms 63

Appendix B: Pseudocode Implementation of some Sorting Algorithms 66

Appendix D: Program Source Code 77

This project presents and implements a collection of algorithms for sorting.

basic sorting concepts and classifications of sorting algorithms. The next

popular algorithms, in Visual Basic, is included.

The implementation, in Visual Basic, is a Windows Forms application.

At first the sorting algorithms will be applied to an array of integers.

Although this is a good way to introduce the algorithms, it is not very

common in practical situations to sort an array of simple numbers. Usually,