M4 Sorting

Resmi N.G.
References: Data Structures and Algorithms: Alfred V. Aho, John E. Hopcroft, Jeffrey D. Ullman
Syllabus
Searching - Sequential Search - Searching Arrays and Linked Lists - Binary Searching - Searching Arrays and Binary Search Trees
Hashing - Open & Closed Hashing - Hash Functions - Resolution of Collision
Sorting - n^2 Sorts - Bubble Sort - Insertion Sort - Selection Sort - n log n Sorts - Quick Sort - Heap Sort - Merge Sort - External Sort - Merge Files
10/25/2012 CS09 303 Data Structures - Module 4 2
Bubble Sort
Bubble sort is a comparison-sort algorithm. The algorithm starts at one end of the data set. It compares two adjacent elements, and if they are in the wrong order, it swaps them. It continues doing this for each pair of adjacent elements, to the other end of the data set.
The algorithm gets its name from the way smaller elements "bubble" to the top of the list (or larger elements bubble to the end of the list). Because it operates on elements only by comparing them, it is a comparison sort.
Bubble Sort (from the beginning)

The algorithm starts at the beginning of the data set. It compares the first two elements, and if they are in the wrong order, it swaps them. It continues doing this for each pair of adjacent elements, to the end of the data set. After the first pass, the largest element will be at the last position.
It then starts again with the first two elements, repeating the process until a pass completes with no swaps. After the second pass, the second largest element will be in the second-last position in the array, and so on.
Bubble Sort (from the end)

The algorithm starts at the end of the data set. It compares the last two elements, and if they are in the wrong order, it swaps them. It continues doing this for each pair of adjacent elements, to the beginning of the data set. After the first pass, the smallest element will be at the first position.
It then starts again from the end, leaving behind the first element, which is already in its final position, and compares adjacent elements as before, repeating the process until a pass completes with no swaps. After the second pass, the second smallest element will be in its right position in the array, and so on.
Algorithm 1
{Smaller elements bubble to the beginning of the list.}
for i := 1 to n-1 do
    for j := n downto i+1 do
        if A[j] < A[j-1] then begin
            temp := A[j];
            A[j] := A[j-1];
            A[j-1] := temp
        end;
Algorithm 2
{Larger elements bubble to the end of the list.}
for i := 1 to n-1 do
    for j := 1 to n-i do
        if A[j] > A[j+1] then begin
            temp := A[j];
            A[j] := A[j+1];
            A[j+1] := temp
        end;
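As a rough illustration of Algorithm 2, the sketch below bubbles larger elements to the end of the list and stops early once a pass makes no swaps, as the description above suggests. The function name `bubble_sort`, the `swapped` flag, and the 0-based indexing are assumptions of this sketch, not part of the original pseudocode.

```python
def bubble_sort(items):
    """Return a sorted copy; larger elements bubble to the end of the list."""
    a = list(items)
    n = len(a)
    for i in range(1, n):              # passes 1 .. n-1, as in Algorithm 2
        swapped = False
        for j in range(0, n - i):      # 0-based j: compare a[j] with a[j+1]
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:                # a pass with no swaps: already sorted
            break
    return a


print(bubble_sort([5, 1, 3, 4, 6, 2]))  # [1, 2, 3, 4, 5, 6]
```

Without the `swapped` check the loop structure is exactly Algorithm 2; the flag only saves passes on input that becomes sorted early.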
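First Pass : i=1
Start:                             5 1 3 4 6 2
j=1: A[1]=5 > A[2]=1, swap         1 5 3 4 6 2
j=2: A[2]=5 > A[3]=3, swap         1 3 5 4 6 2
j=3: A[3]=5 > A[4]=4, swap         1 3 4 5 6 2
j=4: A[4]=5 < A[5]=6, no swap      1 3 4 5 6 2
j=5: A[5]=6 > A[6]=2, swap         1 3 4 5 2 6
Sorted: A[6] = 6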
Second Pass : i=2
Start:                             1 3 4 5 2 6
j=1: A[1]=1 < A[2]=3, no swap      1 3 4 5 2 6
j=2: A[2]=3 < A[3]=4, no swap      1 3 4 5 2 6
j=3: A[3]=4 < A[4]=5, no swap      1 3 4 5 2 6
j=4: A[4]=5 > A[5]=2, swap         1 3 4 2 5 6
Sorted: A[5..6] = 5 6
Third Pass : i=3
Start:                             1 3 4 2 5 6
j=1: A[1]=1 < A[2]=3, no swap      1 3 4 2 5 6
j=2: A[2]=3 < A[3]=4, no swap      1 3 4 2 5 6
j=3: A[3]=4 > A[4]=2, swap         1 3 2 4 5 6
Sorted: A[4..6] = 4 5 6
Fourth Pass : i=4
Start:                             1 3 2 4 5 6
j=1: A[1]=1 < A[2]=3, no swap      1 3 2 4 5 6
j=2: A[2]=3 > A[3]=2, swap         1 2 3 4 5 6
Sorted: A[3..6] = 3 4 5 6
Fifth Pass : i=5
Start:                             1 2 3 4 5 6
j=1: A[1]=1 < A[2]=2, no swap      1 2 3 4 5 6
Sorted: A[1..6] = 1 2 3 4 5 6
Selection Sort
The algorithm finds the minimum value, swaps it with the value in the first position, and repeats these steps for the remainder of the list. In the ith pass, the lowest among A[i], ..., A[n] is selected and swapped with A[i]. After i passes, the lowest i keys will occupy A[1], A[2], ..., A[i] in sorted order. It does no more than n-1 swaps for an array of n elements.
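The selection step described above can be sketched as follows. The function name `selection_sort`, the variable `low`, and the returned swap count are illustrative assumptions; the count is included only to show how few swaps the method performs.

```python
def selection_sort(items):
    """Return (sorted copy, number of swaps performed)."""
    a = list(items)
    n = len(a)
    swaps = 0
    for i in range(n - 1):
        low = i                        # index of the lowest key in a[i:]
        for j in range(i + 1, n):
            if a[j] < a[low]:
                low = j
        if low != i:                   # swap only when a smaller key was found
            a[i], a[low] = a[low], a[i]
            swaps += 1
    return a, swaps


print(selection_sort([5, 1, 3, 4, 6, 2]))  # ([1, 2, 3, 4, 5, 6], 3)
```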
First Pass : i=1
Start:                                    5 1 3 4 6 2
i=1, min=1 (A[min]=5)
j=2: A[2]=1 < A[min]=5, min=2
j=3: A[3]=3 > A[min]=1, min unchanged
j=4: A[4]=4 > A[min]=1, min unchanged
j=5: A[5]=6 > A[min]=1, min unchanged
j=6: A[6]=2 > A[min]=1, min unchanged
Swap A[i]=5 with A[min]=1                 1 5 3 4 6 2
Sorted: A[1] = 1
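Second Pass : i=2
Start:                                    1 5 3 4 6 2
i=2, min=2 (A[min]=5)
j=3: A[3]=3 < A[min]=5, min=3
j=4: A[4]=4 > A[min]=3, min unchanged
j=5: A[5]=6 > A[min]=3, min unchanged
j=6: A[6]=2 < A[min]=3, min=6
Swap A[i]=5 with A[min]=2                 1 2 3 4 6 5
Sorted: A[1..2] = 1 2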
Third Pass : i=3
Start:                                    1 2 3 4 6 5
i=3, min=3 (A[min]=3)
j=4: A[4]=4 > A[min]=3, min unchanged
j=5: A[5]=6 > A[min]=3, min unchanged
j=6: A[6]=5 > A[min]=3, min unchanged
No swap                                   1 2 3 4 6 5
Sorted: A[1..3] = 1 2 3
Fourth Pass : i=4
Start:                                    1 2 3 4 6 5
i=4, min=4 (A[min]=4)
j=5: A[5]=6 > A[min]=4, min unchanged
j=6: A[6]=5 > A[min]=4, min unchanged
No swap                                   1 2 3 4 6 5
Sorted: A[1..4] = 1 2 3 4
Fifth Pass : i=5
Start:                                    1 2 3 4 6 5
i=5, min=5 (A[min]=6)
j=6: A[6]=5 < A[min]=6, min=6
Swap A[i]=6 with A[min]=5                 1 2 3 4 5 6
Sorted: A[1..6] = 1 2 3 4 5 6
Insertion Sort
Insertion sort is a sorting algorithm that builds the final sorted array (or list) one item at a time. Here, on the ith pass, the ith element A[i] is inserted into its right position among A[1], ..., A[i-1], which were previously placed in sorted order. After inserting A[i], A[1], A[2], ..., A[i] are in sorted order.
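A minimal sketch of the insertion step, assuming the element is moved left by repeated swaps with its predecessor. The name `insertion_sort` and the explicit `j > 0` bounds check are choices made for this sketch; other formulations shift elements or use a sentinel instead.

```python
def insertion_sort(items):
    """Return a sorted copy, inserting each element into the sorted prefix."""
    a = list(items)
    for i in range(1, len(a)):
        j = i
        # Move a[j] left while its predecessor is larger.
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    return a


print(insertion_sort([18, 7, 20, 11, 15, 9]))  # [7, 9, 11, 15, 18, 20]
```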
i=2
Start:                                    18 7 20 11 15 9
j=2: A[1]=18 > A[2]=7, swap               7 18 20 11 15 9
j=1: A[0] (shown as -) lies before the array, stop
Sorted: A[1..2] = 7 18
i=3
Start:                                    7 18 20 11 15 9
j=3: A[2]=18 < A[3]=20, no swap, stop     7 18 20 11 15 9
Sorted: A[1..3] = 7 18 20
i=4
Start:                                    7 18 20 11 15 9
j=4: A[3]=20 > A[4]=11, swap              7 18 11 20 15 9
j=3: A[2]=18 > A[3]=11, swap              7 11 18 20 15 9
j=2: A[1]=7 < A[2]=11, no swap, stop      7 11 18 20 15 9
Sorted: A[1..4] = 7 11 18 20
i=5
Start:                                    7 11 18 20 15 9
j=5: A[4]=20 > A[5]=15, swap              7 11 18 15 20 9
j=4: A[3]=18 > A[4]=15, swap              7 11 15 18 20 9
j=3: A[2]=11 < A[3]=15, no swap, stop     7 11 15 18 20 9
Sorted: A[1..5] = 7 11 15 18 20
i=6
Start:                                    7 11 15 18 20 9
j=6: A[5]=20 > A[6]=9, swap               7 11 15 18 9 20
j=5: A[4]=18 > A[5]=9, swap               7 11 15 9 18 20
j=4: A[3]=15 > A[4]=9, swap               7 11 9 15 18 20
j=3: A[2]=11 > A[3]=9, swap               7 9 11 15 18 20
j=2: A[1]=7 < A[2]=9, no swap, stop       7 9 11 15 18 20
Sorted: A[1..6] = 7 9 11 15 18 20
Quick Sort
Quicksort is a divide and conquer algorithm. The steps are:
1) Pick an element, called a pivot, from the list.
2) Reorder the list so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.
3) Recursively sort the sub-list of lesser elements and the sub-list of greater elements.
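The three quicksort steps can be sketched as below. The slides do not say how the pivot is chosen or how the partition is carried out; this sketch assumes the last element as pivot and a single left-to-right partitioning scan, and the names `partition` and `quick_sort` are illustrative.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] around the pivot a[hi]; return the pivot's final index."""
    pivot = a[hi]                      # assumption: last element is the pivot
    i = lo                             # a[lo..i-1] holds elements < pivot
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]          # pivot moves to its final position
    return i


def quick_sort(a, lo=0, hi=None):
    """Sort the list a in place and return it."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = partition(a, lo, hi)
        quick_sort(a, lo, p - 1)       # sub-list of lesser elements
        quick_sort(a, p + 1, hi)       # sub-list of greater (or equal) elements
    return a


print(quick_sort([37, 23, 6, 89, 15, 12, 2, 19]))
# [2, 6, 12, 15, 19, 23, 37, 89]
```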
Heap
A heap is a specialized tree-based data structure in which all the nodes satisfy the heap property: Either the keys of parent nodes are always greater than or equal to those of the children and the highest key is in the root node (this kind of heap is called max heap) or the keys of parent nodes are less than or equal to those of the children (min heap).
Min-Heap: A balanced, left-justified binary tree in which no node has a value less than the value in its parent.
Max-Heap: A balanced, left-justified binary tree in which no node has a value greater than the value in its parent.
Constructing a Heap
Construct a heap by adding nodes one at a time:
Add the node just to the right of the rightmost node in the deepest level. If the deepest level is full, start a new level.
The new node may violate the heap property. If it does, swap it with its parent, and repeat moving up the tree until the heap property holds again.
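One possible way to express this construction is to store the tree level by level in an array, so that "the next free position in the deepest level" is simply the end of the array. The sketch below builds a min-heap this way; the array layout, the name `heap_insert`, and the upward swapping step that restores the heap property are assumptions of this sketch.

```python
def heap_insert(heap, key):
    """Insert key into a min-heap stored level by level in a Python list."""
    heap.append(key)                   # next free position in the deepest level
    child = len(heap) - 1
    while child > 0:
        parent = (child - 1) // 2
        if heap[parent] <= heap[child]:
            break                      # heap property holds
        heap[parent], heap[child] = heap[child], heap[parent]
        child = parent                 # continue towards the root


heap = []
for key in [5, 1, 3, 4, 6, 2]:
    heap_insert(heap, key)
print(heap)  # [1, 4, 2, 5, 6, 3]
```

In this layout the children of the node at index p sit at indices 2p+1 and 2p+2, so the tree stays balanced and left-justified automatically.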
Heap Sort
Heapsort is an in-place algorithm. It is a two-step algorithm.
The first step is to build a heap out of the data.
The second step is repeated until all the elements have been removed from the heap: remove the smallest (or largest) element from the min-heap (or max-heap) and insert it into the array, then reconstruct the heap.
After all the elements have been removed from the heap, we have a sorted array.
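The two steps might look like this in an array-based form. The sketch assumes a max-heap, so each removed maximum is placed at the end of the shrinking heap region; the helper name `sift_down` and the bottom-up way of building the heap are choices made here, not something the slides specify.

```python
def sift_down(a, root, end):
    """Restore the max-heap property for the subtree at root within a[:end]."""
    while True:
        child = 2 * root + 1
        if child >= end:
            return
        if child + 1 < end and a[child + 1] > a[child]:
            child += 1                 # pick the larger child
        if a[root] >= a[child]:
            return
        a[root], a[child] = a[child], a[root]
        root = child


def heap_sort(items):
    """Return a sorted copy of items (the copy itself is sorted in place)."""
    a = list(items)
    n = len(a)
    for root in range(n // 2 - 1, -1, -1):   # step 1: build a max-heap
        sift_down(a, root, n)
    for end in range(n - 1, 0, -1):          # step 2: remove the largest
        a[0], a[end] = a[end], a[0]          # place it after the heap region
        sift_down(a, 0, end)                 # reconstruct the heap
    return a


print(heap_sort([5, 1, 3, 4, 6, 2]))  # [1, 2, 3, 4, 5, 6]
```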
Merge Sort
Merge sort is an O(n log n) comparison-based sorting algorithm. It is a divide and conquer algorithm that works as follows:
1) Divide the unsorted list into n sublists, each containing 1 element (a list of 1 element is considered sorted).
2) Repeatedly merge the sublists to produce new sorted sublists until there is only 1 sublist remaining. This will be the sorted list.
Mergesort
A divide-and-conquer algorithm:
Divide the unsorted array into 2 halves until the subarrays contain only one element.
Merge the sub-problem solutions together:
    Compare the sub-arrays' first elements.
    Remove the smaller element and put it into the result array.
    Continue the process until all elements have been put into the result array.
Example array: 37 23 6 89 15 12 2 19
Informal Algorithm
Mergesort(Passed an array)
    if array size > 1
        Divide array in half.
        Call Mergesort on first half.
        Call Mergesort on second half.
        Merge two halves.
Merge(Passed two arrays)
    Compare leading element in each array.
    Select lower and place in new array.
    Repeat until every element of both arrays has been placed.
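The informal algorithm translates fairly directly into code. In the sketch below the names `merge` and `merge_sort` follow the pseudocode; returning new lists rather than sorting in place is an assumption made to keep the example short.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:        # select the lower leading element
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])            # one side is exhausted; copy the rest
    result.extend(right[j:])
    return result


def merge_sort(a):
    """Return a sorted copy of a."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))


print(merge_sort([37, 23, 6, 89, 15, 12, 2, 19]))
# [2, 6, 12, 15, 19, 23, 37, 89]
```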
Divide the unsorted collection into two until the subarrays only contain one element. Then merge the sub-problem solutions together.
External Sorting
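Merge sort (merge files): sorted runs stored in files are merged repeatedly until a single sorted file remains. Sorting n records this way takes O(log n) merge passes over the data.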
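As a rough, in-memory illustration of merging files, the sketch below uses Python lists as stand-ins for files on disk. Sorted runs are merged two at a time, and the number of passes is counted. The names `merge_pass` and `external_sort` and the `run_size` parameter are hypothetical; a real external sort would read and write the runs as files.

```python
import heapq


def merge_pass(runs):
    """Merge neighbouring runs two at a time; returns about half as many runs."""
    merged = []
    for k in range(0, len(runs), 2):
        pair = runs[k:k + 2]                      # one or two sorted runs
        merged.append(list(heapq.merge(*pair)))   # merge of sorted inputs
    return merged


def external_sort(records, run_size=1):
    """Return (sorted records, number of merge passes)."""
    # Initial runs: chunks small enough to sort in memory (an assumption).
    runs = [sorted(records[i:i + run_size])
            for i in range(0, len(records), run_size)]
    passes = 0
    while len(runs) > 1:
        runs = merge_pass(runs)
        passes += 1
    return (runs[0] if runs else []), passes


print(external_sort([37, 23, 6, 89, 15, 12, 2, 19]))
# ([2, 6, 12, 15, 19, 23, 37, 89], 3)
```

With 8 initial runs of one record each, the runs shrink from 8 to 4 to 2 to 1, which is 3 passes, matching log2(8).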
HASHING
Hashing is a method of information retrieval, typically used in database management systems and other systems in which rapid storage and retrieval of information is necessary.
Hashing is used to compute the location of the desired record in order to retrieve it in a single access. The value used to locate the record is called a key, e.g. empcode in an employee file.
Hashing takes a potentially huge range of values and maps it to a much smaller range of values.
It is used to implement dictionaries.
Hash Tables
Motivation: symbol tables
A compiler uses a symbol table to relate symbols to associated data.
Symbols: variable names, procedure names, etc.
Associated data: memory location, call graph, etc.
For a symbol table (also called a dictionary), we care about search, insertion, and deletion.
HASH FUNCTIONS
A hash function is the transformation of a key into the corresponding location in the hash table.
A hash function H can be defined as a function that takes a key as input and transforms it into a hash table index.
H: KEY ------> INDEX or ADDRESS
There are mainly 3 hash functions:
Division Method
Mid-Square Method
Folding Method
The Division Method (MODULO arithmetic)

h(k) = k mod m
Hash the key k into a table with m slots using the slot given by the remainder of k divided by m.
Pick the table size m to be a prime number not too close to a power of 2 (or 10), to keep the number of collisions low.
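A short sketch of the division method, using keys that appear in the examples of this module. The function name `h_division` is an assumption; the computation is just the remainder operator.

```python
def h_division(k, m):
    """Division method: the slot is the remainder of k divided by m."""
    return k % m


# Table with m = 7 slots.
for key in (49, 500, 11, 14):
    print(key, "->", h_division(key, 7))
# 49 -> 0
# 500 -> 3
# 11 -> 4
# 14 -> 0   (same slot as 49)

# Table with m = 100 slots.
print(h_division(999, 100), h_division(524, 100), h_division(199, 100))
# 99 24 99
```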
A simple hash function H(k)

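function H(x: array[1..10] of char): 0..n-1;
{n is the number of slots in the hash table}
var i, sum: integer;
begin
    sum := 0;
    {add up the character codes of the 10-character key}
    for i := 1 to 10 do
        sum := sum + ord(x[i]);
    H := sum mod n
end; {H}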
H(k) = k mod m
HASH TABLE: Let m = 7, where TABLE contains 5 records (i.e., m should be selected such that it is greater than the total number of records in the TABLE).
Hash address    Employee code (key, K)    Employee name
0               49                        John
1
2
3               500                       Tom
4               11                        Bell
Table size of 100
3-digit numbers are the keys: 999 possible items
Indices 0..99 on the table
999 % 100 = 99 (100 is the table size)
524 % 100 = 24
199 % 100 = 99 (COLLISION)
Mid- Square Method

In mid-square hashing, the key is squared and the address is selected from the middle of the squared number. The most obvious limitation of this method is the size of the key. Given a key of 6 digits, the square will have up to 12 digits, which is beyond the maximum integer size of many computers.
H(k) = middle digit(s) of k^2
The same number of digits must be used for all of the keys.
K       14     15     26
K^2     196    225    676
H(K)    9      2      7
Hash address    Employee code (key, K)    Employee name
0
1
2               15                        Anu
3
4
5
6
7               26                        Sam
8
9               14                        Neenu
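The mid-square computation for two-digit keys such as 14, 15 and 26 can be sketched as follows. The parameters `width` (the number of digits the square is padded to) and `r` (how many middle digits to keep) are assumptions introduced for this example.

```python
def h_mid_square(k, width=3, r=1):
    """Mid-square method: take r digits from the middle of k squared."""
    squared = str(k * k).zfill(width)  # pad so every key yields the same width
    start = (len(squared) - r) // 2
    return int(squared[start:start + r])


for key in (14, 15, 26):
    print(key, key * key, h_mid_square(key))
# 14 196 9
# 15 225 2
# 26 676 7
```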
FOLDING Method
H(K) = K1 + K2 + .... + Kr
The key is partitioned into a number of parts. The parts should have the same number of digits as the required hash address. Then the parts are added together, ignoring the last carry.
K             2103          7148                            12345
K1, K2, K3    21, 03        71, 48                          12, 34, 5
H(K)          21+03 = 24    71+48 = 119, drop carry: 19     12+34+5 = 51
H(K) = K1 + K2 + .... + Kr
Extra milling can also be applied to the even-numbered parts, i.e. K2, K4 are reversed before addition.
K                   2103          7148                            12345
K1, K2, K3          21, 03        71, 48                          12, 34, 5
Reversing K2, K4    21, 30        71, 84                          12, 43, 5
H(K)                21+30 = 51    71+84 = 155, drop carry: 55     12+43+5 = 60
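Both folding variants can be sketched in one function. The name `h_fold` and the parameters `part_len` and `reverse_even` are assumptions for illustration; the key is split from the left into parts of `part_len` digits, and the carry is dropped by keeping only the last `part_len` digits of the sum.

```python
def h_fold(k, part_len=2, reverse_even=False):
    """Folding method: split the key into parts, add them, drop the carry."""
    digits = str(k)
    parts = [digits[i:i + part_len] for i in range(0, len(digits), part_len)]
    if reverse_even:
        # Reverse the even-numbered parts K2, K4, ... (list indices 1, 3, ...).
        parts = [p[::-1] if idx % 2 == 1 else p for idx, p in enumerate(parts)]
    total = sum(int(p) for p in parts)
    return total % (10 ** part_len)    # ignore the last carry


for key in (2103, 7148, 12345):
    print(key, h_fold(key), h_fold(key, reverse_even=True))
# 2103 24 51
# 7148 19 55
# 12345 51 60
```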
Hash Collision
Sometimes, 2 different keys may hash to the same location! This is called a COLLISION.
Hash address    Employee code (key, K)    Employee name
0               49                        Anju
1
2
3               500                       Meena
4               11                        Clark
If a key 14 occurs, there is a collision at address 0.
Collision Resolution
Handling Collisions - Techniques
Two major strategies:
1) Open Addressing - Find another spot in the "Table" (same contiguous address space)
2) Chaining - Find another spot outside the "Table"
Resolving Collisions
Solution 1: Chaining
Keep a linked list of elements in each slot. Upon collision, just add the new element to the list.
Solution 2: Open addressing
To insert: if a slot is full, try another slot, and another, until an open slot is found (linear probing).
To search: follow the same sequence of probes as would be used when inserting the element.
Solution 3: Bucket addressing

Chaining
How do we insert an element?
[Figure: keys k1 to k8 from the universe of keys U hash into the slots of table T; each slot holds a linked list of the keys that hash to it.]
Chaining
How do we search for an element with a given key?
[Figure: table T with a linked list in each slot holding the keys k1 to k8 from the universe of keys U.]
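One way to answer both questions is sketched below. To insert, hash the key and add the element to the list in that slot; to search, hash the key and scan only that slot's list. The class name `ChainedHashTable`, the use of Python lists in place of linked lists, and the division-method hash with m = 7 are assumptions of this sketch, and the name stored for key 14 is made up for the example.

```python
class ChainedHashTable:
    """Hash table with chaining; each slot holds a list of (key, value) pairs."""

    def __init__(self, m=7):
        self.m = m
        self.slots = [[] for _ in range(m)]

    def _h(self, key):
        return key % self.m            # division method (assumed here)

    def insert(self, key, value):
        chain = self.slots[self._h(key)]
        for idx, (k, _) in enumerate(chain):
            if k == key:               # key already present: update it
                chain[idx] = (key, value)
                return
        chain.append((key, value))     # collision or empty slot: add to chain

    def search(self, key):
        for k, v in self.slots[self._h(key)]:
            if k == key:
                return v
        return None                    # key not in the table


table = ChainedHashTable(m=7)
table.insert(49, "John")
table.insert(500, "Tom")
table.insert(11, "Bell")
table.insert(14, "Example")            # hypothetical record; collides with 49
print(table.slots[0])                  # [(49, 'John'), (14, 'Example')]
print(table.search(14), table.search(21))  # Example None
```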
Variation of Open addressing

Quadratic probing
Suppose a record R with key k has the hash address H(k) = h. Then instead of searching the locations with addresses h, h+1, h+2, ..., h+i, ..., we search for a free hash address among h, h+1, h+4, h+9, ..., h+i^2, ...
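The difference between the two probe sequences can be seen in a small sketch. The names `probe_sequence` and `insert_open`, the wrap-around with `% m`, and the limit of m probes are assumptions; with quadratic probing the sequence is not guaranteed to visit every slot.

```python
def probe_sequence(h, m, quadratic=False):
    """Yield the slots to try: h, h+1, h+2, ... or h, h+1, h+4, h+9, ... (mod m)."""
    for i in range(m):
        step = i * i if quadratic else i
        yield (h + step) % m


def insert_open(table, key, quadratic=False):
    """Insert key using open addressing; return the slot that was used."""
    m = len(table)
    for slot in probe_sequence(key % m, m, quadratic):
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found")


linear = [None] * 7
print([insert_open(linear, k) for k in (49, 14, 21)])                   # [0, 1, 2]
quad = [None] * 7
print([insert_open(quad, k, quadratic=True) for k in (49, 14, 21)])     # [0, 1, 4]
```

All three keys hash to address 0 with m = 7, so the example shows only the effect of the probe sequence.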
Variation of Open addressing

Double Hashing
A second hash function is used to resolve the collision. Suppose there is a primary hash function H(k) = k mod m. If a collision occurs, apply a second hash function, say H2(k) = k mod m1.
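The slide does not spell out how the second hash value is used. A common formulation, assumed in the sketch below, takes it as the step size between probes, with 1 added so the step is never zero. The function name `insert_double` and the choice m1 = 5 for a table of m = 7 slots are illustrative assumptions.

```python
def insert_double(table, key, m1):
    """Insert key with double hashing; return the slot that was used."""
    m = len(table)
    h1 = key % m                       # primary hash function
    h2 = 1 + (key % m1)                # second hash, used as step (assumed form)
    for i in range(m):
        slot = (h1 + i * h2) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found")


table = [None] * 7
print([insert_double(table, k, m1=5) for k in (49, 14, 21)])  # [0, 5, 2]
```

Keys 49, 14 and 21 all have primary address 0, but their different step sizes send them to different slots after the first collision.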
BUCKET Addressing
Store colliding elements in the same position in the table by introducing a bucket with each hash address. A bucket is a block of memory space which is large enough to store multiple items. If a bucket is full, the colliding item can be stored in a new bucket that is linked to the previous bucket.
Thank You