
Contents

1. Introduction to Merge Sort
   1.1 Algorithm
   1.2 Analysis
2. Introduction to Binary Search
   2.1 Analysis
   2.2 Big Oh
   2.3 Binary Search
   2.4 Binary Search Tree
   2.5 Conclusion


Introduction to Merge Sort


Merge sort is an O(n log n) comparison-based sorting algorithm. Most implementations produce a stable sort, meaning that the implementation preserves the input order of equal elements in the sorted output. It is a divide-and-conquer algorithm. Merge sort was invented by John von Neumann in 1945. A detailed description and analysis of bottom-up merge sort appeared in a report by Goldstine and von Neumann as early as 1948.

1.1 Algorithm
Conceptually, a merge sort works as follows:
i. If the list is of length 0 or 1, then it is already sorted. Otherwise:
ii. Divide the unsorted list into two sublists of about half the size.
iii. Sort each sublist recursively by re-applying the merge sort.
iv. Merge the two sublists back into one sorted list.

Merge sort incorporates two main ideas to improve its runtime:
i. A small list will take fewer steps to sort than a large list.
ii. Fewer steps are required to construct a sorted list from two sorted lists than from two unsorted lists. For example, you only have to traverse each list once if they're already sorted (see the merge function below for an example implementation).

Example: Use merge sort to sort a list of integers contained in an array. Suppose we have an array A with n indices ranging from 0 to n - 1. We apply merge sort to the two halves A[0..c-1] and A[c..n-1], where c is the integer part of n/2. When the two halves are returned they will have been sorted. They can now be merged together to form a sorted array.

In a simple pseudocode form, the algorithm could look something like this:

    function merge_sort(m)
        if length(m) <= 1
            return m
        var list left, right, result
        var integer middle = length(m) / 2
        for each x in m up to middle
            add x to left
        for each x in m after or equal middle
            add x to right
        left = merge_sort(left)
        right = merge_sort(right)
        result = merge(left, right)
        return result

The merge function then needs to merge both the left and right lists. It has several variations; one is shown:

    function merge(left, right)
        var list result
        while length(left) > 0 or length(right) > 0
            if length(left) > 0 and length(right) > 0
                if first(left) <= first(right)
                    append first(left) to result
                    left = rest(left)
                else
                    append first(right) to result
                    right = rest(right)
            else if length(left) > 0
                append first(left) to result
                left = rest(left)
            else if length(right) > 0
                append first(right) to result
                right = rest(right)
        end while
        return result


1.2 Analysis

In sorting n objects, merge sort has an average and worst-case performance of O(n log n). If the running time of merge sort for a list of length n is T(n), then the recurrence T(n) = 2T(n/2) + n follows from the definition of the algorithm (apply the algorithm to two lists of half the size of the original list, and add the n steps taken to merge the resulting two lists). The closed form T(n) = O(n log n) follows from the master theorem.

In the worst case, the number of comparisons merge sort makes is equal to or slightly smaller than (n ceil(lg n) - 2^ceil(lg n) + 1), which is between (n lg n - n + 1) and (n lg n + n + O(lg n)). For large n and a randomly ordered input list, merge sort's expected (average) number of comparisons approaches a*n fewer than the worst case, where a is a constant approximately equal to 0.2645.

In the worst case, merge sort does about 39% fewer comparisons than quick sort does in the average case; merge sort always makes fewer comparisons than quick sort, except in extremely rare cases, when they tie, where merge sort's worst case is found simultaneously with quick sort's best case. In terms of moves, merge sort's worst-case complexity is O(n log n), the same complexity as quick sort's best case, and merge sort's best case takes about half as many iterations as the worst case.

Recursive implementations of merge sort make 2n - 1 method calls in the worst case, compared to quick sort's n, so merge sort has roughly twice as much recursive overhead as quick sort. However, iterative, non-recursive implementations of merge sort, which avoid method-call overhead, are not difficult to code. Merge sort's most common implementation does not sort in place; therefore, memory the size of the input must be allocated for the sorted output to be stored in (see below for versions that need only n/2 extra spaces).

Merge sort as described here also has an often overlooked, but practically important, best-case property. If the input is already sorted, its complexity falls to O(n). Specifically, n - 1 comparisons and zero

moves are performed, which is the same as for simply running through the input, checking whether it is pre-sorted.

Sorting in place is possible (e.g., using lists rather than arrays) but is very complicated, and offers little performance gain in practice, even if the algorithm runs in O(n log n) time. In these cases, algorithms like heap sort usually offer comparable speed and are far less complex. Additionally, unlike the standard merge sort, in-place merge sort is not a stable sort. In the case of linked lists, the algorithm does not use more space than that already used by the list representation, apart from the O(log n) space used for the recursion stack.

Merge sort is more efficient than quick sort for some types of lists if the data to be sorted can only be accessed efficiently sequentially, and is thus popular in languages such as Lisp, where sequentially accessed data structures are very common. Unlike some (efficient) implementations of quick sort, merge sort is a stable sort as long as the merge operation is implemented properly.

Merge sort also has some demerits. One is its use of 2n locations; the additional n locations are needed because one cannot reasonably merge two sorted sets in place. Despite the use of this space, the algorithm still does a lot of work: the contents of m are first copied into left and right and later into the list result on each invocation of merge_sort (variable names according to the pseudocode above). An alternative to this copying is to associate a new field of information with each key (the elements in m are called keys). This field will be used to link the keys and any associated information together in a sorted list (a key and its related information is called a record). Then the merging of the sorted lists proceeds by changing the link values; no records need to be moved at all. A field which contains only a link will generally be smaller than an entire record, so less space will also be used.
Another alternative for reducing the space overhead to n/2 is to maintain left and right as a combined structure, copy only the left part of m into temporary space, and direct the merge routine to place the merged output into m. With this version it is better to allocate the temporary space outside the merge routine, so that only one allocation is needed. The excessive copying mentioned in the previous paragraph is also mitigated, since the last pair of lines before the return result statement (function merge in the pseudocode above) become superfluous.

Merge sort can also be done with merging more than two sublists at a time, using the n-way merge algorithm. However, the number of operations is approximately the same. Consider merging k sublists at a time, where for simplicity k is a power of 2. The recurrence relation becomes T(n) = kT(n/k) + O(n log k). (The last part comes from the merge algorithm, which, when implemented optimally using a heap or self-balancing binary search tree, takes O(log k) time per element.) If you take the recurrence relation for regular merge sort (T(n) = 2T(n/2) + O(n)) and expand it out log2 k times, you get the same recurrence relation. This is true even if k is not a constant.


Introduction to Binary Search


The idea behind binary search is this: if we place our items in an array and sort them in either ascending or descending order on the key first, then we can obtain much better performance than with a linear search. In binary search, we first compare the key with the item in the middle position of the array. If there's a match, we can return immediately. If the key is less than the middle key, then the item sought must lie in the lower half of the array; if it's greater, then the item sought must lie in the upper half of the array. So we repeat the procedure on the lower (or upper) half of the array. Our FindInCollection function can now be implemented:

    static void *bin_search(collection c, int low, int high, void *key)
    {
        int mid, cmp;

        /* Termination check: an empty partition means the key is absent */
        if (low > high)
            return NULL;

        mid = (high + low) / 2;

        /* memcmp may return any negative or positive value, not just -1
           or 1, so test the sign rather than switching on exact values */
        cmp = memcmp(ItemKey(c->items[mid]), key, c->size);
        if (cmp == 0)
            /* Match, return item found */
            return c->items[mid];
        else if (cmp > 0)
            /* key is less than mid, search lower half */
            return bin_search(c, low, mid - 1, key);
        else
            /* key is greater than mid, search upper half */
            return bin_search(c, mid + 1, high, key);
    }

    /* Find an item in a collection
       Pre-condition:  c is a collection created by ConsCollection,
                       c is sorted in ascending order of the key,
                       key != NULL
       Post-condition: returns an item identified by key if one exists,
                       otherwise returns NULL */
    void *FindInCollection(collection c, void *key)
    {
        int low, high;

        low = 0;
        high = c->item_cnt - 1;
        return bin_search(c, low, high, key);
    }

Points to note:
i. bin_search is recursive: it determines whether the search key lies in the lower or upper half of the array, then calls itself on the appropriate half.
ii. There is a termination condition (two of them, in fact!): if low > high, then the partition to be searched has no elements in it; and if there is a match with the element in the middle of the current partition, then we can return immediately.
iii. AddToCollection will need to be modified to ensure that each item added is placed in its correct place in the array. The procedure is simple: search the array until the correct spot to insert the new item is found, move all the following items up one position, and insert the new item into the empty position thus created.
iv. bin_search is declared static. It is a local function and is not used outside this class: if it were not declared static, it would be exported and be available to all parts of the program. The static declaration also allows other classes to use the same name internally.


static reduces the visibility of a function and should be used wherever possible to control access to functions!

2.1 Analysis

Each step of the algorithm divides the block of items being searched in half. We can divide a set of n items in half at most log2 n times. Thus the running time of a binary search is proportional to log n, and we say this is an O(log n) algorithm. For example, a million items can be searched with at most about 20 comparisons, since 2^20 > 10^6.

9|Page

Binary search requires a more complex program than our original search and thus for small n it may run slower than the simple linear search. However, for large n, log n is much smaller than n; consequently, an O(log n) algorithm is much faster than an O(n) one.

[Figure: plot of n and log n vs n]

We will examine this behaviour more formally in a later section. First, let's see what we can do about the insertion (AddToCollection) operation. In the worst case, insertion may require n operations to insert into a sorted list. We can find the place in the list where the new item belongs using binary search in O(log n) operations. However, we have to shuffle all the following items up one place to make way for the new one. In the worst case, the new item is the first in the list, requiring n move operations for the shuffle! A similar analysis will show that deletion is also an O(n) operation.

If our collection is static, i.e. it doesn't change very often - if at all - then we may not be concerned with the time required to change its contents: we may be prepared for the initial build of the collection and the occasional insertion and deletion to take some time. In return, we will be able to use a simple data structure (an array) which has little memory overhead. However, if our collection is large and dynamic, i.e. items are being added and deleted continually, then we can obtain considerably better performance using a data structure called a tree.


2.2 Big Oh
A notation formally describing the set of all functions which are bounded above (to within a constant factor) by a nominated function.

2.3 Binary Search


A technique for searching an ordered list in which we first check the middle item and - based on that comparison - "discard" half the data. The same procedure is then applied to the remaining half until a match is found or there are no more items left.

2.4 Binary Search Tree


A binary search tree (BST) is:
i. A collection of nodes.
ii. Either empty or non-empty.
iii. Structured so that the top node of the tree is called the parent, while its subtrees are called children.

Non-empty binary search tree properties:
i. Every node has a value and no two nodes have the same value (all values are distinct; no duplicate nodes).
ii. Values in the left subtree of the root are smaller than the value of the root.
iii. Values in the right subtree of the root are larger than the value of the root.

Binary search tree operations:
i. Create - create an empty binary search tree.
ii. Search - search for and return the value of an element.
iii. Insert - insert an element into the search tree.
iv. Delete - delete an element from the search tree.
v. Traverse - retrieve the elements either in-order (LVR), post-order (LRV) or pre-order (VLR).


Binary search tree (search algorithm):

    if (the tree is empty) {
        return NULL;
    } else if (the item in the node equals the target) {
        return the node value;
    } else if (the item in the node is greater than the target) {
        return the result of searching the left subtree;
    } else if (the item in the node is smaller than the target) {
        return the result of searching the right subtree;
    }

2.5 Conclusion
i. The smallest element in a binary search tree is the left-most node.
ii. The largest element is the right-most node.
iii. An in-order traversal of a binary search tree gives the elements in increasing order.

