CS2201 DATA STRUCTURES

UNIT III

UNIT III BALANCED TREES

SYLLABUS: AVL Trees - Splay Trees - B-Trees - Heaps - Binary Heaps - Applications of Binary Heaps

CHAPTER 1 AVL TREES

AVL Tree (Adelson-Velskii and Landis): An AVL tree is a binary search tree with the additional property that, for every node in the tree, the heights of the left and right subtrees differ by at most 1. The height of the empty tree is defined to be -1. The balance factor of a node is the height of its left subtree minus the height of its right subtree; in an AVL tree, every balance factor must be +1, 0, or -1. If the balance factor of any node becomes less than -1 or greater than +1, the tree has to be rebalanced by making either a single or a double rotation.
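As an illustration, the height and balance-factor definitions above translate directly into C. The node type here is a hypothetical minimal one for this sketch, not a declaration from these notes:

```c
#include <stddef.h>

/* Hypothetical minimal node type, for illustration only. */
typedef struct AvlNode {
    int element;
    struct AvlNode *left, *right;
} AvlNode;

/* Height of the empty tree is defined to be -1, as in the text. */
int height(const AvlNode *t)
{
    if (t == NULL)
        return -1;
    int hl = height(t->left);
    int hr = height(t->right);
    return (hl > hr ? hl : hr) + 1;
}

/* Balance factor = height(left subtree) - height(right subtree). */
int balance_factor(const AvlNode *t)
{
    return t == NULL ? 0 : height(t->left) - height(t->right);
}
```

A node whose balance factor falls outside {-1, 0, +1} after an insertion is exactly the node at which a single or double rotation is performed.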

An AVL tree becomes imbalanced when any one of the following conditions occurs.


Case 1: An insertion into the left subtree of the left child of node a.
Case 2: An insertion into the right subtree of the left child of node a.
Case 3: An insertion into the left subtree of the right child of node a.
Case 4: An insertion into the right subtree of the right child of node a.

These imbalances can be overcome by
1. Single Rotation
2. Double Rotation

1.1 Single Rotation
Single rotation is performed to fix case 1 and case 4.
Case 1: An insertion into the left subtree of the left child of k2. Single rotation to fix case 1:

ROUTINE TO PERFORM SINGLE ROTATION WITH LEFT
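The routine itself does not survive in this copy. The following is a sketch of SingleRotateWithLeft, written to mirror the SingleRotateWithRight routine given for case 4 below; the node layout and the Height/Max helpers are assumptions:

```c
#include <stddef.h>

/* Assumed node layout, mirroring the case-4 routine in these notes. */
typedef struct AvlNode {
    int Element;
    struct AvlNode *Left, *Right;
    int Height;
} *Position;

static int Height(Position P) { return P == NULL ? -1 : P->Height; }
static int Max(int a, int b)  { return a > b ? a : b; }

/* Case 1: rotate K2 with its left child K1; returns the new subtree root. */
Position SingleRotateWithLeft(Position K2)
{
    Position K1 = K2->Left;
    K2->Left  = K1->Right;  /* K1's right subtree moves under K2 */
    K1->Right = K2;         /* K2 becomes K1's right child */
    /* K2 is now a child of K1, so update K2's height first. */
    K2->Height = Max(Height(K2->Left), Height(K2->Right)) + 1;
    K1->Height = Max(Height(K1->Left), Height(K1->Right)) + 1;
    return K1;
}
```

The caller must store the returned pointer back into the parent's child link, since the subtree root changes.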


Case 4: An insertion into the right subtree of the right child of node k1:

Position SingleRotateWithRight( Position K1 )
{
    Position K2;

    K2 = K1->Right;
    K1->Right = K2->Left;
    K2->Left = K1;

    /* K1 is now the child, so its height must be updated first */
    K1->Height = Max( Height(K1->Left), Height(K1->Right) ) + 1;
    K2->Height = Max( Height(K2->Left), Height(K2->Right) ) + 1;
    return K2;
}

1.2 DOUBLE ROTATION
Case 2: An insertion into the right subtree of the left child of node a.


ROUTINE TO PERFORM DOUBLE ROTATION WITH LEFT
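The body of this routine is also missing from this copy. A sketch follows, built from the two single rotations, with the same assumed node layout as in the single-rotation sketch:

```c
#include <stddef.h>

/* Assumed node layout and helpers, as in the single-rotation sketches. */
typedef struct AvlNode {
    int Element;
    struct AvlNode *Left, *Right;
    int Height;
} *Position;

static int Height(Position P) { return P == NULL ? -1 : P->Height; }
static int Max(int a, int b)  { return a > b ? a : b; }

static Position SingleRotateWithLeft(Position K2)
{
    Position K1 = K2->Left;
    K2->Left  = K1->Right;
    K1->Right = K2;
    K2->Height = Max(Height(K2->Left), Height(K2->Right)) + 1;
    K1->Height = Max(Height(K1->Left), Height(K1->Right)) + 1;
    return K1;
}

static Position SingleRotateWithRight(Position K1)
{
    Position K2 = K1->Right;
    K1->Right = K2->Left;
    K2->Left  = K1;
    K1->Height = Max(Height(K1->Left), Height(K1->Right)) + 1;
    K2->Height = Max(Height(K2->Left), Height(K2->Right)) + 1;
    return K2;
}

/* Case 2: rotate K3's left child with its right child, then K3 itself. */
Position DoubleRotateWithLeft(Position K3)
{
    /* Rotation between K1 and K2 */
    K3->Left = SingleRotateWithRight(K3->Left);
    /* Rotation between K3 and K2 */
    return SingleRotateWithLeft(K3);
}
```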

Case 3: An insertion into the left subtree of the right child of node a.

ROUTINE TO PERFORM DOUBLE ROTATION WITH RIGHT

Position DoubleRotateWithRight( Position K3 )
{
    /* Rotation between K1 and K2 */
    K3->Right = SingleRotateWithLeft( K3->Right );

    /* Rotation between K3 and K2 */
    return SingleRotateWithRight( K3 );
}


ROUTINE TO INSERT IN AN AVL TREE
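The insertion routine is missing from this copy as well. The following sketch combines the four rotations in the way the text describes: insert recursively, then rebalance on the way back up. The node layout is again an assumption:

```c
#include <stdlib.h>

typedef struct AvlNode {
    int Element;
    struct AvlNode *Left, *Right;
    int Height;
} *Position, *AvlTree;

static int Height(AvlTree P) { return P == NULL ? -1 : P->Height; }
static int Max(int a, int b) { return a > b ? a : b; }

static Position SingleRotateWithLeft(Position K2)
{
    Position K1 = K2->Left;
    K2->Left  = K1->Right;
    K1->Right = K2;
    K2->Height = Max(Height(K2->Left), Height(K2->Right)) + 1;
    K1->Height = Max(Height(K1->Left), Height(K1->Right)) + 1;
    return K1;
}

static Position SingleRotateWithRight(Position K1)
{
    Position K2 = K1->Right;
    K1->Right = K2->Left;
    K2->Left  = K1;
    K1->Height = Max(Height(K1->Left), Height(K1->Right)) + 1;
    K2->Height = Max(Height(K2->Left), Height(K2->Right)) + 1;
    return K2;
}

static Position DoubleRotateWithLeft(Position K3)
{
    K3->Left = SingleRotateWithRight(K3->Left);
    return SingleRotateWithLeft(K3);
}

static Position DoubleRotateWithRight(Position K3)
{
    K3->Right = SingleRotateWithLeft(K3->Right);
    return SingleRotateWithRight(K3);
}

AvlTree Insert(int X, AvlTree T)
{
    if (T == NULL) {                           /* create a one-node tree */
        T = malloc(sizeof(struct AvlNode));
        T->Element = X;
        T->Height = 0;
        T->Left = T->Right = NULL;
    } else if (X < T->Element) {
        T->Left = Insert(X, T->Left);
        if (Height(T->Left) - Height(T->Right) == 2) {
            if (X < T->Left->Element)
                T = SingleRotateWithLeft(T);   /* case 1 */
            else
                T = DoubleRotateWithLeft(T);   /* case 2 */
        }
    } else if (X > T->Element) {
        T->Right = Insert(X, T->Right);
        if (Height(T->Right) - Height(T->Left) == 2) {
            if (X > T->Right->Element)
                T = SingleRotateWithRight(T);  /* case 4 */
            else
                T = DoubleRotateWithRight(T);  /* case 3 */
        }
    }                                          /* X already present: do nothing */
    T->Height = Max(Height(T->Left), Height(T->Right)) + 1;
    return T;
}
```

Inserting the keys 1 through 7 in order with this routine yields the perfectly balanced tree rooted at 4 that the example below constructs step by step.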


EXAMPLE:


Consider the example of constructing an AVL tree by inserting the elements 1 to 7 and then inserting the elements from 16 to 8. The first problem occurs when it is time to insert key 3, because the AVL property is violated at the root. We perform a single rotation between the root and its right child to fix the problem. The tree is shown in the following figure, before and after the rotation:

A dashed line indicates the two nodes that are the subject of the rotation. Next, we insert the key 4, which causes no problems, but the insertion of 5 creates a violation at node 3, which is fixed by a single rotation. Here, this means that 2's right child must be reset to point to 4 instead of 3.


Next, we insert 6. This causes a balance problem for the root, since its left subtree is of height 0, and its right subtree would be height 2. Therefore, we perform a single rotation at the root between 2 and 4.

The rotation is performed by making 2 a child of 4 and making 4's original left subtree the new right subtree of 2. Every key in this subtree must lie between 2 and 4, so this transformation makes sense. The next key we insert is 7, which causes another rotation.

Inserting 15 is easy, since it does not destroy the balance property, but inserting 14 causes a height imbalance at node 7.


As the diagram shows, a single rotation does not fix this height imbalance. The problem is that the imbalance was caused by an insertion into the subtree containing the middle elements, at a time when the other two subtrees had identical heights. This case is easy to check for, and the solution is called a double rotation, which is similar to a single rotation but involves four subtrees instead of three. In the diagram, the tree on the left is converted to the tree on the right; the effect is the same as rotating between k1 and k2 and then between k2 and k3.

The double rotation here is a right-left double rotation involving 7, 15, and 14: k3 is the node with key 7, k1 is the node with key 15, and k2 is the node with key 14. Subtrees A, B, C, and D are all empty.


Next we insert 13, which requires a double rotation. Here the double rotation is again a right-left double rotation that will involve 6, 14, and 7 and will restore the tree. In this case, k3 is the node with key 6, k1 is the node with key 14, and k2 is the node with key 7. Subtree A is the tree rooted at the node with key 5, subtree B is the empty subtree that was originally the left child of the node with key 7, subtree C is the tree rooted at the node with key 13, and finally, subtree D is the tree rooted at the node with key 15.

If 12 is now inserted, there is an imbalance at the root. Since 12 is not between 4 and 7, we know that the single rotation will work.

Insertion of 11 will require a single rotation: To insert 10, a single rotation needs to be performed, and the same is true for the subsequent insertion of 9. We insert 8 without a rotation, creating the almost perfectly balanced tree that follows.


CHAPTER 2 SPLAY TREES

SPLAY TREES: The basic idea of the splay tree is that after a node is accessed, it is pushed to the root by a series of AVL-tree rotations. Notice that if a node is deep, there are many nodes on the path that are also relatively deep, and by restructuring we can make future accesses cheaper on all these nodes. Thus, if the node is unduly deep, we want this restructuring to have the side effect of balancing the tree. Splay trees also do not require the maintenance of height or balance information, thus saving space and simplifying the code to some extent.

2.1 Simple idea of splay tree: One way of performing the restructuring described above is to perform single rotations, bottom up. This means that we rotate every node on the access path with its parent. As an example, consider what happens after an access (a find) on k1 in the following tree.

The access path is dashed. First, we would perform a single rotation between k1 and its parent, obtaining the following tree.


Then, we rotate between k1 and k3, obtaining the next tree.

Then two more rotations are performed until we reach the root.

These rotations have the effect of pushing k1 all the way to the root, so that future accesses on k1 are easy (for a while). Unfortunately, it has pushed another node (k3) almost as deep as k1 used to be. An access on that node will then push another node deep, and so on. Although this strategy makes future accesses of k1 cheaper, it has not significantly improved the situation for the other nodes on the (original) access path. It turns out that it is possible to prove that using this strategy, there is a sequence of m operations requiring a total of Ω(m n) time, so this idea is not quite good enough. The bad case is that accessing the node with key 1 takes n - 1 units of time, and after the rotations are complete, an access of the node with key 2 takes n - 2 units of time, and so on.


2.2 SPLAYING: Let x be a (non-root) node on the access path at which we are rotating. If the parent of x is the root of the tree, we merely rotate x and the root; this is the last rotation along the access path. Otherwise, x has both a parent (p) and a grandparent (g), and there are two cases, plus symmetries, to consider.

ZIG-ZAG ROTATION: Here x is a right child and p is a left child (or vice versa). In this case, we perform a double rotation, exactly like an AVL double rotation.

ZIG-ZIG ROTATION: Here x and p are either both left children or both right children. In that case, we transform the tree on the left to the tree on the right.

As an example, consider the tree from the last example, with a find on k1:

The first splay step is at k1, and is clearly a zig-zag, so we perform a standard AVL double rotation using k1, k2, and k3. The resulting tree follows.


The next splay step at k1 is a zig-zig, so we do the zig-zig rotation with k1, k4, and k5, obtaining the final tree.

Consider again the effect of inserting keys 1, 2, 3, . . . , n into an initially empty tree. This takes a total of O(n), as before, and yields the same tree as simple rotations. The diagram below shows the result of splaying at the node with key 1. The difference is that after an access of the node with key 1, which takes n -1 units, the access on the node with key 2 will only take about n/2 units instead of n - 2 units; there are no nodes quite as deep as before.


Result of splaying a tree of all left children at node 1.


Result of splaying at node 2 of a tree.

When access paths are long, thus leading to a longer-than-normal search time, the rotations tend to be good for future operations. When accesses are cheap, the rotations are not as good and can be bad. The extreme case is the initial tree formed by the insertions. All the insertions were constant-time operations leading to a bad initial tree. At that point in time, we had a very bad tree, but we were running ahead of schedule and had the compensation of less total running time. Then a couple of really horrible accesses left a nearly balanced tree, but the cost was that we had to give back some of the time that had been saved.

Result of splaying previous tree at node 3.

Result of splaying previous tree at node 4.

Result of splaying previous tree at node 5.

Result of splaying previous tree at node 6

Result of splaying previous tree at node 7

Result of splaying previous tree at node 8



Type declarations for splay trees

typedef struct splay_node *splay_ptr;

struct splay_node
{
    element_type element;
    splay_ptr left;
    splay_ptr right;
    splay_ptr parent;
};

typedef splay_ptr SEARCH_TREE;


Basic splay routine


void splay( splay_ptr current )
{
    splay_ptr father;

    father = current->parent;
    while( father != NULL )
    {
        if( father->parent == NULL )
            single_rotate( current );
        else
            double_rotate( current );
        father = current->parent;
    }
}

Single rotation
void single_rotate( splay_ptr x )
{
    if( x->parent->left == x )
        zig_left( x );
    else
        zig_right( x );
}

Single rotation between root and its left child


void zig_left( splay_ptr x )
{
    splay_ptr p, B;

    p = x->parent;
    B = x->right;
    x->right = p;       /* x's new right child is p */
    x->parent = NULL;   /* x will now be a root */
    if( B != NULL )
        B->parent = p;
    p->left = B;
    p->parent = x;
}



void zig_zig_left( splay_ptr x )
{
    splay_ptr p, g, B, C, ggp;

    p = x->parent;
    g = p->parent;
    B = x->right;
    C = p->right;
    ggp = g->parent;
    x->right = p;          /* x's new right child is p */
    p->parent = x;
    p->right = g;          /* p's new right child is g */
    g->parent = p;
    if( B != NULL )        /* p's new left child is subtree B */
        B->parent = p;
    p->left = B;
    if( C != NULL )        /* g's new left child is subtree C */
        C->parent = g;
    g->left = C;
    x->parent = ggp;       /* connect to rest of the tree */
    if( ggp != NULL )
    {
        if( ggp->left == g )
            ggp->left = x;
        else
            ggp->right = x;
    }
}


Routine to perform a zig-zig when both children are initially left children

Deletion can be performed by accessing the node to be deleted; this puts the node at the root. If it is deleted, we get two subtrees TL and TR (left and right). If we find the largest element in TL (which is easy), then this element is rotated to the root of TL, and TL will now have a root with no right child. We can finish the deletion by making TR the right child. The analysis of splay trees is difficult, because it must take into account the ever-changing structure of the tree. On the other hand, splay trees are much simpler to program than AVL trees, since there are fewer cases to consider and no balance information to maintain.

CHAPTER 3 B TREES

B TREES: Since the processor is much faster than the disk, it is worthwhile to execute more CPU instructions if doing so minimizes the number of disk accesses. Idea: allow a node in the tree to have many children.


If each internal node in the tree has M children, the height of the tree would be logM n instead of log2 n. For example, if M = 20, then log20 2^20 < 5. Thus, we can speed up the search significantly. In practice, it is impossible to keep exactly the same number of children per internal node.

A B+-tree of order M >= 3 is an M-ary tree with the following properties:
- Each internal node has at most M children.
- Each internal node, except the root, has between ceil(M/2)-1 and M-1 keys. This guarantees that the tree does not degenerate into a binary tree.
- The keys at each node are ordered.
- The root is either a leaf or has between 1 and M-1 keys.
- The data items are stored at the leaves.
- All leaves are at the same depth.
- Each leaf has between ceil(L/2)-1 and L-1 data items, for some L (usually L << M, but we will assume M = L in most examples).

Example

Here, M = L = 5. Records are stored at the leaves, but we only show the keys here. At the internal nodes, only keys (and pointers to children) are stored; these are also called separating keys.

B+ TREES: A B+-tree with M = L = 4:


E.g., the left child pointer of N is the same as the right child pointer of J. We can also talk about the left subtree and right subtree of a key in internal nodes.

Which keys are stored at the internal nodes? There are several ways to do it, and different books adopt different conventions. We will adopt the following convention: key i in an internal node is the smallest key in its (i+1)st subtree (i.e., the right subtree of key i). Even following this convention, there is no unique B+-tree for the same set of records.

Each internal node/leaf is designed to fit into one I/O block of data. An I/O block can usually hold quite a lot of data, so an internal node can keep a lot of keys, i.e., M can be large. This implies that the tree has only a few levels, and a search, insertion, or deletion can be accomplished with only a few disk accesses. The B+-tree is a popular structure used in commercial databases. To further speed up the search, the first one or two levels of the B+-tree are usually kept in main memory.

The disadvantage of the B+-tree is that most nodes will have fewer than M-1 keys most of the time. This can lead to severe space wastage; thus, it is not a good dictionary structure for data in main memory. The textbook calls this tree a B-tree instead of a B+-tree. In some other textbooks, B-tree refers to the variant where the actual records are kept at internal nodes as well as at the leaves. Such a scheme is not practical: keeping actual records at the internal nodes limits the number of keys stored there, and thus increases the number of tree levels.

Searching: Suppose that we want to search for a key; in the diagrams, the path traversed is shown in bold. Let x be the input search key. Start the search at the root.


If we encounter an internal node v, search (linear search or binary search) for x among the keys stored at v:
- If x < Kmin at v, follow the left child pointer of Kmin.
- If Ki <= x < Ki+1 for two consecutive keys Ki and Ki+1 at v, follow the left child pointer of Ki+1.
- If x >= Kmax at v, follow the right child pointer of Kmax.
If we encounter a leaf v, we search (linear search or binary search) for x among the keys stored at v. If found, we return the entire record; otherwise, we report not found.

3.1 Insertion
Suppose that we want to insert a key K and its associated record.
- Search for the key K using the search procedure; this will bring us to a leaf x.
- Insert K into x.
Splitting of nodes (instead of the rotations used in AVL trees) maintains the properties of B+-trees.

Insertion into a leaf:
- If leaf x contains fewer than M-1 keys, insert K into x (at the correct position in node x).
- If x is already full (i.e., contains M-1 keys), split x:
  - Cut x off from its parent.
  - Insert K into x, pretending x has space for K. Now x has M keys.
  - Split x into two new leaves xL and xR, with xL containing the ceil(M/2) smallest keys and xR containing the remaining floor(M/2) keys.
  - Let J be the minimum key in xR.
  - Make a copy of J to be the parent of xL and xR, and insert the copy, together with its child pointers, into the old parent of x.

Inserting into a non-full leaf
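The descend-and-search rules above can be sketched for a small in-memory node. The node layout, MAXKEYS, and field names here are our own illustration; a real B+-tree packs each node into one disk block:

```c
#include <stddef.h>

/* Hypothetical in-memory node layout, for illustration only. */
#define MAXKEYS 4                      /* M - 1 keys for M = 5 */

typedef struct BNode {
    int    is_leaf;
    int    nkeys;
    int    keys[MAXKEYS];
    struct BNode *child[MAXKEYS + 1];  /* used by internal nodes only */
} BNode;

/* Return 1 if x is stored in some leaf below t, else 0. */
int btree_find(const BNode *t, int x)
{
    while (!t->is_leaf) {
        int i = 0;
        /* Per the convention above: if x >= key i, x lies in the
           right subtree of key i, so descend past it. */
        while (i < t->nkeys && x >= t->keys[i])
            i++;
        t = t->child[i];
    }
    for (int i = 0; i < t->nkeys; i++)
        if (t->keys[i] == x)
            return 1;
    return 0;
}
```

Each iteration of the outer loop corresponds to one disk read in an on-disk implementation, which is why keeping the tree shallow matters.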


Splitting a leaf: inserting T

Two disk accesses are needed to write the two leaves, plus one disk access to update the parent. For L = 32, two leaves with 16 and 17 items are created, and we can perform 15 more insertions without another split. Another example:


Splitting an internal node
To insert a key K into a full internal node x:
- Cut x off from its parent.
- Insert K and its left and right child pointers into x, pretending there is space. Now x has M keys.
- Split x into two new internal nodes xL and xR, with xL containing the (ceil(M/2) - 1) smallest keys and xR containing the floor(M/2) largest keys. Note that the ceil(M/2)-th key J is placed in neither xL nor xR.
- Make J the parent of xL and xR, and insert J, together with its child pointers, into the old parent of x.

Example: splitting an internal node:


Termination: Splitting continues as long as we encounter full internal nodes. If the split internal node x does not have a parent (i.e., x is the root), then create a new root containing the key J and its two children.

3.2 Deletion
To delete a key target, we find it at a leaf x and remove it. There are two situations to worry about:
(1) target is a key in some internal node (it needs to be replaced, according to our convention);
(2) after deleting target from leaf x, x contains fewer than ceil(M/2) - 1 keys (nodes need to be merged).

Situation (1): By our convention, target can appear in at most one ancestor y of x as a key. Moreover, we must have visited node y and seen target in it when we searched


down the tree. So after deleting target from node x, we can access y directly and replace target by the new smallest key in x.

Situation (2): handling leaves with too few keys. Suppose we delete the record with key target from a leaf. Let u be the leaf that now has ceil(M/2) - 2 keys, let v be a sibling of u, and let k be the key in the parent of u and v that separates the pointers to u and v. There are two cases.

Case 1: v contains ceil(M/2) keys or more and v is the right sibling of u.
- Move the leftmost record from v to u.
- Set the key in the parent of u that separates u and v to be the new smallest key in v.

Case 2: v contains ceil(M/2) keys or more and v is the left sibling of u.
- Move the rightmost record from v to u.
- Set the key in the parent of u that separates u and v to be the new smallest key in u.

Deletion example



Merging two leaves
If no sibling leaf with at least ceil(M/2) keys exists, then merge two leaves.

Case (1): Suppose that the right sibling v of u contains exactly ceil(M/2) - 1 keys. Merge u and v:
- Move the keys in u to v.
- Remove the pointer to u at the parent.
- Delete the separating key between u and v from the parent of u.

Case (2): Suppose that the left sibling v of u contains exactly ceil(M/2) - 1 keys. Merge u and v in the same way:
- Move the keys in u to v.
- Remove the pointer to u at the parent.
- Delete the separating key between u and v from the parent of u.



Deleting a key in an internal node
Suppose we remove a key from an internal node u, and u has fewer than ceil(M/2) - 1 keys afterwards.

Case (1): u is the root. If u is empty, then remove u and make its child the new root.

Case (2): the right sibling v of u has ceil(M/2) keys or more.
- Move the separating key between u and v in the parent of u and v down to u.
- Make the leftmost child of v the rightmost child of u.
- Move the leftmost key in v up to become the separating key between u and v in the parent of u and v.

Case (2'): the left sibling v of u has ceil(M/2) keys or more.
- Move the separating key between u and v in the parent of u and v down to u.
- Make the rightmost child of v the leftmost child of u.
- Move the rightmost key in v up to become the separating key between u and v in the parent of u and v.


Case (3): every sibling v of u contains exactly ceil(M/2) - 1 keys.
- Move the separating key between u and v in the parent of u and v down to u.
- Move the keys and child pointers in u to v.
- Remove the pointer to u at the parent.



CHAPTER 4 BINARY HEAP

4.1 Model
A priority queue is a data structure that allows at least the following two operations: insert, which does the obvious thing, and delete_min, which finds, returns, and removes the minimum element in the heap. The insert operation is the equivalent of enqueue, and delete_min is the priority queue equivalent of the queue's dequeue operation.


The delete_min function also alters its input.

Basic model of a priority queue

Priority queues have many applications besides operating systems. Priority queues are also important in the implementation of greedy algorithms, which operate by repeatedly finding a minimum; and external sorting.

4.2 Simple Implementations


There are several obvious ways to implement a priority queue.

Linked list: Using a linked list, we can perform insertions at the front in O(1) and traverse the list, which requires O(n) time, to delete the minimum. Alternatively, we could insist that the list always be kept sorted; this makes insertions expensive (O(n)) and delete_mins cheap (O(1)). The former is probably the better idea of the two, based on the fact that there are never more delete_mins than insertions.

Binary search tree: If the priority queue is implemented using a binary search tree, repeated delete_min operations make the right side of the tree too heavy, because the minimum element is always removed from the left side of the tree.

4.3. Binary Heap


The implementation we will use is known as a binary heap. Its use is so common for priority queue implementations that, when the word heap is used without a qualifier, it is generally assumed to be referring to this implementation of the data structure.

Binary heap properties:
1. Structure Property
2. Heap Order Property


As with AVL trees, an operation on a heap can destroy one of the properties, so a heap operation must not terminate until all heap properties are in order. This turns out to be simple to do.

4.3.1. Structure Property


A perfect binary tree is a binary tree with all leaf nodes at the same depth, in which every internal node has 2 children.

A heap is a binary tree that is completely filled, with the possible exception of the bottom level, which is filled from left to right. Such a tree is known as a complete binary tree; a binary heap is a complete binary tree. Examples:

It is easy to show that a complete binary tree of height h has between 2^h and 2^(h+1) - 1 nodes.

Representing complete binary trees in an array: for any element in array position i, the left child is in position 2i, the right child is in the cell after the left child (2i + 1), and the parent is in position floor(i/2).
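These index formulas can be captured in a few helper functions (the names are ours; position 0 of the array is left unused, or holds a sentinel):

```c
/* Index arithmetic for a complete binary tree stored in an array,
   with the root at position 1 and position 0 unused. */
int parent(int i)      { return i / 2; }    /* integer division floors */
int left_child(int i)  { return 2 * i; }
int right_child(int i) { return 2 * i + 1; }
```

Because navigation is pure arithmetic, no child or parent pointers need to be stored, which is why heaps are almost always kept in arrays.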


4.3.2. Heap Order Property


The property that allows operations to be performed quickly is the heap order property. Since we want to be able to find the minimum quickly, it makes sense that the smallest element should be at the root. If we consider that any subtree should also be a heap, then any node should be smaller than all of its descendants. For every non-root node X, the value in the parent of X is less than (or equal to) the value in X. By the heap order property, the minimum element can always be found at the root. Thus, we get the extra operation, find_min, in constant time.

Declaration for priority queue

typedef struct heap_struct *PRIORITY_QUEUE;

struct heap_struct
{
    unsigned int max_heap_size;
    unsigned int size;
    element_type *elements;
};

PRIORITY_QUEUE create_pq( unsigned int max_elements )


{
    PRIORITY_QUEUE H;

    if( max_elements < MIN_PQ_SIZE )
        error("Priority queue size is too small");

    H = (PRIORITY_QUEUE) malloc( sizeof (struct heap_struct) );
    if( H == NULL )
        fatal_error("Out of space!!!");

    /* Allocate the array plus one extra slot for the sentinel */
    H->elements = (element_type *) malloc( (max_elements+1) * sizeof (element_type) );
    if( H->elements == NULL )
        fatal_error("Out of space!!!");

    H->max_heap_size = max_elements;
    H->size = 0;
    H->elements[0] = MIN_DATA;   /* sentinel */
    return H;
}

4.3.3. Basic Heap Operations


It is easy (both conceptually and practically) to perform the two required operations. All the work involves ensuring that the heap order property is maintained. Insert Delete_min

Insert
To insert an element x into the heap, we create a hole in the next available location, since otherwise the tree will not be complete. If x can be placed in the hole without violating heap order, then we do so and are done. Otherwise we slide the element that is in the hole's parent node into the hole, thus bubbling the hole up toward the root. We continue this process until x can be placed in the hole. The Example shows to insert 14, we create a hole in the next available heap location. Inserting 14 in the hole would violate the heap order property, so 31 is slid down into the hole. This strategy is continued until the correct location for 14 is found.


This general strategy is known as a percolate up; the new element is percolated up the heap until the correct location is found. If the element to be inserted is the new minimum, it will be pushed all the way to the top. At some point, i will be 1 and we will want to break out of the while loop. This value must be guaranteed to be smaller than (or equal to) any element in the heap; it is known as a sentinel. This idea is similar to the use of header nodes in linked lists. By adding a dummy piece of information, we avoid a test that is executed once per loop iteration, thus saving some time.

Procedure to insert into a binary heap


/* H->elements[0] is a sentinel */
void insert( element_type x, PRIORITY_QUEUE H )
{
    unsigned int i;

    if( is_full( H ) )
        error("Priority queue is full");
    else
    {
        i = ++H->size;
        while( H->elements[i/2] > x )
        {
            H->elements[i] = H->elements[i/2];
            i /= 2;
        }
        H->elements[i] = x;
    }
}


Delete_min
In delete_min, finding the minimum is easy; the hard part is removing it. When the minimum is removed, a hole is created at the root. Since the heap now becomes one smaller, the last element x in the heap must move somewhere in the heap. If x can be placed in the hole, we are done. This is unlikely, so we slide the smaller of the hole's children into the hole, thus pushing the hole down one level. We repeat this step until x can be placed in the hole. Thus, our action is to place x in its correct spot along a path from the root containing minimum children.

In the example, after 13 is removed, we must try to place 31 in the heap. 31 cannot be placed in the hole, because this would violate heap order. Thus, we place the smaller child (14) in the hole, sliding the hole down one level. We repeat this again, placing 19 into the hole and creating a new hole one level deeper. We then place 26 in the hole and create a new hole on the bottom level. Finally, we are able to place 31 in the hole. This general strategy is known as a percolate down.


Function to perform delete_min in a binary heap


element_type delete_min( PRIORITY_QUEUE H )
{
    unsigned int i, child;
    element_type min_element, last_element;

    if( is_empty( H ) )
    {
        error("Priority queue is empty");
        return H->elements[0];
    }
    min_element = H->elements[1];
    last_element = H->elements[H->size--];

    for( i = 1; i*2 <= H->size; i = child )
    {
        /* find smaller child */
        child = i*2;
        if( ( child != H->size ) &&
            ( H->elements[child+1] < H->elements[child] ) )
            child++;

        /* percolate one level */
        if( last_element > H->elements[child] )
            H->elements[i] = H->elements[child];
        else
            break;
    }
    H->elements[i] = last_element;
    return min_element;
}

4.3.4. Other Heap Operations
1. Decrease_key
2. Increase_key
3. Delete
4. Build_heap


Decrease_key
The decrease_key(x, D, H) operation lowers the value of the key at position x by a positive amount D. Since this might violate the heap order, it must be fixed by a percolate up. This operation could be useful to system administrators: they can make their programs run with highest priority.

Increase_key
The increase_key(x, D, H) operation increases the value of the key at position x by a positive amount D. This is done with a percolate down. Many schedulers automatically drop the priority of a process that is consuming excessive CPU time.

Delete
The delete(x, H) operation removes the node at position x from the heap. This is done by first performing decrease_key with an arbitrarily large amount, so that the key rises to the root, and then performing delete_min(H). When a process is terminated by a user (instead of finishing normally), it must be removed from the priority queue.

Build_heap
The build_heap(H) operation takes as input n keys and places them into an empty heap. Obviously, this can be done with n successive inserts. Since each insert takes O(1) average and O(log n) worst-case time, the total running time of this algorithm would be O(n) average but O(n log n) worst-case. Since this is a special instruction and there are no other operations intervening, and we already know that the instruction can be performed in linear average time, it is reasonable to expect that with reasonable care a linear time bound can be guaranteed. The general algorithm is to place the n keys into the tree in any order, maintaining the structure property. Then, if percolate_down(i) percolates down from node i, perform the following algorithm.

Sketch of build_heap
for( i = n/2; i > 0; i-- )
    percolate_down( i );
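A self-contained sketch of the percolate_down routine this loop relies on. The Heap struct and field names here are assumptions, loosely following the earlier declarations; elements[1..size] hold the keys:

```c
#include <stddef.h>

/* Assumed minimal heap layout; elements[0] is unused (or a sentinel). */
typedef struct {
    int size;
    int *elements;
} Heap;

static void percolate_down(Heap *h, int i)
{
    int tmp = h->elements[i];
    int child;

    for (; i * 2 <= h->size; i = child) {
        child = i * 2;
        if (child != h->size && h->elements[child + 1] < h->elements[child])
            child++;                          /* pick the smaller child */
        if (h->elements[child] < tmp)
            h->elements[i] = h->elements[child];
        else
            break;
    }
    h->elements[i] = tmp;
}

void build_heap(Heap *h)
{
    for (int i = h->size / 2; i > 0; i--)
        percolate_down(h, i);
}
```

Each call to percolate_down fixes one subtree; running it from n/2 down to 1 establishes heap order everywhere, and the total work sums to O(n).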


CHAPTER 5 APPLICATIONS OF PRIORITY QUEUE Priority queues are used to obtain solutions to two problems. 5.1. The Selection Problem 5.2. Event Simulation


5.1. The Selection Problem


The first problem we will examine is the selection problem from Chapter 1. Recall that the input is a list of n elements, which can be totally ordered, and an integer k. The selection problem is to find the kth largest element. Two algorithms were given in Chapter 1, but neither is very efficient. The first algorithm, which we shall call Algorithm 1A, is to read the elements into an array and sort them, returning the appropriate element. Assuming a simple sorting algorithm, the running time is O(n^2). The alternative algorithm, 1B, is to read k elements into an array and sort them. The smallest of these is in the kth position. We process the remaining elements one by one. As an element arrives, it is compared with the kth element in the array. If it is larger, then the kth element is removed and the new element is placed in the correct place among the remaining k - 1 elements. When the algorithm ends, the element in the kth position is the answer.

Algorithm 6A
For simplicity, we assume that we are interested in finding the kth smallest element. The algorithm is simple. We read the n elements into an array. We then apply the build_heap algorithm to this array. Finally, we'll perform k delete_min operations. The last element extracted from the heap is our answer. It should be clear that by changing the heap order property, we could solve the original problem of finding the kth largest element. The correctness of the algorithm should be clear. The worst-case timing is O(n) to construct the heap, if build_heap is used, and O(log n) for each delete_min. Since there are k delete_mins, we obtain a total running time of O(n + k log n). If k = O(n/log n), then the running time is dominated by the build_heap operation and is O(n).

Algorithm 6B
For the second algorithm, we return to the original problem and find the kth largest element. We use the idea from Algorithm 1B. At any point in time we will maintain a set S of the k largest elements. After the first k elements are read, when a new element is read, it is compared with the kth largest element, which we denote by Sk. Notice that Sk is the smallest element in S. If the new element is larger, then it replaces Sk in S. S will then have a new smallest element, which may or may not be the newly added element. At the end of the input, we find the smallest element in S and return it as the answer. This is essentially the same algorithm described in Chapter 1. Here, however, we will use a heap to implement S. The first k elements are placed into the heap in total time O(k) with a call to build_heap. The time to process each of the remaining elements is O(1), to test if the element goes into S, plus O(log k), to delete Sk and insert the new element if this is necessary. Thus, the total time is O(k + (n - k) log k) = O(n log k).


5.2. Event Simulation


Recall that we have a system, such as a bank, where customers arrive and wait on a line until one of k tellers is available. Customer arrival is governed by a probability distribution function, as is the service time (the amount of time to be served once a teller is available). We are interested in statistics such as how long on average a customer has to wait or how long the line might be. With certain probability distributions and values of k, these answers can be computed exactly. However, as k gets larger, the analysis becomes considerably more difficult, so it is appealing to use a computer to simulate the operation of the bank. In this way, the bank officers can determine how many tellers are needed to ensure reasonably smooth service.

A simulation consists of processing events. The two events here are (a) a customer arriving and (b) a customer departing, thus freeing up a teller. We can use the probability functions to generate an input stream consisting of ordered pairs of arrival time and service time for each customer, sorted by arrival time. We do not need to use the exact time of day. Rather, we can use a quantum unit, which we will refer to as a tick.

Priority Queue - Applications
A Priority Queue can be used in any application using a set of elements of various priorities where only the element of highest priority need be accessed.

Huffman Codes
Huffman codes are used to compress a block of data into a smaller space. The algorithm starts by collecting the frequencies of all the characters in the data block and then processes each one in order of descending frequency - a perfect place to use a Priority Queue.

Dijkstra's Algorithm for Shortest Paths
This graph algorithm always selects the unvisited vertex of lowest path cost from the starting node. The candidate edges can be stored and retrieved in a Priority Queue.

Prim's Algorithm
Prim's is another graph algorithm which can utilize a Priority Queue. It works by always selecting the next connected edge of lowest cost.

CPU Scheduling
A CPU can only run one process at a time, but there may be many jobs of various priorities waiting to be run. A Priority Queue can be used to quickly select the next process to run based upon its priority.


Edit Distance
When using a greedy approach, a Priority Queue can be used to select the next node in the edit distance path.

Artificial Intelligence - Informed Search
When using a search tree, only the next best move from the current state need be considered at any point. In the A* algorithm, the next best move is computed by adding the heuristic function (the estimated cost to reach a goal state) to the path cost from the initial to the current state.

UNIT-III BALANCED TREES
PART A
1. Define AVL Tree.
An AVL tree is a binary search tree except that, for every node in the tree, the heights of the left and right subtrees can differ by at most 1.
2. What are the various transformations performed in an AVL tree?
1. Single rotation - single L rotation, single R rotation
2. Double rotation - LR rotation, RL rotation
3. What is a priority queue?
A priority queue is a data structure that allows at least the following two operations: insert, which does the obvious thing, and delete_min, which finds, returns, and removes the minimum element in the priority queue. The insert operation is the equivalent of enqueue.
4. What are the applications of priority queues?
1. For scheduling purposes in operating systems
2. Used for external sorting
3. Important for the implementation of greedy algorithms, which operate by repeatedly finding a minimum
5. What are the main properties of a binary heap?
1. Structure property
2. Heap order property
6. What are the other heap operations in a Priority Queue?


Decrease_key, Increase_key, Delete, and Build_heap.

7. What are the structural properties of a B-Tree of order M?
Each internal node has at most M children. Each internal node, except the root, has between ⌈M/2⌉ - 1 and M - 1 keys; this guarantees that the tree does not degenerate into a binary tree. The keys at each node are ordered. The root is either a leaf or has between 1 and M - 1 keys. The data items are stored at the leaves, and all leaves are at the same depth. Each leaf has between ⌈L/2⌉ - 1 and L - 1 data items, for some L (usually L << M, but we will assume M = L in most examples).
8. What procedure is followed to insert a new element into a B-Tree?
Suppose that we want to insert a key K and its associated record. Search for the key K using the search procedure; this will bring us to a leaf x. Insert K into x. Splitting of nodes (instead of the rotations used in AVL trees) maintains the properties of B+-trees.
9. What are the 4 different types of rotations followed in Splay Trees?
Zig-Zig Rotation, Zig-Zag Rotation, Zag-Zig Rotation, Zag-Zag Rotation
10. What is a Splay Tree?
The basic idea of the splay tree is that after a node is accessed, it is pushed to the root by a series of AVL tree rotations.
11. What is splaying?
Splaying is the process of pushing the accessed node towards the root.
12. What is a Balance Factor?


The Balance Factor is calculated by subtracting the height of the right subtree from the height of the left subtree.
13. How does an imbalance occur in an AVL Tree?
Case 1: An insertion into the left subtree of the left child of node a.
Case 2: An insertion into the right subtree of the left child of node a.
Case 3: An insertion into the left subtree of the right child of node a.
Case 4: An insertion into the right subtree of the right child of node a.
14. How are nodes declared for AVL Trees?
struct AvlNode
{
    ElementType Element;
    AvlTree Left;
    AvlTree Right;
    int Height;
};
15. What is a B-Tree?
A B-tree is a tree which is not binary and which stores its data only at the leaf nodes, not at the intermediate nodes.
PART B (BIG QUESTIONS)
1. Explain AVL trees in detail. Refer Page No: 1 to 5
2. Explain Splay Trees with an example. Refer Page No: 11 to 15
3. Define B-Tree. How is insertion performed in a B-Tree? Explain with an example. Refer Page No: 19 to 24
4. What is a Priority Queue? What are its basic operations? Explain with an example. Refer Page No: 33 to 39
5. Construct an AVL Tree by inserting elements 1 to 7 and then 16 down to 8. Refer Page No: 6 to 11
6. How is deletion performed in a B-tree? What are the different cases handled? Explain with an example. Refer Page No: 25 to 33

