
ASSIGNMENT ON ADVANCED DATA STRUCTURES

ROLL NO: A07    GROUP: 1    SECTION: K2R21

SUBMITTED TO: SANDEEP SHARMA SIR

SUBMITTED BY: ANKUR SAXENA    COURSE CODE: CAP619


PROGRAM: MCA-MTECH

Q1: Create AVL tree from the following sequence of nodes: 10, 2, 20, 30, 25, 40, 8, 6, 55, 60, 19

ANS: (The hand-drawn tree diagrams were lost in extraction; the trace below reconstructs each step, with balance factors shown in parentheses.)

INSERT 10, INSERT 2, INSERT 20:

        10 (0)
       /     \
    2 (0)   20 (0)

INSERT 30: 30 becomes the right child of 20. Balance factors: 20 is -1, 10 is -1. Still balanced.

INSERT 25: 25 becomes the left child of 30. Node 20 now has balance factor -2 and its right child has balance factor +1: DOUBLE ROTATION (RL case = LL + RR).

PERFORM FIRST ROTATION LL (right rotation at 30), then PERFORM SECOND ROTATION RR (left rotation at 20); 25 takes 20's place:

        10 (-1)
       /     \
    2 (0)   25 (0)
            /    \
        20 (0)  30 (0)

INSERT 40: 40 becomes the right child of 30. The root 10 now has balance factor -2 and its right child has balance factor -1: RR ROTATION (single left rotation at 10); 25 becomes the new root:

           25 (0)
          /      \
      10 (0)    30 (-1)
      /    \        \
   2 (0)  20 (0)   40 (0)

INSERT 8: 8 becomes the right child of 2. Balance factors: 2 is -1, 10 is +1, 25 is +1. Still balanced.

INSERT 6: 6 becomes the left child of 8. Node 2 now has balance factor -2 and its right child has balance factor +1: DOUBLE ROTATION (RL case = LL + RR), i.e. a right rotation at 8 followed by a left rotation at 2; 6 takes 2's place:

           25 (+1)
          /      \
      10 (+1)   30 (-1)
      /    \        \
   6 (0)  20 (0)   40 (0)
   /  \
2 (0)  8 (0)

INSERT 55: 55 becomes the right child of 40. Node 30 now has balance factor -2 and its right child has balance factor -1: RR ROTATION (single left rotation at 30); 40 takes 30's place:

           25 (+1)
          /      \
      10 (+1)    40 (0)
      /    \     /    \
   6 (0)  20   30     55
   /  \
  2    8

INSERT 60: 60 becomes the right child of 55. Balance factors: 55 is -1, 40 is -1, 25 is 0. Still balanced.

INSERT 19: 19 becomes the left child of 20. All balance factors remain within -1..+1, so no rotation is needed.

FINAL AVL TREE:

              25 (0)
            /       \
       10 (0)       40 (-1)
       /    \       /     \
    6 (0) 20 (+1) 30 (0)  55 (-1)
    /  \    /                \
   2    8  19                60

Q2: How are binary search trees different from complete binary trees? Differentiate between the height and depth of a node by taking a suitable example.

ANS:

Binary Search Trees:

A tree is called a binary search tree if each node of the tree has the following property:

The value at a node is greater than every value in its left subtree and less than every value in its right subtree.
[Example figure lost in extraction: a binary search tree over the letter keys A, E, I, L, N, O, P, R, S, T, U.]

Binary search tree: a binary search tree (BST) is a binary tree where each node has a key value that satisfies the following two properties:

A key value occurs only once in the tree (no duplicates).

The left child's key value is always less than its parent's, and the right child's key value is always greater than its parent's.
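As a minimal sketch of how these two properties translate into code (the Node class and function names are illustrative, not taken from any particular library):

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key; duplicates are ignored, since a key occurs only once."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def search(root, key):
    """Return True if key is present; each step discards one subtree."""
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False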

Complete Binary Tree:


A binary tree is said to be complete if all its levels, except possibly the last, have the maximum possible number of nodes, and if all the nodes at the last level appear as far left as possible.
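The "as far left as possible" condition can be checked mechanically with a level-order traversal: once one missing child has been seen, no later slot may hold a node. A sketch, reusing the Node class from the BST example above (the function name is an illustrative choice):

from collections import deque

def is_complete(root):
    """Level-order scan: after the first missing child, every later
    slot must also be empty for the tree to be complete."""
    if root is None:
        return True
    queue = deque([root])
    seen_gap = False
    while queue:
        node = queue.popleft()
        if node is None:
            seen_gap = True
        else:
            if seen_gap:          # a real node after a gap: not complete
                return False
            queue.append(node.left)
            queue.append(node.right)
    return True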

Differentiating between the height and depth of a node: the height and depth of a tree are equal, but the height and depth of a node are generally not equal, because

the height of a node is calculated by traversing from a deepest leaf up to the given node, while its depth is calculated by traversing from the root down to the given node.

Example tree (the original figure was lost in extraction; this reconstruction is consistent with the text, which mentions root 15 and leaf 5):

        15            <- level 1
       /  \
     10    20         <- level 2
     /
    5                 <- level 3

Here the height of the tree is 3, i.e. the branch from leaf 5 up to root 15 contains 3 nodes, and the depth of the tree is likewise 3, measured from the root node 15.
The depth or height of a tree is the maximum level number of the tree, i.e. the maximum number of nodes on a branch of the tree; the maximum depth or height of the tree above is 3. For an individual node the two measures differ: node 20 has depth 2 but height 1.

Recursively calculate the height of the left and right subtrees of a node, and assign the node a height equal to the maximum of the two children's heights plus 1.
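A sketch of that recursion, again reusing the Node class above; to match the node-counting convention used in this answer, height is counted in levels, so a single leaf has height 1:

def height(node):
    """Height in nodes (levels): an empty subtree has height 0,
    a leaf has height 1, as in the convention above."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))

def depth(root, key, d=1):
    """Depth in nodes: the root is at depth (level) 1.
    Returns None if key is not in the tree."""
    if root is None:
        return None
    if root.key == key:
        return d
    found = depth(root.left, key, d + 1)
    return found if found is not None else depth(root.right, key, d + 1)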

Q3: Red-Black trees are another type of self-balancing tree. How is balancing performed in Red-Black trees? Explain by taking an illustrative example. ANS:

Binary-search-tree representation of a 2-3-4 tree:

3-nodes and 4-nodes are represented by equivalent binary trees; red and black child pointers are used to distinguish between original 2-nodes and the 2-nodes that represent 3-nodes and 4-nodes.

Red-Black tree operations: traversals are the same as in binary search trees. Insertion and deletion are analogous to the 2-3-4 tree: 4-nodes need to be split and 2-nodes need to be merged. The cases (each illustrated by a figure in the original) are:

splitting a 4-node that is the root;
splitting a 4-node whose parent is a 2-node;
splitting a 4-node whose parent is a 3-node (three sub-cases, depending on which child of the 3-node the 4-node is).

Red-Black Tree
Till now, you have learned everything about Red-Black Trees. You may not believe this, because we just talked about how to represent 2-3-4 trees with black and red nodes. I'm telling the truth, however, because RB trees ARE 2-3-4 trees. If you doubt it, let's have a look at the definitive description of a Red-Black Tree:

1. A node is either red or black.
2. The root is black. (This rule is sometimes omitted from other definitions. Since the root can always be changed from red to black, but not necessarily vice versa, this rule has little effect on analysis.)
3. All leaves are black.
4. Both children of every red node are black.
5. Every simple path from a given node to any of its descendant leaves contains the same number of black nodes.

(Do not read the Red-Black Tree item in Wikipedia, it is not as good as mine :-) Let's see why RB trees are defined this way:

1. A node is either red or black - red nodes are combined with their parents to form 3-nodes or 4-nodes.
2. The root is black - the root has no parent, so a red root would be meaningless.
3. All leaves are black - note that the "leaves" are nil nodes.
4. Both children of every red node are black - we've explained this, bazinga.
5. Every simple path from a given node to any of its descendant leaves contains the same number of black nodes - because the number of black nodes corresponds to the height of the 2-3-4 tree.

OK, it's time to have a glance at a bigger RB tree:

[Figure lost in extraction: a larger red-black tree.] There are six 3-nodes and one 4-node in this RB tree. The 3-nodes are {3,7}, {10,14}, {15,16}, {17,26}, {19,20}, {30,41}. The 4-node is {35,38,39}.
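Rules 4 and 5 are what the balancing machinery actually maintains: every recoloring and rotation performed after an insertion or deletion exists to restore them. The sketch below (the node layout and names are illustrative assumptions, not any library's API) verifies both rules by computing the black-height of every subtree:

RED, BLACK = "red", "black"

class RBNode:
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color = key, color
        self.left, self.right = left, right

def black_height(node):
    """Black-height of a subtree, or None if a rule is violated below."""
    if node is None:
        return 1                      # nil leaves count as black (rule 3)
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                return None           # rule 4 violated: red child of red node
    lh = black_height(node.left)
    rh = black_height(node.right)
    if lh is None or rh is None or lh != rh:
        return None                   # rule 5 violated: unequal black counts
    return lh + (1 if node.color == BLACK else 0)

def is_red_black(root):
    """Rules 2, 4 and 5: black root, no red-red edge, equal black-heights."""
    root_ok = root is None or root.color == BLACK
    return root_ok and black_height(root) is not None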

Q4: How are the complexities of the insertion, deletion and searching operations in an AVL tree calculated in worst-case scenarios? Also write pseudocode for insertion in an AVL tree.

ANS:

Operations: Basic operations of an AVL tree involve carrying out the same actions as would be carried out on an unbalanced binary search tree, but modifications are preceded or followed by one or more operations called tree rotations, which help to restore the height balance of the subtrees.
1. Searching: Lookup in an AVL tree is performed exactly as in any unbalanced binary search tree. Because of the height-balancing of the tree, a lookup takes O(log n) time. No special actions need to be taken, and the tree's structure is not modified by lookups. (This is in contrast to splay tree lookups, which do modify their tree's structure.) If each node additionally records the size of its subtree (including itself and its descendants), then nodes can also be retrieved by index in O(log n) time.

2. Insertion: After inserting a node, it is necessary to check each of the node's ancestors for consistency with the rules of AVL. The balance factor is calculated as follows: balanceFactor = height(left subtree) - height(right subtree). For each node checked, if the balance factor remains -1, 0, or +1, then no rotations are necessary. However, if the balance factor becomes less than -1 or greater than +1, the subtree rooted at this node is unbalanced. If insertions are performed serially, then after each insertion at most one of the following cases needs to be resolved to restore the entire tree to the rules of AVL. (The usual pictorial description, lost here, shows how rotations cause rebalancing while retracing one's steps toward the root and updating the balance factors: numbered circles represent the nodes being balanced, and lettered triangles represent subtrees which are themselves balanced BSTs.)

There are four cases which need to be considered, two of which are symmetric to the other two. Let P be the root of the unbalanced subtree, with R and L denoting the right and left children of P respectively.

Right-Right case and Right-Left case: If the balance factor of P is -2, then the right subtree outweighs the left subtree of the given node, and the balance factor of the right child (R) must be checked. If the balance factor of R is -1, a single left rotation (with P as the root) is needed (Right-Right case). If the balance factor of R is +1, two different rotations are needed: first a right rotation with R as the root, then a left rotation with P as the root (Right-Left case).

Left-Left case and Left-Right case: If the balance factor of P is +2, then the left subtree outweighs the right subtree of the given node, and the balance factor of the left child (L) must be checked. If the balance factor of L is +1, a single right rotation (with P as the root) is needed (Left-Left case). If the balance factor of L is -1, two different rotations are needed: first a left rotation with L as the root, then a right rotation with P as the root (Left-Right case).

3. Deletion: If the node is a leaf or has only one child, remove it. Otherwise, replace it with either the largest node in its left subtree (its in-order predecessor) or the smallest node in its right subtree (its in-order successor), and remove that node. The node found as a replacement has at most one subtree. After deletion, retrace the path back up the tree (from the parent of the replacement) to the root, adjusting the balance factors as needed. As with all binary trees, a node's in-order successor is the left-most child of its right subtree, and a node's in-order predecessor is the right-most child of its left subtree; in either case, this node has zero or one children, so it can be deleted according to one of the two simpler cases above.

In addition to the balancing described above for insertions, if the balance factor of the tree is +2 and that of the left subtree is 0, a right rotation must be performed on P; the mirror of this case is handled symmetrically. The retracing can stop if the balance factor becomes -1 or +1, indicating that the height of that subtree has remained unchanged. If the balance factor becomes 0, then the height of the subtree has decreased by one and the retracing needs to continue. If the balance factor becomes -2 or +2, then the subtree is unbalanced and needs to be rotated to fix it. If the rotation leaves the subtree's balance factor at 0, then the retracing towards the root must continue, since the height of this subtree has decreased by one. This is in contrast to an insertion, where a rotation resulting in a balance factor of 0 indicates that the subtree's height has remained unchanged.

The time required is O(log n) for lookup, plus a maximum of O(log n) rotations on the way back to the root, so each operation can be completed in O(log n) time in the worst case.

Pseudo code: Insertion (a runnable sketch follows below)

Insertion takes place at a leaf node.
Step 1: Recursively find the position for the new node, insert it, and connect it to its parent.
Step 2: Set the flag (height) of the inserted node.
Step 3: On the way back up, check the balance of each node by testing the flag.
Step 4: If the tree is out of balance, transform it into a balanced one by rotating the subtree rooted at the out-of-balance node.

Check which rotation to apply. Rotations are of four types:
single right rotation: Left-Left case
single left rotation: Right-Right case
double rotation (LR): Left-Right case
double rotation (RL): Right-Left case
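Here is the pseudocode above rendered as runnable Python (the node fields and helper names are illustrative choices): heights are stored in the nodes, balance factors are recomputed while the recursion unwinds toward the root, and the four rotation cases are applied exactly as described:

class AVLNode:
    def __init__(self, key):
        self.key, self.left, self.right, self.height = key, None, None, 1

def h(node):
    return node.height if node else 0

def balance(node):                      # height(left) - height(right)
    return h(node.left) - h(node.right)

def update(node):                       # refresh stored height ("the flag")
    node.height = 1 + max(h(node.left), h(node.right))
    return node

def rotate_right(p):                    # single right rotation: LL case
    left_child = p.left
    p.left = left_child.right
    left_child.right = p
    update(p)
    return update(left_child)

def rotate_left(p):                     # single left rotation: RR case
    right_child = p.right
    p.right = right_child.left
    right_child.left = p
    update(p)
    return update(right_child)

def insert(root, key):
    if root is None:
        return AVLNode(key)             # insertion happens at a leaf
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    update(root)
    bf = balance(root)
    if bf > 1 and balance(root.left) >= 0:      # Left-Left
        return rotate_right(root)
    if bf > 1:                                  # Left-Right double rotation
        root.left = rotate_left(root.left)
        return rotate_right(root)
    if bf < -1 and balance(root.right) <= 0:    # Right-Right
        return rotate_left(root)
    if bf < -1:                                 # Right-Left double rotation
        root.right = rotate_right(root.right)
        return rotate_left(root)
    return root

root = None
for k in [10, 2, 20, 30, 25, 40, 8, 6, 55, 60, 19]:
    root = insert(root, k)
print(root.key)                         # 25: the root of Q1's final tree

Inserting the Q1 sequence this way reproduces the final tree from Q1, with 25 at the root.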

Q5: Why are various notations used in defining algorithmic complexity? List and explain the purpose of using Big-O, Little-o and Omega notations. ANS: Algorithmic complexity is concerned with how fast or slow a
particular algorithm performs. We define complexity as a numerical function T(n) - time versus the input size n. We want to define the time taken by an algorithm without depending on implementation details, but you will agree that T(n) does depend on the implementation! A given algorithm will take different amounts of time on the same inputs depending on such factors as processor speed, instruction set, disk speed, brand of compiler, and so on. The way around this is to estimate the efficiency of each algorithm asymptotically. We measure the time T(n) as the number of elementary "steps" (defined in any way), provided each such step takes constant time. Let us consider a classical example: addition of two integers. We add two integers digit by digit (or bit by bit), and this defines a "step" in our computational model. Therefore, we say that addition of two n-bit integers takes n steps. Consequently, the total computational time is T(n) = c * n, where c is the time taken by the addition of two bits. On different computers, the addition of two bits might take different amounts of time, say c1 and c2; thus the addition of two n-bit integers takes T(n) = c1 * n and T(n) = c2 * n respectively. This shows that different machines result in different slopes, but the time T(n) grows linearly as the input size increases.

The process of abstracting away details and determining the rate of resource usage in terms of the input size is one of the fundamental ideas in computer science.

Asymptotic Notations
The goal of computational complexity is to classify algorithms according to their performance. We will represent the time function T(n) using the "big-O" notation to express an algorithm's runtime complexity. For example, the statement T(n) = O(n^2) says that an algorithm has quadratic time complexity.
Definition of "big Oh"

For any monotonic functions f(n) and g(n) from the positive integers to the positive integers, we say that f(n) = O(g(n)) when there exist constants c > 0 and n0 > 0 such that f(n) ≤ c * g(n) for all n ≥ n0. Intuitively, this means that the function f(n) does not grow faster than g(n), or that the function g(n) is an upper bound for f(n), for all sufficiently large n. [Graphic representation of the f(n) = O(g(n)) relation lost in extraction.]

Examples:

1 = O(n)
n = O(n^2)
log(n) = O(n)
2n + 1 = O(n)

The "big-O" notation is not symmetric: n = O(n^2) but n^2 ≠ O(n).


Constant Time: O(1)

An algorithm is said to run in constant time if it requires the same amount of time regardless of the input size. Examples:

array: accessing any element
fixed-size stack: push and pop methods
fixed-size queue: enqueue and dequeue methods

Linear Time: O(n)

An algorithm is said to run in linear time if its execution time is directly proportional to the input size, i.e. time grows linearly as the input size increases. Examples:

array: linear search, traversing, finding the minimum
ArrayList: contains method
queue: contains method

Logarithmic Time: O(log n)


An algorithm is said to run in logarithmic time if its execution time is proportional to the logarithm of the input size. Example:

binary search

Recall the "twenty questions" game - the task is to guess the value of a hidden number in an interval. Each time you make a guess, you are told whether your guess iss too high or too low. Twenty questions game imploies a strategy that uses your guess number to halve the interval size. This is an example of the general problem-solving method known as binary search: locate the element a in a sorted (in ascending order) array by first comparing a with the middle element and then (if they are not equal) dividing the array into two subarrays; if a is less than the middle element you repeat the whole procedure in the left subarray, otherwise - in the right subarray. The procedure repeats until a is found or subarray is a zero dimension.

Note that log(n) < n for all n ≥ 1. Algorithms that run in O(log n) time do not need to examine the whole input.
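A sketch of the binary search just described, written iteratively over a sorted Python list (the names are illustrative):

def binary_search(a, target):
    """Return an index of target in the ascending list a, or -1 if absent.
    Each comparison halves the remaining interval: O(log n) steps."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == target:
            return mid
        elif a[mid] < target:
            lo = mid + 1              # target can only be in the right half
        else:
            hi = mid - 1              # target can only be in the left half
    return -1

print(binary_search([2, 6, 8, 10, 19, 20, 25, 30, 40, 55, 60], 19))   # 4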

Quadratic Time: O(n^2)


An algorithm is said to run in quadratic time if its execution time is proportional to the square of the input size. Examples:

bubble sort, selection sort, insertion sort

Definition of "little o ":

In Little-o, it must be that there is a minimum x after which the inequality holds no matter how small you make k, as long as it is not negative or zero.

Little-o notation The relation is read as " f( x) is little-o of g( x)". Intuitively, it means that g(x) grows much faster than f( x). It assumes that f and g are both functions of one variable. Formally, it states that the limit of f(x) /g( x) is zero, as x approaches infinity. For algebraically defined functions f(x) and g(x), is generally found using L' Hpital's rule. For example, Little-o notation is common in mathematics but rarer in computer science. In computer science the variable ( and function value) is most often a natural number. In math, the variable and function values are often real numbers. The following properties can be useful:
The following are true for little-o:
x^2 = o(x^3) x^2 = o(x!) ln(x) = o(x)

Note that if f = o(g), this implies f = O(g); e.g. x^2 = o(x^3), so it is also true that x^2 = O(x^3).
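The first of the examples above follows directly from the limit definition:

\[
\lim_{x \to \infty} \frac{x^2}{x^3} \;=\; \lim_{x \to \infty} \frac{1}{x} \;=\; 0,
\qquad\text{hence } x^2 = o(x^3).
\]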

Definition of "big Omega"


We also need a notation for lower bounds; a capital omega (Ω) notation is used in this case. We say that f(n) = Ω(g(n)) when there exists a constant c > 0 such that f(n) ≥ c * g(n) for all sufficiently large n. Examples:

n = Ω(1)
n^2 = Ω(n)
n^2 = Ω(n log(n))
2n + 1 = Ω(n)
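As a worked instance, the third example holds with the constant c = 1, since n ≥ log(n) for all n ≥ 1:

\[
n^2 \;=\; n \cdot n \;\ge\; 1 \cdot n \log n \quad \text{for all } n \ge 1,
\qquad\text{hence } n^2 = \Omega(n \log n).
\]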

Q6: Perform a comparative study of various sorting algorithms and list which algorithm is best suited to which scenario.

ANS:

Most sorting algorithms work by comparing the data being sorted. In some cases, it may be desirable to sort a large chunk of data (for instance, a struct containing a name and address) based on only a portion of that data. The piece of data actually used to determine the sorted order is called the key. Sorting algorithms are usually judged by their efficiency. In this case, efficiency refers to the algorithmic efficiency as the size of the input grows large, and is generally based on the number of elements to sort. Most of the algorithms in use have an algorithmic efficiency of either O(n^2) or O(n*log(n)). A few special-case algorithms (one example is mentioned in Programming Pearls) can sort certain data sets faster than O(n*log(n)); these algorithms are not based on comparing the items being sorted and rely on special properties of the data. It has been shown that no key-comparison algorithm can perform better than O(n*log(n)).

Many algorithms that have the same efficiency do not have the same speed on the same input. First, algorithms must be judged based on their average-case, best-case, and worst-case efficiency. Some algorithms, such as quick sort, perform exceptionally well for some inputs but horribly for others. Other algorithms, such as merge sort, are unaffected by the order of the input data. Even a modified version of bubble sort can finish in O(n) for the most favorable inputs.

A second factor is the "constant term". As big-O notation abstracts away many of the details of a process, it is quite useful for looking at the big picture. But one thing that gets dropped out is the constant in front of the expression: for instance, O(c*n) is just O(n). In the real world, the constant c will vary across different algorithms. A well-implemented quicksort should have a much smaller constant multiplier than heap sort.

A second criterion for judging algorithms is their space requirement -- do they require scratch space, or can the array be sorted in place (without additional memory beyond a few variables)? Some algorithms never require extra space, whereas some are most easily understood when implemented with extra space (heap sort, for instance, can be done in place, but conceptually it is much easier to think of a separate heap). Space requirements may even depend on the data structure used (merge sort on arrays versus merge sort on linked lists, for instance).

A third criterion is stability -- does the sort preserve the order of keys with equal values? Most simple sorts do just this, but some sorts, such as heap sort, do not.

The following chart compares sorting algorithms on the various criteria outlined above; the algorithms with higher constant terms appear first, though this is clearly an implementation-dependent concept and should only be taken as a rough guide when picking between sorts of the same big-O efficiency.

Sort                  Average      Best         Worst        Space     Stable?     Remarks
Bubble sort           O(n^2)       O(n^2)       O(n^2)       Constant  Stable      Always use a modified bubble sort instead
Modified bubble sort  O(n^2)       O(n)         O(n^2)       Constant  Stable      Stops after reaching a sorted array
Selection sort        O(n^2)       O(n^2)       O(n^2)       Constant  Stable      Even a perfectly sorted input requires scanning the entire array
Insertion sort        O(n^2)       O(n)         O(n^2)       Constant  Stable      In the best case (already sorted), every insert takes constant time
Heap sort             O(n*log(n))  O(n*log(n))  O(n*log(n))  Constant  Not stable  By using the input array as storage for the heap, it is possible to achieve constant space
Merge sort            O(n*log(n))  O(n*log(n))  O(n*log(n))  Depends   Stable      On arrays, merge sort requires O(n) space; on linked lists, constant space
Quicksort             O(n*log(n))  O(n*log(n))  O(n^2)       Constant  Not stable  Randomly picking a pivot value (or shuffling the array prior to sorting) can help avoid worst-case scenarios such as a perfectly sorted array
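A sketch of the random-pivot idea from the quicksort row (a functional version for clarity: it trades the table's constant-space entry for O(n) scratch lists, where an in-place partition would be used in practice):

import random

def quicksort(a):
    """Expected O(n*log(n)); the random pivot makes the O(n^2)
    worst case vanishingly unlikely, even on already-sorted input."""
    if len(a) <= 1:
        return a
    pivot = random.choice(a)
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return quicksort(less) + equal + quicksort(greater)

print(quicksort([10, 2, 20, 30, 25, 40, 8, 6, 55, 60, 19]))

As a rough scenario guide drawn from the chart: insertion sort suits small or nearly sorted inputs, merge sort suits linked lists and cases where stability matters, heap sort suits cases needing guaranteed O(n*log(n)) time with constant space, and randomized quicksort is the usual default for in-memory arrays.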