
CMSC 341 Data Structures B-Tree Review

These questions will help test your understanding of the B-Tree material discussed in class and in the text. These questions are only a study guide. Questions found here may be on your exam, although perhaps in a different format. Questions NOT found here may also be on your exam.

The following figure (also figure 3) shows a B-Tree with M = 4 and L = 3.
- The root node can have between 2 and M = 4 subtrees.
- Each other interior node can have between ceil(M / 2) = ceil(4 / 2) = 2 and M = 4 subtrees and up to M - 1 = 3 keys.
- Each exterior node (leaf) can hold between ceil(L / 2) = ceil(3 / 2) = 2 and L = 3 data elements.
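The occupancy limits above follow mechanically from M and L. A minimal sketch of that arithmetic, where the function name `btree_bounds` and the dictionary keys are illustrative and not from the course material:

```python
from math import ceil

def btree_bounds(M, L):
    """Return the occupancy limits implied by order M and leaf capacity L."""
    return {
        "root_subtrees": (2, M),                 # root (when it is not a leaf)
        "interior_subtrees": (ceil(M / 2), M),   # every other interior node
        "max_keys": M - 1,                       # keys in an interior node
        "leaf_items": (ceil(L / 2), L),          # data elements per leaf
    }

print(btree_bounds(4, 3))
```

For M = 4 and L = 3 this reproduces the ranges listed above: 2 to 4 subtrees, up to 3 keys, and 2 to 3 data elements per leaf.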

1. Define B-Tree. List all B-Tree properties.

The B-tree is a generalization of a binary search tree in which a node can have more than two children and all of the data is stored in the leaves. It is usually used with large datasets to minimize disk accesses.

A B-Tree-like structure has these properties:
- data is stored only at the leaves, and all leaves are at the same level
- interior and exterior nodes have different structure
- in the simplest (binary) case, interior nodes store one key and two subtree pointers
- in that case all search paths have the same length: ceil(lg n) (assuming one element per leaf)
- a leaf can store multiple data elements

A B-Tree of order M is an M-way tree with the following constraints:
    1. The root is either a leaf or has between 2 and M subtrees.
    2. All interior nodes (except maybe the root) have between ceil(M / 2) and M subtrees (i.e. each interior node is at least half full).
    3. All leaves are at the same level.
    4. A leaf must store between ceil(L / 2) and L data elements, where L is a fixed constant >= 1 (i.e. each leaf is at least half full, except when the tree has fewer than L / 2 elements).

2. What does it mean to say a B-Tree is order M?
Each interior node has up to M subtree pointers and up to M - 1 keys.

3. When describing a B-Tree, what does L represent?
L is the maximum number of data records that can be stored in each leaf.

4. Give the pseudo-code for finding a particular element in a B-Tree of order M.

function search(record r)
    u := root
    while (u is not a leaf) do
        choose the correct pointer in the node
        move to the node that the pointer leads to
        u := current node
    scan u for r
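The search pseudo-code can be sketched in Python as follows. The `Leaf` and `Interior` classes are hypothetical, and the sketch assumes one particular key convention (keys[i] is the smallest value stored under children[i + 1]); other texts may place the separators differently.

```python
from bisect import bisect_right

class Leaf:
    def __init__(self, items):
        self.items = sorted(items)     # data elements, kept in order

class Interior:
    def __init__(self, keys, children):
        # Assumed convention: keys[i] is the smallest value under children[i + 1].
        self.keys = keys
        self.children = children

def search(root, r):
    """Return True if r is stored in the tree rooted at root."""
    u = root
    while isinstance(u, Interior):
        # "Choose the correct pointer": count how many keys are <= r.
        u = u.children[bisect_right(u.keys, r)]
    return r in u.items                # "scan u for r"

# A small tree with M = 4 and L = 3 (values chosen for illustration).
root = Interior([10, 20], [Leaf([1, 5]), Leaf([10, 12, 15]), Leaf([20, 25])])
print(search(root, 12))   # True
print(search(root, 7))    # False
```

Each loop iteration descends one level, so the number of nodes visited equals the height of the tree plus one.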

5. Given the drawing of a B-Tree, show the new B-Tree after inserting a given element.

Insertion
Perform a search to determine what bucket the new record should go into.
- If the bucket is not full, add the record.
- Otherwise, split the bucket.
    o Allocate a new leaf and move half the bucket's elements to the new bucket.
    o Insert the new leaf's smallest key and address into the parent.
    o If the parent is full, split it too, adding its middle key to its own parent.
    o Repeat until a parent is found that need not split.
- If the root splits, create a new root which has one key and two pointers.

The same procedure, stated with disk writes:
- Search to find the leaf into which X should be inserted.
- If the leaf has room (fewer than L elements), insert X and write the leaf back to the disk.
- If the leaf is full, split it into two leaves, each with half of the elements. Insert X into the appropriate new leaf and write the new leaves back to the disk.
- Update the keys in the parent.
- If the parent node is already full, split it in the same manner.
- Splits may propagate all the way to the root, in which case the root is split (this is how the tree grows in height).
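The insertion steps can be sketched in memory as below. This is a simplified illustration, not the course's reference implementation: it ignores disk I/O and duplicate keys, the class and function names are hypothetical, and it assumes the convention that keys[i] is the smallest value under children[i + 1]. M = 3 and L = 3 are example values.

```python
from bisect import bisect_right, insort

M, L = 3, 3   # order and leaf capacity (example values)

class Leaf:
    def __init__(self, items):
        self.items = items

class Interior:
    def __init__(self, keys, children):
        self.keys = keys            # keys[i] separates children[i] and children[i + 1]
        self.children = children

def _insert(node, x):
    """Insert x below node. Return (key, new_right_node) if node split, else None."""
    if isinstance(node, Leaf):
        insort(node.items, x)
        if len(node.items) <= L:
            return None
        mid = len(node.items) // 2              # leaf overflowed: split in half
        right = Leaf(node.items[mid:])
        node.items = node.items[:mid]
        return right.items[0], right            # new leaf's smallest key goes up
    i = bisect_right(node.keys, x)
    split = _insert(node.children[i], x)
    if split is None:
        return None
    key, right = split
    node.keys.insert(i, key)
    node.children.insert(i + 1, right)
    if len(node.children) <= M:
        return None
    mid = len(node.children) // 2               # interior overflowed: split
    up = node.keys[mid - 1]                     # middle key moves to the parent
    new_right = Interior(node.keys[mid:], node.children[mid:])
    node.keys = node.keys[:mid - 1]
    node.children = node.children[:mid]
    return up, new_right

def insert(root, x):
    """Insert x and return the (possibly new) root."""
    if root is None:
        return Leaf([x])
    split = _insert(root, x)
    if split is None:
        return root
    key, right = split
    return Interior([key], [root, right])       # root split: tree grows in height

root = None
for x in [1, 3, 5, 7, 9, 11, 6]:
    root = insert(root, x)
print(root.keys, [c.items for c in root.children])
# [5, 9] [[1, 3], [5, 6, 7], [9, 11]]
```

Returning the split information to the caller, rather than storing parent pointers, is one design choice among several; it keeps the propagation of splits toward the root explicit in the recursion.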

6. Given the drawing of a B-Tree, show the new B-Tree after deleting a given element.

Deletion

- Start at the root and find the leaf where the entry belongs. Remove the entry.
    o If the leaf is still at least half full, done!
    o If the leaf has fewer entries than it should, try to redistribute, borrowing from a sibling (an adjacent node with the same parent as the leaf).
    o If redistribution fails, merge the leaf and the sibling.
- If a merge occurred, delete the entry (pointing to the leaf or the sibling) from the parent of the leaf.
- A merge could propagate to the root, decreasing the height.

The same procedure, stated with disk writes:
- Find the leaf containing the element to be deleted.
- If that leaf is still full enough (still has at least ceil(L / 2) elements after the removal), write it back to disk without that element. Then change the key in the ancestor if necessary.
- If the leaf is now too empty (has fewer than ceil(L / 2) elements), borrow an element from a neighbor.
    o If the neighbor would then be too empty, combine the two leaves into one.
    o This combining requires updating the parent, which may now have too few subtrees.
    o If necessary, continue the combining up the tree.
- Does it matter which neighbor we borrow from?
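The borrow-or-merge decision at the leaf level can be sketched as follows. This is a partial illustration only: it handles a single parent whose children are leaves and does not propagate merges further up the tree. The names (`delete_below`, `MIN_ITEMS`) are hypothetical, it arbitrarily tries the left sibling before the right one, and it assumes keys[i] is the smallest value under children[i + 1].

```python
from bisect import bisect_right
from math import ceil

L = 3                      # leaf capacity (example value)
MIN_ITEMS = ceil(L / 2)    # fewest elements a leaf may hold

class Leaf:
    def __init__(self, items):
        self.items = items

class Interior:
    def __init__(self, keys, children):
        self.keys = keys
        self.children = children

def delete_below(parent, x):
    """Delete x from a leaf child of parent. Return True if two leaves were merged."""
    i = bisect_right(parent.keys, x)
    leaf = parent.children[i]
    leaf.items.remove(x)                         # raises ValueError if x is absent
    if len(leaf.items) >= MIN_ITEMS:             # still full enough
        if i > 0:
            parent.keys[i - 1] = leaf.items[0]   # refresh the separator key
        return False
    if i > 0 and len(parent.children[i - 1].items) > MIN_ITEMS:
        leaf.items.insert(0, parent.children[i - 1].items.pop())   # borrow from left
        parent.keys[i - 1] = leaf.items[0]
        return False
    if i + 1 < len(parent.children) and len(parent.children[i + 1].items) > MIN_ITEMS:
        right = parent.children[i + 1]
        leaf.items.append(right.items.pop(0))                      # borrow from right
        parent.keys[i] = right.items[0]
        if i > 0:
            parent.keys[i - 1] = leaf.items[0]
        return False
    j = i - 1 if i > 0 else i                    # merge children j and j + 1
    left, right = parent.children[j], parent.children[j + 1]
    left.items.extend(right.items)
    del parent.children[j + 1]
    del parent.keys[j]                           # parent loses one key and one subtree
    return True

parent = Interior([5, 9], [Leaf([1, 3]), Leaf([5, 6, 7]), Leaf([9, 11])])
delete_below(parent, 3)    # leftmost leaf underflows and borrows 5 from its right sibling
print(parent.keys, [c.items for c in parent.children])
# [6, 9] [[1, 5], [6, 7], [9, 11]]
```

When the function returns True, the caller would need to check whether the parent itself now has too few subtrees and repeat the borrow-or-merge step one level up.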

7. Draw a valid B-Tree with M = 4 and L = 3 containing the integer values 1 through 25.

8. Show the result of inserting the elements 1, 3, 5, 7, 9, 11, 6 into an initially empty B-Tree with M = 3 and L = 3. Show the tree at the end of each insertion.

L=2

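9. Given the following characteristics of an external storage problem, design a suitable B-Tree (i.e. calculate appropriate values of M and L).
a. The number of items to be stored
b. The size (in bytes) of the key for each item
c. The size (in bytes) of each item
d. The size (in bytes) of a disk block

L = floor(disk block size / size of each data item)
M = floor((block size + key size) / (4 + key size)), where 4 is taken to be the size in bytes of a subtree pointer. This follows from requiring that M - 1 keys and M pointers fit in one block: (M - 1) * key size + 4 * M <= block size.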
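The two formulas translate directly into integer division. A minimal sketch, where the function name `design_btree`, the `pointer_size` parameter, and the sample sizes are assumptions chosen for illustration:

```python
def design_btree(block_size, key_size, item_size, pointer_size=4):
    """Return (M, L) so that one interior node and one leaf each fit in a disk block."""
    # A leaf holds as many whole data items as fit in a block.
    L = block_size // item_size
    # An interior node holds M pointers and M - 1 keys:
    # (M - 1) * key_size + M * pointer_size <= block_size
    M = (block_size + key_size) // (pointer_size + key_size)
    return M, L

# Hypothetical sizes: 8192-byte blocks, 32-byte keys, 256-byte items.
print(design_btree(8192, 32, 256))   # (228, 32)
```

With these sample sizes, 227 keys and 228 pointers occupy 227 * 32 + 228 * 4 = 8176 bytes, which fits in the 8192-byte block.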

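10. What is the minimum and maximum number of leaves in a B-Tree of height h = 2 when M = 3?
Taking height as the number of edges on the path from the root to a leaf:
Max leaves = M^h = 3^2 = 9 (every interior node has M subtrees)
Min leaves = 2 * ceil(M / 2)^(h - 1) = 2 * 2 = 4 (the root has 2 subtrees and every other interior node has ceil(M / 2) = 2)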
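The leaf counts for question 10 can be checked from the subtree limits alone. A small sketch, assuming that height counts the edges from the root to a leaf and that the root may have as few as 2 subtrees; the function name `leaf_range` is illustrative:

```python
from math import ceil

def leaf_range(M, h):
    """Return (min_leaves, max_leaves) for a B-Tree of order M and height h."""
    if h == 0:
        return 1, 1                            # the root is itself the only leaf
    max_leaves = M ** h                        # every interior node has M subtrees
    # Root has 2 subtrees; each interior node below it has ceil(M / 2).
    min_leaves = 2 * ceil(M / 2) ** (h - 1)
    return min_leaves, max_leaves

print(leaf_range(3, 2))
```

Under a different definition of height (for example, counting levels rather than edges) the same function applies with h shifted by one.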

11. The average case performance of the dictionary operations insert, find and delete is O(lg N) for balanced binary search trees like Red-Black trees. In a B-Tree, the average asymptotic performance for the dictionary operations is O(log_M N) where M is the order of the B-Tree. Discuss the following.

a. When M = 2, do the B-Tree and the RB Tree have equivalent asymptotic performance for the dictionary operations? Are there advantages of one over the other?

b. B-Tree height is proportional to log_M N, indicating that for a given N, a B-Tree of higher order will be shorter than one of lower order. Is this true? If so, why not always choose a very high value for M, since the average asymptotic performance of the dictionary operations is in O(height)?
It will have more disk reads: once M is so large that a node no longer fits in one disk block, visiting a single node takes several reads.

c. B-Trees find their greatest utility when data are stored externally (on disk rather than in memory). Why is this so?
It allows them to store large amounts of data while limiting accesses to the hard drive.
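To make the effect of M on height concrete, the sketch below counts how many levels of M-way branching are needed to reach N leaves in the best case (every node completely full). The function name `levels_needed` and the sample values of N and M are illustrative assumptions.

```python
def levels_needed(n, M):
    """Smallest h such that M**h >= n, i.e. ceil(log_M n) computed with integers."""
    h, capacity = 0, 1
    while capacity < n:
        capacity *= M    # one more level multiplies the reachable leaves by M
        h += 1
    return h

for M in (2, 4, 228):
    print(M, levels_needed(1_000_000, M))
# 2 20
# 4 10
# 228 3
```

If each level costs one disk access, going from M = 2 to M = 228 cuts the accesses per search from about 20 to about 3, which is the advantage discussed in part c, provided a node of that order still fits in one disk block.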
