# BST Data Structure

BST Data Structure
A BST node contains a key (used to search), the data associated with that key, and pointers to its children and parent. Leaf nodes have NULL pointers for children. A BST contains a pointer to the root of the tree.
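
The node and tree layout described above could be sketched in C++ roughly as follows. The field names (`key`, `data`, `left`, `right`, `parent`, `root`) and the choice of `int` keys with `std::string` data are assumptions made for illustration; the slides do not fix them.

```cpp
#include <string>
#include <utility>

// Hypothetical node layout; names and types are illustrative assumptions.
struct Node {
    int key;                 // key used to search
    std::string data;        // data associated with that key
    Node* left = nullptr;    // both child pointers are null for a leaf
    Node* right = nullptr;
    Node* parent = nullptr;  // null for the root

    Node(int k, std::string d) : key(k), data(std::move(d)) {}
};

// The tree itself only needs a pointer to the root.
struct BST {
    Node* root = nullptr;    // null means the tree is empty
};
```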

BST Operations: Insert
The BST property must be maintained. Algorithm sketch, to insert data with key k: compare k to root.key; if k < root.key, go left; if k > root.key, go right. Repeat until you reach a NULL child pointer. That's where the new node should be inserted, as a new leaf. Note: keep track of the prospective parent along the way.
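
A minimal sketch of this insert, assuming integer keys and a node with `left`, `right` and `parent` pointers (the names are illustrative). Ignoring a key that is already present is an assumption of this sketch; duplicate handling is discussed separately in the slides.

```cpp
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
    explicit Node(int k) : key(k) {}
};

// Inserts key k and returns the (possibly new) root of the tree.
// Assumption: a key that is already present is silently ignored.
Node* insert(Node* root, int k) {
    Node* parent = nullptr;    // prospective parent of the new node
    Node* cur = root;
    while (cur != nullptr) {   // walk down until we fall off the tree
        parent = cur;
        if (k < cur->key)      cur = cur->left;
        else if (k > cur->key) cur = cur->right;
        else                   return root;   // key already in the tree
    }
    Node* node = new Node(k);  // freeing nodes is omitted in this sketch
    node->parent = parent;
    if (parent == nullptr) return node;       // the tree was empty
    if (k < parent->key) parent->left = node;
    else                 parent->right = node;
    return root;
}
```

The loop is the search part of the operation; the last few lines are the constant-time physical insertion.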

BST Operations: Insert
Running time: The new node is inserted at a leaf position, so this depends on the height of the tree. Worst case: inserting keys 1, 2, 3, ... in this order will result in a tree that looks like a chain (1 at the root, 2 as its right child, 3 as the right child of 2, and so on). The tree has degenerated to a list and its height is linear. Note also that such a tree is worse than a linked list, since it takes up more space (more pointers).

BST Operations: Insert
Running time: The new node is inserted at a leaf position, so this depends on the height of the tree. Best case: the top levels of the tree are filled up completely. The height is then $\log n$, where n is the number of nodes in the tree. [Figure: example BST with keys 12, 4, 14, 2, 8, 16]

BST Operations: Insert
The height of a complete (i.e. all levels filled up) BST with n nodes is logarithmic. Why? Level $i$ has $2^i$ nodes, for $i = 0$ (top level) through $h$ (= height). The total number of nodes, $n$, is then:

$$n = \sum_{i=0}^{h} 2^i = \frac{2^{h+1} - 1}{2 - 1} = 2^{h+1} - 1$$

Solving for $h$ gives us $h = \log_2(n + 1) - 1 \approx \log n$.
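
To make the two extremes concrete, the sketch below builds a tree from a given insertion order and measures its height (counted in edges, with an empty tree at -1, which is a convention assumed here). The helper names `insert`, `height` and `build` are illustrative.

```cpp
#include <algorithm>
#include <vector>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Plain recursive BST insert; duplicate keys are ignored (an assumption).
Node* insert(Node* root, int k) {
    if (root == nullptr) return new Node(k);
    if (k < root->key)      root->left = insert(root->left, k);
    else if (k > root->key) root->right = insert(root->right, k);
    return root;
}

// Height in edges: -1 for an empty tree, 0 for a single node.
int height(const Node* n) {
    if (n == nullptr) return -1;
    return 1 + std::max(height(n->left), height(n->right));
}

// Builds a tree by inserting the keys in the given order.
// Nodes are never freed in this short sketch.
Node* build(const std::vector<int>& keys) {
    Node* root = nullptr;
    for (int k : keys) root = insert(root, k);
    return root;
}
```

With 7 keys, the sorted order 1, 2, ..., 7 gives a chain of height 6, while the order 4, 2, 6, 1, 3, 5, 7 fills every level and gives height 2, matching $\log_2(7 + 1) - 1 = 2$.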

BST Operations: Insert
Analysis conclusion: An insert operation consists of two parts. Search for the position: best case logarithmic, worst case linear. Physically insert the node: constant.

BST Operations: Insert
What if we allow duplicate keys? Idea #1: Always insert in the right subtree. This results in a very unbalanced tree. Idea #2: Insert in alternate subtrees. This makes it difficult to search for all occurrences. Idea #3: All elements with the same key are inserted in a single node. Good idea! It is easy to search and does not affect balance any more than non-duplicate insertion.
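
Idea #3 might look like the sketch below, where each node keeps a list of every element that shares its key. Storing the elements in a `std::vector<std::string>` named `items` is an assumption for illustration; a simple counter would work when the elements carry no extra data.

```cpp
#include <string>
#include <vector>

struct Node {
    int key;
    std::vector<std::string> items;  // every element inserted under this key
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Inserts item under key k; an equal key reuses the existing node.
Node* insert(Node* root, int k, const std::string& item) {
    if (root == nullptr) {
        root = new Node(k);          // freeing nodes is omitted in this sketch
        root->items.push_back(item);
        return root;
    }
    if (k < root->key)      root->left = insert(root->left, k, item);
    else if (k > root->key) root->right = insert(root->right, k, item);
    else                    root->items.push_back(item);  // duplicate key
    return root;
}
```

The shape of the tree depends only on the distinct keys, so duplicates cannot make it any less balanced.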

BST Operations: Insert
What if we allow a variable number of children (an n-ary tree)? Idea: Use a vector/list of pointers to children.
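
One possible shape for such a node, using `std::vector` for the child pointers. The names `NaryNode`, `add_child` and `count_nodes` are hypothetical helpers introduced only for this sketch.

```cpp
#include <cstddef>
#include <vector>

struct NaryNode {
    int key;
    std::vector<NaryNode*> children;  // any number of children, possibly none
    explicit NaryNode(int k) : key(k) {}
};

// Appends a new child with key k to parent and returns it.
// Freeing nodes is omitted in this sketch.
NaryNode* add_child(NaryNode* parent, int k) {
    NaryNode* child = new NaryNode(k);
    parent->children.push_back(child);
    return child;
}

// Counts the nodes in the subtree rooted at n.
std::size_t count_nodes(const NaryNode* n) {
    if (n == nullptr) return 0;
    std::size_t total = 1;
    for (const NaryNode* c : n->children) total += count_nodes(c);
    return total;
}
```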

BST Operations: Search
Take advantage of the BST property. Algorithm sketch: compare the target to the root; if equal, return success; if target < root, search left; if target > root, search right. Running time: similar to insert.
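
The search sketch translates almost line for line into an iterative function. The version below assumes integer keys and returns a pointer to the matching node, or null when the target is absent; both choices are illustrative.

```cpp
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Returns the node holding target, or nullptr if it is not in the tree.
const Node* search(const Node* root, int target) {
    const Node* cur = root;
    while (cur != nullptr && cur->key != target) {
        // The BST property tells us which subtree can contain the target.
        cur = (target < cur->key) ? cur->left : cur->right;
    }
    return cur;
}
```

Each iteration moves one level down, so the cost is bounded by the height of the tree, just as for insert.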

BST Operations: Delete
The Delete operation consists of two parts. Search for the node to be deleted: best case constant (deleting the root), worst case linear. Delete the node: best case? worst case?

BST Operations: Delete
CASE #1: The node to be deleted is a leaf node. Easy! Physically remove the node. Constant time: we are just resetting its parent's child pointer and deallocating memory.

BST Operations: Delete
CASE #2: The node to be deleted has exactly one child. Easy! Physically remove the node. Constant time: we are just resetting its parent's child pointer and its child's parent pointer, and deallocating memory.

BST Operations: Delete
CASE #3: The node to be deleted has two children. Not so easy. If we physically delete the node, we'll have to place its two children somewhere. This seems to require too much tree restructuring. But we know it's easy to delete a node that has at most one child. What if we find such a node whose contents can be copied over without violating the BST property and then physically delete that node?

BST Operations: Delete
CASE #3, continued: The node to be deleted, x, has two children. Idea: Find x's immediate successor, y. It is guaranteed to have at most one child. Copy y's contents over to x. Physically delete y.

BST Operations: Delete
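Finding the immediate successor: We know that the node has two children. Due to the BST property, the immediate successor will be in the right subtree. In particular, the immediate successor will be the smallest element in the right subtree. The smallest element in a BST is always the leftmost node, reached by following left pointers from the root of that subtree until there is no left child. That node is not necessarily a leaf, since it may still have a right child, which is why the successor has at most one child.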
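
A small sketch of that walk, assuming integer keys. `minimum` and `successor_of_two_child_node` are hypothetical helper names; the second one relies on the precondition that the node has a right child.

```cpp
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Leftmost node of the subtree rooted at n. Precondition: n is not null.
Node* minimum(Node* n) {
    while (n->left != nullptr) n = n->left;
    return n;
}

// Immediate successor of x, valid only when x has a right child
// (which is always true in delete case #3).
Node* successor_of_two_child_node(Node* x) {
    return minimum(x->right);
}
```

The node returned never has a left child, but it may have a right child, so deleting it falls under case #1 or case #2.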

BST Operations: Delete
Finding the immediate successor: Since it requires traveling down the tree from the current node, following left pointers in its right subtree, it may take up to linear time in the worst case (a degenerate tree). In a balanced tree it takes at most logarithmic time, and it takes constant time when the right child has no left child. The time to perform the copy and delete the successor is constant.
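
Putting the three cases together, a delete could be sketched as below. For brevity this version is recursive and does not maintain parent pointers, which differs from the node layout in the slides; the names `insert` and `erase` are illustrative, and only the key is copied in case #3 because the sketch stores no other data.

```cpp
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Helper used to build example trees; duplicate keys are ignored.
Node* insert(Node* root, int k) {
    if (root == nullptr) return new Node(k);
    if (k < root->key)      root->left = insert(root->left, k);
    else if (k > root->key) root->right = insert(root->right, k);
    return root;
}

// Removes key k if present and returns the new root of the subtree.
Node* erase(Node* root, int k) {
    if (root == nullptr) return nullptr;            // key not found
    if (k < root->key) { root->left = erase(root->left, k);   return root; }
    if (k > root->key) { root->right = erase(root->right, k); return root; }

    // Cases #1 and #2: at most one child, so splice the node out.
    if (root->left == nullptr || root->right == nullptr) {
        Node* child = (root->left != nullptr) ? root->left : root->right;
        delete root;
        return child;                               // may be null (leaf case)
    }

    // Case #3: copy the successor's key here, then delete the successor,
    // which is the leftmost node of the right subtree.
    Node* succ = root->right;
    while (succ->left != nullptr) succ = succ->left;
    root->key = succ->key;
    root->right = erase(root->right, succ->key);
    return root;
}
```

The second call to `erase` in case #3 always ends in case #1 or #2, because the successor has no left child.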

Binary Search Trees
Traversing a tree = visiting its nodes. There are three major ways to traverse a binary tree. Preorder: visit root, visit left subtree, visit right subtree. Inorder: visit left subtree, visit root, visit right subtree. When applied on a BST, inorder visits the nodes in order from smaller to larger. Postorder: visit left subtree, visit right subtree, visit root.
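
The three orders differ only in where the visit of the root sits relative to the two recursive calls. The sketch below collects keys into a `std::vector<int>` rather than printing them, which is a choice made here so the results are easy to compare; the function names are illustrative.

```cpp
#include <vector>

struct Node {
    int key;
    Node* left;
    Node* right;
};

void preorder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    out.push_back(n->key);       // root first
    preorder(n->left, out);
    preorder(n->right, out);
}

void inorder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    inorder(n->left, out);
    out.push_back(n->key);       // root between the subtrees
    inorder(n->right, out);
}

void postorder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    postorder(n->left, out);
    postorder(n->right, out);
    out.push_back(n->key);       // root last
}
```

For a BST with root 4, left child 2 (with children 1 and 3) and right child 6, preorder yields 4 2 1 3 6, inorder yields 1 2 3 4 6, and postorder yields 1 3 2 6 4.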

Binary Search Trees

    // Assumes <iostream> is included and cout refers to std::cout.
    void print_inorder(Node *subroot) {
        if (subroot != NULL) {
            print_inorder(subroot->left);   // smaller keys first
            cout << subroot->data;
            print_inorder(subroot->right);  // then larger keys
        }
    }

How long does this take? There is exactly one call to print_inorder() for each node of the tree, plus one call for each NULL child pointer, of which there are n + 1. There are n nodes, so the running time of this operation is $\Theta(n)$.

Binary Search Trees
A tree may also be traversed one "level" at a time (top to bottom, left to right). This is usually called a level-order traversal. It requires the use of a temporary queue:

    enqueue root
    while (queue is not empty) {
        get the front element, f
        print f
        enqueue f's children
        dequeue
    }
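
The queue-based loop can be written with `std::queue` as in the sketch below. Returning the keys in a vector instead of printing them, and dequeuing right after reading the front element, are small liberties taken here; the visiting order is the same.

```cpp
#include <queue>
#include <vector>

struct Node {
    int key;
    Node* left;
    Node* right;
};

// Returns the keys level by level, left to right within each level.
std::vector<int> level_order(const Node* root) {
    std::vector<int> out;
    std::queue<const Node*> q;
    if (root != nullptr) q.push(root);         // enqueue root
    while (!q.empty()) {
        const Node* f = q.front();             // get the front element
        q.pop();                               // dequeue
        out.push_back(f->key);                 // "print" f
        if (f->left != nullptr)  q.push(f->left);   // enqueue f's children
        if (f->right != nullptr) q.push(f->right);
    }
    return out;
}
```

Every node is enqueued and dequeued exactly once, so the traversal is linear in the number of nodes.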

Binary Search Trees
[Figure: example BST with keys 12, 4, 14, 2, 8, 16, 6, 10] Exercise: list the keys of this tree in in-order, pre-order, post-order and level-order.

Binary Search Trees
Idea for sorting algorithm: Given a sequence of integers, insert each one in a BST. Perform an inorder traversal. The elements will be accessed in sorted order. Running time: In the worst case, the tree will degenerate to a list. Creation will take quadratic time and traversal will be linear. Total: $O(n^2)$. On average, the tree will be mostly balanced. Creation will take $O(n \log n)$ and traversal will again be linear. Total: $O(n \log n)$.
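
A compact sketch of this sorting idea. Duplicates are handled with a per-node counter, which is an assumption of this sketch, and all names (`insert`, `inorder`, `destroy`, `tree_sort`) are illustrative.

```cpp
#include <vector>

struct Node {
    int key;
    int count = 1;               // how many times this key was inserted
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

Node* insert(Node* root, int k) {
    if (root == nullptr) return new Node(k);
    if (k < root->key)      root->left = insert(root->left, k);
    else if (k > root->key) root->right = insert(root->right, k);
    else                    ++root->count;        // duplicate key
    return root;
}

// Inorder traversal appends the keys in ascending order.
void inorder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    inorder(n->left, out);
    for (int i = 0; i < n->count; ++i) out.push_back(n->key);
    inorder(n->right, out);
}

void destroy(Node* n) {
    if (n == nullptr) return;
    destroy(n->left);
    destroy(n->right);
    delete n;
}

// Sorts by building a BST and reading it back inorder.
std::vector<int> tree_sort(const std::vector<int>& input) {
    Node* root = nullptr;
    for (int k : input) root = insert(root, k);   // creation
    std::vector<int> out;
    out.reserve(input.size());
    inorder(root, out);                           // linear traversal
    destroy(root);
    return out;
}
```

Feeding this function an already sorted input produces the degenerate chain, which is exactly the quadratic worst case described above.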

BSTs vs. Lists
Time: In the worst case, all dictionary operations are linear. On average, BSTs are expected to do better. Space: BSTs store an additional pointer per node. The BST seemed like a good idea, but in the end it doesn't offer much improvement. We must find a way to keep the tree balanced and guarantee logarithmic height.

Balanced Trees
There are several ways to define balance. Examples: force the subtrees of each node to have almost equal heights; place upper and lower bounds on the heights of the subtrees of each node; force the subtrees of each node to have similar sizes (= number of nodes).
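
As one concrete reading of the first definition, the sketch below checks whether the two subtrees of every node differ in height by at most 1. The threshold of 1 is an assumption chosen for illustration (it is the condition AVL trees use), and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cstdlib>

struct Node {
    int key;
    Node* left;
    Node* right;
};

// Returns the height of the subtree in edges (-1 if empty),
// or -2 if some node in it violates the balance condition.
int checked_height(const Node* n) {
    if (n == nullptr) return -1;
    int lh = checked_height(n->left);
    if (lh == -2) return -2;
    int rh = checked_height(n->right);
    if (rh == -2) return -2;
    if (std::abs(lh - rh) > 1) return -2;   // subtree heights too far apart
    return 1 + std::max(lh, rh);
}

bool is_height_balanced(const Node* root) {
    return checked_height(root) != -2;
}
```

A three-node chain fails this check, while the same three keys arranged with the middle key at the root pass it.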
