Balanced Binary Search Trees

Slides:



Advertisements
Similar presentations
Chapter 13. Red-Black Trees
Advertisements

Comp 122, Spring 2004 Binary Search Trees. btrees - 2 Comp 122, Spring 2004 Binary Trees  Recursive definition 1.An empty tree is a binary tree 2.A node.
Chapter 4: Trees Part II - AVL Tree
AVL Trees COL 106 Amit Kumar Shweta Agrawal Slide Courtesy : Douglas Wilhelm Harder, MMath, UWaterloo
Jan Binary Search Trees What is a search binary tree? Inorder search of a binary search tree Find Min & Max Predecessor and successor BST insertion.
Balanced Binary Search Trees
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 10.
1 Brief review of the material so far Recursive procedures, recursive data structures –Pseudocode for algorithms Example: algorithm(s) to compute a n Example:
Red-Black Trees CIS 606 Spring Red-black trees A variation of binary search trees. Balanced: height is O(lg n), where n is the number of nodes.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 11.
CS Section 600 CS Section 002 Dr. Angela Guercio Spring 2010.
Lecture 12: Balanced Binary Search Trees Shang-Hua Teng.
Chapter 6: Transform and Conquer Trees, Red-Black Trees The Design and Analysis of Algorithms.
Analysis of Algorithms CS 477/677 Red-Black Trees Instructor: George Bebis (Chapter 14)
4a-Searching-More1 More Searching Dan Barrish-Flood.
CSE 326: Data Structures AVL Trees
1 Red-Black Trees. 2 Black-Height of the tree = 4.
1 COSC 2P03 Lecture #5 – Trees Part III, Heaps. 2 Today Take up the quiz Assignment Questions Red-Black Trees Binary Heaps Heap sort D-Heaps, Leftist.
Department of Computer Eng. & IT Amirkabir University of Technology (Tehran Polytechnic) Data Structures Lecturer: Abbas Sarraf Search.
Dynamic Set AVL, RB Trees G.Kamberova, Algorithms Dynamic Set ADT Balanced Trees Gerda Kamberova Department of Computer Science Hofstra University.
Self-Balancing Search Trees Chapter 11. Chapter 11: Self-Balancing Search Trees2 Chapter Objectives To understand the impact that balance has on the performance.
Fall 2007CS 2251 Self-Balancing Search Trees Chapter 9.
Self-Balancing Search Trees Chapter 11. Chapter Objectives  To understand the impact that balance has on the performance of binary search trees  To.
Data Structures, Spring 2004 © L. Joskowicz 1 Data Structures – LECTURE 9 Balanced trees Motivation Red-black trees –Definition, Height –Rotations, Insert,
CSC 212 Lecture 19: Splay Trees, (2,4) Trees, and Red-Black Trees.
Balanced Trees. Binary Search tree with a balance condition Why? For every node in the tree, the height of its left and right subtrees must differ by.
David Luebke 1 7/2/2015 ITCS 6114 Red-Black Trees.
Design & Analysis of Algorithms Unit 2 ADVANCED DATA STRUCTURE.
Balanced Trees Balanced trees have height O(lg n).
Splay Trees and B-Trees
CS 61B Data Structures and Programming Methodology Aug 11, 2008 David Sun.
1 Red-Black Trees. 2 Definition: A red-black tree is a binary search tree where: –Every node is either red or black. –Each NULL pointer is considered.
Red-Black Trees Lecture 10 Nawazish Naveed. Red-Black Trees (Intro) BSTs perform dynamic set operations such as SEARCH, INSERT, DELETE etc in O(h) time.
David Luebke 1 9/18/2015 CS 332: Algorithms Red-Black Trees.
Chapter 12. Binary Search Trees. Search Trees Data structures that support many dynamic-set operations. Can be used both as a dictionary and as a priority.
Balancing Binary Search Trees. Balanced Binary Search Trees A BST is perfectly balanced if, for every node, the difference between the number of nodes.
Balanced Trees (AVL and RedBlack). Binary Search Trees Optimal Behavior ▫ O(log 2 N) – perfectly balanced tree (e.g. complete tree with all levels filled)
Mudasser Naseer 1 10/20/2015 CSC 201: Design and Analysis of Algorithms Lecture # 11 Red-Black Trees.
Lecture 10 Algorithm Analysis Arne Kutzner Hanyang University / Seoul Korea.
1 Balanced Trees There are several ways to define balance Examples: –Force the subtrees of each node to have almost equal heights –Place upper and lower.
2IL50 Data Structures Fall 2015 Lecture 7: Binary Search Trees.
Search Trees Chapter   . Outline  Binary Search Trees  AVL Trees  Splay Trees.
CS 473Lecture X1 CS473-Algorithms Lecture RED-BLACK TREES (RBT)
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 9.
Lecture 11COMPSCI.220.FS.T Balancing an AVLTree Two mirror-symmetric pairs of cases to rebalance the tree if after the insertion of a new key to.
CS 61B Data Structures and Programming Methodology Aug 7, 2008 David Sun.
Red-Black Trees. Review: Binary Search Trees ● Binary Search Trees (BSTs) are an important data structure for dynamic sets ● In addition to satellite.
Red Black Trees. History The concept of a balancing tree was first invented by Adel’son-Vel’skii and Landis in They came up with the AVL tree. In.
CSE373: Data Structures & Algorithms Lecture 7: AVL Trees Linda Shapiro Winter 2015.
Analysis of Algorithms CS 477/677 Red-Black Trees Instructor: George Bebis (Chapter 14)
AVL Trees AVL (Adel`son-Vel`skii and Landis) tree = – A BST – With the property: For every node, the heights of the left and right subtrees differ at most.
CSC317 1 x y γ β α x y γ β x β What did we leave untouched? α y x β.
Balancing Binary Search Trees. Balanced Binary Search Trees A BST is perfectly balanced if, for every node, the difference between the number of nodes.
1 Red-Black Trees. 2 A Red-Black Tree with NULLs shown Black-Height of the tree = 4.
CS 332: Algorithms Red-Black Trees David Luebke /20/2018.
Red-Black Trees.
Summary of General Binary search tree
Slide Sources: CLRS “Intro. To Algorithms” book website
AVL Trees "The voyage of discovery is not in seeking new landscapes but in having new eyes. " - Marcel Proust.
Red-Black Trees Motivations
CS200: Algorithms Analysis
Slide Sources: CLRS “Intro. To Algorithms” book website
Red-Black Trees.
Lecture 9 Algorithm Analysis
Lecture 9 Algorithm Analysis
Lecture 9 Algorithm Analysis
Analysis of Algorithms CS 477/677
B-Trees.
Chapter 12&13: Binary Search Trees (BSTs)
Red-Black Trees CS302 Data Structures
Presentation transcript:

Balanced Binary Search Trees A binary search tree can implement any of the basic dynamic-set operations in O(h) time. These operations are O(lgn) if tree is “balanced”. There has been lots of research on developing algorithms to keep binary search trees balanced, including Those that insert nodes as is done in the BST insert, then rebalance tree: Red-Black trees AVL trees Splay trees Those that allow more than one key per node of the search tree: 2-3 trees 2-3-4 trees B-trees So RBT is a bst + 1 bit per node: an attribute color. All leaves are empty (nil) and colored black. The root's parent is also nil

Red-Black Trees (RBT) (Ch. 13) Red-Black tree: BST in which each node is colored red or black. Constraints on the coloring and connection of nodes ensure that no root to leaf path is more than twice as long as any other, so tree is approximately balanced. Each RBT node contains fields left, right, parent, color, and key. So RBT is a bst + 1 bit per node: an attribute color. All leaves are empty (nil) and colored black. The root's parent is also nil L E F T PARENT R I G H T KEY COLOR

Red-Black Properties Red-Black tree properties: Every node is either red or black. The root is black. Every leaf contains NIL and is black. If a node is red, then both its children are black. For each node x, all paths from x to its descendant leaves contain the same number of black nodes. A consequence of property 4 is that no two red nodes in a row exist on any root to leaf path. Do example from page 13-2. Height of a node is the number of edges in a longest path to a leaf. Black-height of a node x, bh(x), is the number of black nodes (including the nil node at height 0) on the path from x to a leaf, NOT COUNTING x. 20 22 25 21 18 17 19

Black Height bh(x) Black-height of a node x: bh(x) is the number of black nodes (including the NIL leaf) on the path from x to a leaf, not counting x itself. 2 20 Every node has a black-height, bh(x). For all NIL leaves, bh(x) = 0. For root x, bh(x) = bh(T). 2 18 1 22 1 1 17 19 1 21 1 25 A consequence of property 4 is that no two red nodes in a row exist on any root to leaf path. Do example from page 13-2. Height of a node is the number of edges in a longest path to a leaf. Black-height of a node x, bh(x), is the number of black nodes (including the nil node at height 0) on the path from x to a leaf, NOT COUNTING x. Black-heights shown in blue above. Claim: Any node with height h has black-height ≥ h/2.

Red-Black Tree Height Lemma: A red-black tree with n internal nodes has height at most 2lg(n+1). Proof: Claim 1: The subtree rooted at any node x contains at least 2bh(x) - 1 internal nodes. Proof is by induction on the height of the node x. Basis: height of x is 0 with bh(x) = 0. Then x is a leaf and its subtree contains 20-1=0 internal nodes. Inductive step: Consider a node x that is an internal node with 2 children. Each child of x has bh either equal to bh(x) (red child) or bh(x)-1 (black child). Do proof starting on page 13-2. (Page 274 of text)

Red-Black Tree Height Claim 1: The subtree rooted at any node x contains at least 2bh(x) - 1 internal nodes. We can apply the IHOP to the children of x to find that the subtree rooted at each child of x has at least 2bh(x)-1 - 1 internal nodes. Thus, the subtree rooted at x has at least 2(2bh(x)-1 - 1 ) + 1 internal nodes = 2bh(x) - 1 internal nodes. Do proof starting on page 13-2. (Page 274 of text)

Red-Black Tree Height Lemma: A red-black tree with n internal nodes has height at most 2lg(n+1). Rest of proof of lemma: Let h be the height of the tree. By property 4 of RBTs, at least 1/2 the nodes on any root to leaf path are black. Therefore, the black-height of the root must be at least h/2. Thus, by claims, n ≥ 2h/2 -1, so n+1 ≥ 2h/2, and, taking the lg of both sides, lg(n+1) ≥ h/2, which means that h ≤ 2lg(n+1). Do proof starting on page 13-2. (Page 274 of text)

Red-Black Tree Height Since a red-black tree is a BST, the dynamic-set operations for Search, Minimum, Maximum, Successor, and Predecessor for a BST can be implemented as-is on red-black trees, and since they take O(h) time on a BST, they take O(lgn) time on a red-black tree. The operations Tree-Insert and Tree-Delete can also be done in O(lgn) time on red-black trees. However, after inserting or deleting, the nodes of the tree must be moved around to ensure that the red-black properties are maintained. Do proof starting on page 13-2. (Page 274 of text)

Operations on Red-Black Trees All non-modifying bst operations (min, max, succ, pred, search) run in O(h) = O(lgn) time on red-black trees. Insertion and deletion are more complex. If we insert a node, what color do we make the new node? * If red, the node might violate property 4. * If black, the node might violate property 5. If we delete a node, what color was the node that was removed? * Red? OK, since we won't have changed any black- heights, nor will we have created 2 red nodes in a row. Also, if node removed was red, it could not have been the root. * Black? Could violate property 4, 5, or 2.

Red-Black Tree Rotations Algorithms to restore Red-Black tree property to tree after Tree-Insert and Tree-Delete include right and left rotations and re-coloring nodes. We will not cover these RBT algorithms. The number of rotations for insert and delete are constant, but they may take place at every level of the tree, so therefore the running time of insert and delete is O(lg(n)) x    y LR(T,x) RR(T,y) For Right-Rotate(T, y), we assume that the left child of y is not nil. y  x  

Operations on Red-Black Trees The red-black tree procedures are intended to preserve the lg(n) running time of dynamic set operations by keeping the height of a binary search tree low. You only need to read section 1 of Chapter 13 (unless, of course, you want to read more :o). Given a binary search tree with red and black nodes, you should be able to decide whether it follows all the rules for a red-black tree (i.e., given the keys and coloring of each node, you should be able to determine whether a given binary search tree is a red-black tree and show the black- height of each node in the tree).

AVL Trees Developed by Russians Adelson-Velsky and Landis (hence AVL). This algorithm is not covered in our text, but the operations insert/delete are easier to understand than red-black tree insert/delete. Goal: keep the height of a binary search tree low. Definition: An AVL tree is a BST in which the balance factor of every node is either 0, +1, or -1. The balance factor of node x is the difference in heights of nodes in x's left and right subtrees (the height of the right subtree minus the height of the left subtree at each node) 20 22 25 21 18 17 19

AVL Trees In the tree below, all nodes have a balance factor of 0, meaning that the subtrees at all nodes are balanced. 20 22 18 19 21 25 17

AVL Trees Inserting or deleting a node is done according to the regular BST insert or delete procedure and then the nodes are rotated (if necessary) to re-balance the tree. E.g., suppose we had the BST formed by inserting keys 20, 18, 22, 17, 19, 25. Now insert 23. Inserting or deleting a node from an AVL tree can cause some node to have a balance factor of +2 or -2, in which case the algorithm causes rotations of particular branches to restore the 0, +1 or -1 balance factor at each node. 20 22 25 23 18 17 19 -1 +2 A consequence of property 4 is that no two red nodes in a row exist on any root to leaf path. Do example from page 13-2. Height of a node is the number of edges in a longest path to a leaf. Black-height of a node x, bh(x), is the number of black nodes (including the nil node at height 0) on the path from x to a leaf, NOT COUNTING x.

Example of AVL rotations There are 4 basic types of rotation in an AVL tree: - - A consequence of property 4 is that no two red nodes in a row exist on any root to leaf path. Do example from page 13-2. Height of a node is the number of edges in a longest path to a leaf. Black-height of a node x, bh(x), is the number of black nodes (including the nil node at height 0) on the path from x to a leaf, NOT COUNTING x.

Example of AVL rotations + + A consequence of property 4 is that no two red nodes in a row exist on any root to leaf path. Do example from page 13-2. Height of a node is the number of edges in a longest path to a leaf. Black-height of a node x, bh(x), is the number of black nodes (including the nil node at height 0) on the path from x to a leaf, NOT COUNTING x.

Example of AVL rotations - + The LR rotation is a combination of 2 separate rotations: The first rotation produces a tree with an LL imbalance and then the tree is balanced using the rules for an LL imbalance.

Example of AVL rotations + - A consequence of property 4 is that no two red nodes in a row exist on any root to leaf path. Do example from page 13-2. Height of a node is the number of edges in a longest path to a leaf. Black-height of a node x, bh(x), is the number of black nodes (including the nil node at height 0) on the path from x to a leaf, NOT COUNTING x. The RL rotation is a combination of 2 separate rotations: The first rotation produces a tree with an RR imbalance and then the tree is balanced using the rules for an RR imbalance.

Each unbalanced state requires at most two rotations (or “transplants”), with the changing of a constant number of pointers. If a node is inserted into a balanced AVL tree, the tree is rebalanced by changing at most ?? pointers

AVL Trees When a node is inserted and causes an imbalance as shown on the last slide, two rotations take place to restore the balance in the tree. -1 20 22 25 23 18 17 19 +2 This is the RL case 20 22 25 23 18 17 19 Two rotations produces the balanced tree shown below. +1 20 22 25 23 18 17 19 +2 This is the RR case

AVL rotation practice -1 2 -3 -1 1 For each of these trees, show the balance factor at each node and indicate whether the tree is balanced. -1 2 -3 -1 1 A consequence of property 4 is that no two red nodes in a row exist on any root to leaf path. Do example from page 13-2. Height of a node is the number of edges in a longest path to a leaf. Black-height of a node x, bh(x), is the number of black nodes (including the nil node at height 0) on the path from x to a leaf, NOT COUNTING x.

Inserting nodes into AVL tree Insert the following nodes into an AVL tree, in the order specified. Show the balance factor at each node as you add each one. When an imbalance occurs, specify the rotations needed to restore the AVL property. Nodes = <30, 40, 35, 21, 28, 29> A consequence of property 4 is that no two red nodes in a row exist on any root to leaf path. Do example from page 13-2. Height of a node is the number of edges in a longest path to a leaf. Black-height of a node x, bh(x), is the number of black nodes (including the nil node at height 0) on the path from x to a leaf, NOT COUNTING x. Correcting the imbalance after each insert/delete ensures that all dictionary operations on an AVL tree will take ?? time in the worst-case.

Splay Trees Developed by Daniel Sleator and Robert Tarjan. Not covered in our text. Maintains a binary tree but makes no attempt to keep the height low. Uses a "Least Recently Used" (LRU) policy so that recently accessed elements are quick to access again because they are closer to the root. Basic operation is called splaying. Splaying the tree for a certain element rearranges it so that the splayed element is placed at the root of the tree. Performs a standard BST search and then does a sequence of rotations to bring the element to the root of the tree. Frequently accessed nodes will move nearer to the root where they can be accessed more quickly. These trees are particularly useful for implementing caches and garbage collection algorithms. A consequence of property 4 is that no two red nodes in a row exist on any root to leaf path. Do example from page 13-2. Height of a node is the number of edges in a longest path to a leaf. Black-height of a node x, bh(x), is the number of black nodes (including the nil node at height 0) on the path from x to a leaf, NOT COUNTING x.

Splay Trees Has amortized performance of O(lg(n)) if the same keys are searched for repeatedly. Amortized performance means that some operations can be very expensive, but those expensive operations can be averaged with the costs of many fast operations. A consequence of property 4 is that no two red nodes in a row exist on any root to leaf path. Do example from page 13-2. Height of a node is the number of edges in a longest path to a leaf. Black-height of a node x, bh(x), is the number of black nodes (including the nil node at height 0) on the path from x to a leaf, NOT COUNTING x.

Splaying There are three types of splay steps (x is the node searched for and p is x’s parent): A consequence of property 4 is that no two red nodes in a row exist on any root to leaf path. Do example from page 13-2. Height of a node is the number of edges in a longest path to a leaf. Black-height of a node x, bh(x), is the number of black nodes (including the nil node at height 0) on the path from x to a leaf, NOT COUNTING x. Zig Step: This step is done when p is the root. The tree is rotated on the edge between x and p.

Splaying A consequence of property 4 is that no two red nodes in a row exist on any root to leaf path. Do example from page 13-2. Height of a node is the number of edges in a longest path to a leaf. Black-height of a node x, bh(x), is the number of black nodes (including the nil node at height 0) on the path from x to a leaf, NOT COUNTING x. Zig-zig Step: This step is done when p is not the root and x and p are either both right children or are both left children.

Splaying Zig-Zag Step: This step is done when p is not the root and x is a right child and p is a left child or vice versa. (Looks like the LR in AVL trees, right?) A consequence of property 4 is that no two red nodes in a row exist on any root to leaf path. Do example from page 13-2. Height of a node is the number of edges in a longest path to a leaf. Black-height of a node x, bh(x), is the number of black nodes (including the nil node at height 0) on the path from x to a leaf, NOT COUNTING x. Splay trees yield O(lg(n)) running time only if the pattern of searches produces few rotations. This is a probabilistic algorithm. A very adversarial sequence of searches could result in a higher running time.

2-3 Trees Developed by John Hopcroft in 1974. This algorithm is not covered in our text. Another set of procedures to keep the height of a binary search tree low. Definition: A 2-3 tree is a tree that can have nodes of two kinds: 2-nodes and 3-nodes. A 2-node contains a single key K and has two children, exactly like any other binary search tree node. A 3-node contains two ordered keys K1 and K2 (K1 < K2) and has three children. The leftmost child is the root of a subtree with keys less than K1, the middle child is the root of a subtree with keys between K1 and K2, and the rightmost child is the root of a subtree with keys greater than K2. The last requirement of a 2-3 tree is that all its leaves must be on the same level, i.e., a 2-3 tree is always perfectly height-balanced.

2-3 Trees Search for a key k in a 2-3 tree: Start at the root. If the root is a 2-node, treat it exactly the same as a search in a regular binary search tree. Stop if k is equal to the root's key, go to the left child if k is smaller, and to to the right child if k is larger. If the root is a 3-node, stop if k is equal to either of the root's keys, go to the left child if k is less than K1, to the middle child if k is greater than K1 but less than K2, and go to the right child if k is greater than K2. Treat each new node on the search path exactly the same as the root. A consequence of property 4 is that no two red nodes in a row exist on any root to leaf path. Do example from page 13-2. Height of a node is the number of edges in a longest path to a leaf. Black-height of a node x, bh(x), is the number of black nodes (including the nil node at height 0) on the path from x to a leaf, NOT COUNTING x.

2-3 Trees Insert a key k in a 2-3 tree: (node always inserted as leaf) Start at the root. Search for k until reaching a leaf. If leaf is a 2-node, insert k in proper position in leaf, either before or after key that already exists in the leaf, making the leaf a 3-node. If leaf is a 3-node, split the leaf in two: the smallest of the 3 keys (2 old ones and 1 new one) is put in the first leaf, the largest key is put in the second leaf, and the middle key is promoted to the old leaf's parent. This may cause overload on the parent leaf and can lead to several node splits along the chain of the leaf's ancestors. The splits go up the tree and the height expands upwards.

Inserting nodes into 2-3 tree Insert the following nodes into a 2-3 tree, in the order specified. When an overload occurs, specify the changes needed to restore the 2-3 property. Nodes = <50 60 70 40 30 20 10 80 90> What is the smallest number of keys in a tree of height h? What is the largest number of keys in a tree of height h? A 2-3 tree of height h with the smallest number of keys ??? (height = ???). A 2-3 tree of height h with largest number of keys is ??? (height = ???). Therefore, all operations are ???. A 2-3 tree of height h with the smallest number of keys is a full tree of 2 nodes (height = ???). A 2-3 tree of height h with largest number of keys is a full tree of 3 nodes, each with 2 keys and 3 children (height = ???). Therefore, all operations are ???.

B-Trees Developed by Bayer and McCreight in 1972. Our text covers these trees in Chapter 18. B-trees are balanced search trees designed to work well on magnetic disks or other secondary-storage devices to minimize disk I/O operations. Internal nodes can have a variable number of child nodes within some pre-defined range, m. B-trees have substantial advantages over alternative implementations when node access times far exceed access times within nodes. You are not expected to know anything more about B-Trees than that they are a strategy for maintaining a balanced search tree and are useful for accessing secondary-storage devices. A consequence of property 4 is that no two red nodes in a row exist on any root to leaf path. Do example from page 13-2. Height of a node is the number of edges in a longest path to a leaf. Black-height of a node x, bh(x), is the number of black nodes (including the nil node at height 0) on the path from x to a leaf, NOT COUNTING x.

End of balanced bst lecture