CSCE 3110 Data Structures & Algorithm Analysis

Presentation transcript:

CSCE 3110 Data Structures & Algorithm Analysis Rada Mihalcea http://www.cs.unt.edu/~rada/CSCE3110 Search Trees Reading: Chap. 4, Weiss

Sorting with BST Use binary search trees for sorting: start with an unsorted sequence, insert all elements into a BST, then traverse the tree... how? Running time?
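
One possible answer to the questions on this slide, as a minimal sketch: an in-order traversal visits the keys in sorted order. The names `Node`, `bst_insert`, `inorder`, and `tree_sort` are illustrative assumptions, not part of the course material. With a plain (unbalanced) BST the running time is O(n log n) on average but O(n^2) in the worst case, for example on already sorted input.

```python
class Node:
    """A plain BST node (hypothetical helper for this sketch)."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(root, key):
    """Insert key into the unbalanced BST rooted at root and return the root."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        else:  # duplicates go to the right
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right


def inorder(root):
    """Yield the keys in sorted order (iterative in-order traversal)."""
    stack, node = [], root
    while stack or node is not None:
        while node is not None:
            stack.append(node)
            node = node.left
        node = stack.pop()
        yield node.key
        node = node.right


def tree_sort(seq):
    """Sort seq by inserting every element into a BST and traversing it."""
    root = None
    for key in seq:
        root = bst_insert(root, key)
    return list(inorder(root))
```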

Better Binary Search Trees Prevent the degeneration of the BST: a BST can be set up to maintain balance during updating operations (insertions and removals). Types of search trees which maintain good performance: splay trees, AVL trees, Red-Black trees, B-trees.

AVL Trees Balanced binary search trees An AVL Tree is a binary search tree such that for every internal node v of T, the heights of the children of v can differ by at most 1.
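
The height condition in this definition can be checked with one post-order pass. The following is a sketch under stated assumptions: `Node`, `check_avl`, and `is_avl` are hypothetical names, an empty subtree is given height 0, and the function checks only the height property, not the BST ordering of the keys.

```python
class Node:
    """A binary tree node (hypothetical helper for this sketch)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def check_avl(node):
    """Return the height of the subtree if every node in it satisfies the
    AVL height property, or -1 as soon as a violation is found."""
    if node is None:
        return 0
    left_height = check_avl(node.left)
    if left_height < 0:
        return -1
    right_height = check_avl(node.right)
    if right_height < 0:
        return -1
    if abs(left_height - right_height) > 1:
        return -1
    return 1 + max(left_height, right_height)


def is_avl(root):
    """True if the heights of the children of every node differ by at most 1."""
    return check_avl(root) >= 0
```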

Height of an AVL Tree Proposition: The height of an AVL tree T storing n keys is O(log n). Justification: The easiest way to approach this problem is to find n(h), the minimum number of internal nodes of an AVL tree of height h. We have $n(1) = 1$ and $n(2) = 2$. For $h \ge 3$, an AVL tree of height h with the minimum number of nodes contains the root node, one AVL subtree of height $h-1$, and another AVL subtree of height $h-2$, so $n(h) = 1 + n(h-1) + n(h-2)$. Given $n(h-1) > n(h-2)$, it follows that $n(h) > 2n(h-2)$. Applying this repeatedly: $n(h) > 2n(h-2) > 4n(h-4) > \dots > 2^i n(h-2i)$. Picking $i = h/2 - 1$ gives $n(h) > 2^{h/2-1}$, and taking logarithms, $h < 2\log n(h) + 2$. So the height of an AVL tree is O(log n).
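
As a numerical sanity check on this justification, the recurrence for n(h) can be evaluated directly and compared with the bound. This is only a sketch; `min_nodes` is a hypothetical name, and the base cases n(1) = 1 and n(2) = 2 are taken from the slide.

```python
import math


def min_nodes(h):
    """Minimum number of internal nodes n(h) of an AVL tree of height h >= 1,
    using n(1) = 1, n(2) = 2 and n(h) = 1 + n(h-1) + n(h-2)."""
    if h < 1:
        raise ValueError("height must be at least 1")
    prev, cur = 1, 2  # n(1), n(2)
    if h == 1:
        return prev
    for _ in range(h - 2):
        prev, cur = cur, 1 + prev + cur
    return cur


def bound_holds(h):
    """Check n(h) > 2^(h/2 - 1) and h < 2 log2 n(h) + 2 for one height."""
    n = min_nodes(h)
    return n > 2 ** (h / 2 - 1) and h < 2 * math.log2(n) + 2
```

The sequence grows like the Fibonacci numbers, which is why the height stays logarithmic in the number of keys.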

Insertion A binary search tree T is called balanced if, for every node v, the heights of v's children differ by at most one. Inserting a node into an AVL tree involves performing an expandExternal(w) on T, which changes the heights of some of the nodes in T. If an insertion causes T to become unbalanced, we travel up the tree from the newly created node until we find the first node x whose grandparent z is unbalanced. Since z became unbalanced by an insertion in the subtree rooted at its child y, height(y) = height(sibling(y)) + 2. We need to rebalance...

Insertion: Rebalancing To rebalance the subtree rooted at z, we must perform a restructuring. We rename x, y, and z to a, b, and c based on the order of the nodes in an in-order traversal. z is replaced by b, whose children are now a and c; their children, in turn, are the four other subtrees that were formerly children of x, y, and z.

Insertion (cont'd) [Figure: the tree before restructuring (unbalanced) and after restructuring (balanced)]

Restructuring The four ways to rotate nodes in an AVL tree, graphically represented -Single Rotations:

Restructuring (cont’d) double rotations:

Restructure Algorithm Algorithm restructure(x):
Input: A node x of a binary search tree T that has both a parent y and a grandparent z.
Output: Tree T restructured by a rotation (either single or double) involving nodes x, y, and z.
1: Let (a, b, c) be an inorder listing of the nodes x, y, and z, and let (T0, T1, T2, T3) be an inorder listing of the four subtrees of x, y, and z not rooted at x, y, or z.
2: Replace the subtree rooted at z with a new subtree rooted at b.
3: Let a be the left child of b and let T0, T1 be the left and right subtrees of a, respectively.
4: Let c be the right child of b and let T2, T3 be the left and right subtrees of c, respectively.
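
The steps of restructure(x) can be sketched as follows. This is an illustration under assumptions, not the course's reference code: the `Node` class has no parent pointers, so the hypothetical function `restructure(z, y, x)` takes all three nodes and returns b, leaving it to the caller to link b where z used to be. The four branches correspond to the two single and two double rotations.

```python
class Node:
    """A binary tree node (hypothetical helper for this sketch)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def restructure(z, y, x):
    """Trinode restructuring: y must be a child of z and x a child of y.
    Returns b, the root of the restructured subtree."""
    # Step 1: inorder listing (a, b, c) of the nodes and (t0..t3) of the subtrees.
    if y is z.left and x is y.left:        # single rotation
        a, b, c = x, y, z
        t0, t1, t2, t3 = x.left, x.right, y.right, z.right
    elif y is z.left and x is y.right:     # double rotation
        a, b, c = y, x, z
        t0, t1, t2, t3 = y.left, x.left, x.right, z.right
    elif y is z.right and x is y.right:    # single rotation
        a, b, c = z, y, x
        t0, t1, t2, t3 = z.left, y.left, x.left, x.right
    elif y is z.right and x is y.left:     # double rotation
        a, b, c = z, x, y
        t0, t1, t2, t3 = z.left, x.left, x.right, y.right
    else:
        raise ValueError("y must be a child of z and x a child of y")
    # Steps 3 and 4: hang the four subtrees below a and c, then a and c below b.
    a.left, a.right = t0, t1
    c.left, c.right = t2, t3
    b.left, b.right = a, c
    # Step 2 is completed by the caller, who puts b where z was.
    return b
```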

Cut/Link Restructure Algorithm Let’s go into a little more detail on this algorithm... Any tree that needs to be balanced can be grouped into 7 parts: x, y, z, and the 4 trees anchored at the children of those nodes (T0-3)

Cut/Link Restructure Algorithm Make a new tree which is balanced and put the 7 parts from the old tree into the new tree so that the numbering is still correct when we do an in-order-traversal of the new tree. This works regardless of how the tree is originally unbalanced. Let’s see how it works!

Cut/Link Restructure Algorithm Number the 7 parts by doing an in-order-traversal. (note that x,y, and z are now renamed based upon their order within the traversal)

Cut/Link Restructure Algorithm Now create an array, numbered 1 to 7 (the 0th element can be ignored with minimal waste of space). Cut() the 4 T trees and place them in their inorder rank in the array. [Figure: the 7-slot array, empty and then with the four subtrees placed]

Cut/Link Restructure Algorithm Now cut x, y, and z in that order (child, parent, grandparent) and place them in their inorder rank in the array. [Figure: the 7-slot array with all seven parts placed] Now we can re-link these subtrees to the main tree. Link in rank 4 (b) where the subtree's root formerly was.

Cut/Link Restructure Algorithm Link in ranks 2 (a) and 6 (c) as 4’s children.

Cut/Link Restructure Algorithm Finally, link in ranks 1,3,5, and 7 as the children of 2 and 6. Now you have a balanced tree!
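
The cut/link idea can be sketched with a 7-slot array as below. This is a simplified illustration under assumptions: `Node` and `cut_link_restructure` are hypothetical names, and the array is filled in a single in-order pass over x, y, and z rather than by cutting the subtrees first and the nodes afterwards as the slides describe. The result is the same, and no case analysis is needed.

```python
class Node:
    """A binary tree node (hypothetical helper for this sketch)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def cut_link_restructure(z, y, x):
    """Rebalance the subtree rooted at z and return its new root (rank 4)."""
    parts = [None] * 8   # slots 1..7; slot 0 is ignored
    rank = 1

    def place(node):
        """In-order walk that treats everything below x, y, z as one part."""
        nonlocal rank
        if node is x or node is y or node is z:
            place(node.left)
            parts[rank] = node
            rank += 1
            place(node.right)
        else:
            parts[rank] = node   # one of the 4 subtrees (possibly empty)
            rank += 1

    place(z)
    a, b, c = parts[2], parts[4], parts[6]
    # Link ranks 2 and 6 as the children of rank 4 ...
    b.left, b.right = a, c
    # ... and ranks 1, 3, 5, 7 as the children of ranks 2 and 6.
    a.left, a.right = parts[1], parts[3]
    c.left, c.right = parts[5], parts[7]
    return b
```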

Cut/Link Restructure Algorithm This algorithm for restructuring has the exact same effect as using the four rotation cases discussed earlier. Advantages: no case analysis, more elegant. Disadvantage: can be more code to write. Same time complexity.

Removal We can easily see that performing a removeAboveExternal(w) can cause T to become unbalanced. Let z be the first unbalanced node encountered while traveling up the tree from w. Also, let y be the child of z with the larger height, and let x be the child of y with the larger height. We can perform operation restructure(x) to restore balance at the subtree rooted at z. As this restructuring may upset the balance of another node higher in the tree, we must continue checking for balance until the root of T is reached

Removal (cont'd) Example of deletion from an AVL tree: [Figure: an AVL tree on the keys 17, 32, 44, 48, 50, 54, 62, 78, 88 before and after the removal, with nodes x, y, z and the four subtrees labeled]

Removal (cont'd) Example of deletion from an AVL tree: [Figure: the same example continued, with nodes x, y, z labeled]

Multi-way Search Trees Each internal node of a multi-way search tree T: has at least two children; stores a collection of items of the form (k, x), where k is a key and x is an element; contains d - 1 items, where d is the number of children (such a node is a d-node); "contains" 2 pseudo-items: $k_0 = -\infty$, $k_d = +\infty$. Children of each internal node are "between" items: all keys in the subtree rooted at the child fall between the keys of those items.

Multi-way Searching Similar to binary searching. If the search key $s < k_1$, search the leftmost child. If $s > k_{d-1}$, search the rightmost child. That's it in a binary tree; what about if d > 2? Find two keys $k_{i-1}$ and $k_i$ between which s falls, and search the child $v_i$. What would an in-order traversal look like?
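
A minimal sketch of this search, assuming each node keeps its keys in a sorted Python list and its children in a parallel list. `MNode` and `multiway_search` are hypothetical names, and `bisect_left` from the standard library finds the position between two keys where s falls.

```python
from bisect import bisect_left


class MNode:
    """A multi-way search tree node: d - 1 sorted keys and d children
    (an empty child list marks a bottom node in this sketch)."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def multiway_search(node, s):
    """Return (node, index) for the node holding key s, or None if absent."""
    while node is not None:
        i = bisect_left(node.keys, s)      # first position with keys[i] >= s
        if i < len(node.keys) and node.keys[i] == s:
            return node, i
        if not node.children:
            return None                    # reached the bottom without a match
        node = node.children[i]            # s falls between keys[i-1] and keys[i]
    return None
```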

2-4 Trees a. Nodes may contain 1, 2 or 3 items. b. A node with k items has k + 1 children c. All leaves are on same level.

Example [Figure: a 2-4 tree with root (10, 45) and children (3, 8), (25, 38), and (70, 90, 100)]

Insertion Insertion: Find the appropriate leaf. If the leaf has room (fewer than three items), just add the item to the leaf. If there is no room, move a middle item to the parent and split the remaining items between two children.

Insertion Insert 80. [Figure: root (10, 45) with children (3, 8), (25, 38), and (70, 80, 90, 100)] Overflow!

Insertion Split and move the middle element to the parent. [Figure: after the split, the root is (10, 45, 80) with children (3, 8), (25, 38), (70), and (90, 100)]
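
The split step for the insert-80 example can be sketched on plain sorted lists. `split_overflow` is a hypothetical helper, and moving up the second of the four items is an assumption that matches this example; other presentations move up the third item instead.

```python
from bisect import insort


def split_overflow(items):
    """Split an overflowing node of 4 sorted items into (left, middle, right),
    where middle is the item that moves up to the parent."""
    if len(items) != 4:
        raise ValueError("a 2-4 node overflows when it holds exactly 4 items")
    return items[:1], items[1], items[2:]


# The example from the slides: the leaf (70, 90, 100) receives 80 and overflows.
parent = [10, 45]
leaf = [70, 90, 100]
insort(leaf, 80)                       # leaf now holds four items
left, middle, right = split_overflow(leaf)
insort(parent, middle)                 # the middle item moves to the parent
```

If the parent itself now holds four items, the same split is applied one level up.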

Removal First, find the key with a simple multi-way search. If the item to delete has non-external children, reduce to the case where the item is at the bottom of the tree: find the item which precedes it in an in-order traversal (which one?), swap them, and remove the item. Alternative?

Removal Not enough items in the node: underflow! Pull an item from the parent and replace it with an item from a sibling (transfer). Still not good enough! What happens if the siblings are 2-nodes? Could we just pull one item from the parent?

Removal Remove 3: move 10 into the subtree and move 25 into the parent. [Figure: the 2-4 tree before and after the transfer]

Removal If the siblings are 2-nodes (i.e. contain only one key), we cannot 'steal' from them, so do node merging. Remove 3: move 10 into the subtree and merge 10 with 25. [Figure: the 2-4 tree before and after the merge]

2-4 Trees More on removal: What if the parent is a 2-node? Propagate the underflow up the tree. 2-4 trees are easy to maintain: insertion and deletion take O(log n). They are balanced trees.

B-Trees Up to now, all data that has been stored in the tree has been in memory. If data gets too big for main memory, what do we do? If we keep a pointer to the tree in main memory, we could bring in just the nodes that we need. For instance, to do an insert with a BST, if we need the left child, we do a disk access and retrieve the left child. If the left child is NIL, then we can do the insert, and store the child node on the disk. Not too good for a BST

B-Trees The problem with BST: storing the data requires disk accesses, which are expensive compared to the execution of machine instructions. If we can reduce the number of disk accesses, then the procedures run faster. The only way to reduce the number of disk accesses is to increase the number of keys in a node. The BST allows only one key per node. B-trees are very good and often used for search engines (when the collection size gets very big, the index does not fit in memory)!

B-Trees If we increase the number of keys in the nodes, how will we do any tree operations effectively? [Figure: a single node holding the keys 10, 20, 30, 40, 50, 60, 70] Above is a node with 7 keys. How do we add children? [Figure: the same node with the keys 1 to 9, 11, 22, 33, 44, 55, 66, 77 attached below it]

B-Trees Clearly, the tree below is useless. [Figure: the node with keys 10 to 70 and the keys 1 to 9, 11, 22, 33, 44, 55, 66, 77 attached below it] How many pointers do we need? Using the idea of BST, we need to be able to put nodes into the tree that have smaller, same and larger values than the node we are currently examining.

B-Trees: A General Case of Multi-Way Search Trees [Figure: a multi-way search tree on the keys 10 to 70, 1 to 9, 11, 22, 33, 44, 55, 66, 77] We can easily find any value. We need to create operations, which require rules on what makes a tree a B-Tree. Clearly, having one key per node would be very bad. We need a mechanism to increase the height of the tree (since the number of keys in any node can get very high) so we can shift keys out of a node, making the nodes smaller.

B-Trees: Fields in a Node A B-Tree is a rooted tree (whose root is root[T]) having the following properties: 1. Every internal node x has the following fields: [Field layout: n[x], $key_1[x]$ to $key_8[x]$, leaf[x], and the child pointers $c_1[x]$ to $c_9[x]$] n[x] is the number of keys in the node; n[x] = 8 above. leaf[x] = false for internal nodes, since x is not a leaf. The $key_i[x]$ are the values of the keys, where $key_i[x] \le key_{i+1}[x]$. The $c_i[x]$ are pointers to child nodes. All the keys in the subtree $c_i[x]$ have values that are between $key_{i-1}[x]$ and $key_i[x]$.

B-Trees Leaf nodes have no child pointers; leaf[x] = true for leaf nodes. All leaf nodes are at the same level. [Field layout of a leaf node: n[x], $key_1[x]$ to $key_8[x]$, leaf[x]]

B-Trees There are lower and upper bounds on the number of keys a node can contain. This depends on the "minimum degree" $t \ge 2$, which we must specify for any given B-Tree. a. Every node other than the root must have at least t-1 keys. Every internal node other than the root thus has at least t children. If the tree is nonempty, the root must have at least one key. b. Every node can contain at most 2t-1 keys. Therefore, an internal node can have at most 2t children. A node is full if it contains exactly 2t-1 keys.

Height of B-Tree If $n \ge 1$, then for any n-key B-tree of height h and minimum degree $t \ge 2$, $h \le \log_t \frac{n+1}{2}$. The important thing to notice is that the logarithm is base t. So, as t increases, the height, for any given number of keys n, will decrease. Using the formula $\log_a x = \log_b x / \log_b a$, we can see that $\log_2 10^6 = \log_{10} 10^6 / \log_{10} 2 \approx 6 / 0.30103 \approx 19.9$, while $\log_{10} 10^6 = 6$. So, roughly 14 fewer disk accesses to get to the leaves!
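
The comparison on this slide can be reproduced with the standard `math` module. This is a small sketch; `btree_height_bound` is a hypothetical name for the bound $\log_t((n+1)/2)$ stated above.

```python
import math


def btree_height_bound(n, t):
    """Upper bound log_t((n + 1) / 2) on the height of an n-key B-tree
    with minimum degree t."""
    if n < 1 or t < 2:
        raise ValueError("need n >= 1 and t >= 2")
    return math.log((n + 1) / 2, t)


# Base 2 versus base 10 for one million keys, as in the slide.
levels_base_2 = math.log2(10**6)
levels_base_10 = math.log10(10**6)
saved = levels_base_2 - levels_base_10
```

Raising the minimum degree t shrinks the bound further, which is the reason disk-based trees use nodes with many keys.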

Basic Operations The root of the B-tree is always in main memory, so that a Disk-Read on the root is never required; a Disk-Write of the root is required, however, whenever the root node is changed. Any nodes that are passed as parameters must already have had a Disk-Read operation performed on them.

Searching a B-Tree Start at the leftmost key in the node, and go to the right until you find the key or go too far. If the key is not in the node and it is a leaf node, then you are done, as there is no child to inspect. Otherwise, retrieve the child node from the disk, and put it into memory.
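
A sketch of this search procedure, under the assumption that all nodes are already in memory. `BTreeNode`, `btree_search`, and `disk_read` are hypothetical names, and `disk_read` is only a placeholder marking where a real implementation would fetch the child from disk.

```python
class BTreeNode:
    """A B-tree node with sorted keys; a node without children is a leaf."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

    @property
    def leaf(self):
        return not self.children


def disk_read(node):
    """Placeholder for Disk-Read: in this sketch the node is already in memory."""
    return node


def btree_search(x, k):
    """Return (node, index) for the node holding key k, or None if absent."""
    i = 0
    # Start at the leftmost key and move right until we go too far.
    while i < len(x.keys) and k > x.keys[i]:
        i += 1
    if i < len(x.keys) and k == x.keys[i]:
        return x, i
    if x.leaf:
        return None                      # no child to inspect
    return btree_search(disk_read(x.children[i]), k)
```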

Inserting into B-trees Really very easy, and very similar to (2,4) trees. Just keep in mind that you start at the root, find the subtree where the key should be inserted, and follow the pointer. A split may eventually occur, and a split forces a key into the parent. So, if we encounter a full node on our way to the node where the insertion will take place, we must split that node into two.

Inserting into B-trees (cont'd) B-Tree-Insert If the node has 2t-1 keys, it can't accept any more keys, so you need to split it into 2 nodes before doing the insert. Otherwise, call Nonfull().
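
The insertion procedure can be sketched as below, in the style of the usual textbook presentation: full nodes are split on the way down, so a parent always has room for the key a split pushes up. All names (`BTree`, `_split_child`, `_insert_nonfull`, `keys_in_order`) are assumptions for illustration, with `_insert_nonfull` playing the role of Nonfull(); disk reads and writes are omitted.

```python
from bisect import bisect_right, insort


class BTreeNode:
    def __init__(self, leaf=True):
        self.keys = []
        self.children = []
        self.leaf = leaf


class BTree:
    def __init__(self, t):
        self.t = t                      # minimum degree, t >= 2
        self.root = BTreeNode()

    def _split_child(self, parent, i):
        """Split the full child parent.children[i] (2t-1 keys) into two nodes
        with t-1 keys each and move its median key up into parent."""
        t = self.t
        full = parent.children[i]
        right = BTreeNode(leaf=full.leaf)
        median = full.keys[t - 1]
        right.keys = full.keys[t:]
        full.keys = full.keys[:t - 1]
        if not full.leaf:
            right.children = full.children[t:]
            full.children = full.children[:t]
        parent.keys.insert(i, median)
        parent.children.insert(i + 1, right)

    def _insert_nonfull(self, x, k):
        """Insert k below x, which is known not to be full."""
        while not x.leaf:
            i = bisect_right(x.keys, k)
            if len(x.children[i].keys) == 2 * self.t - 1:
                self._split_child(x, i)
                if k > x.keys[i]:       # the median moved up; pick the right half
                    i += 1
            x = x.children[i]
        insort(x.keys, k)

    def insert(self, k):
        if len(self.root.keys) == 2 * self.t - 1:
            new_root = BTreeNode(leaf=False)   # the tree grows in height here
            new_root.children.append(self.root)
            self._split_child(new_root, 0)
            self.root = new_root
        self._insert_nonfull(self.root, k)


def keys_in_order(x):
    """In-order listing of all keys below x (used to check the result)."""
    if x.leaf:
        return list(x.keys)
    out = []
    for i, k in enumerate(x.keys):
        out += keys_in_order(x.children[i]) + [k]
    return out + keys_in_order(x.children[-1])
```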

Deleting Keys from Nodes
