@ Zhigang Zhu, 2002-2015 1 CSC212 Data Structure - Section FG Lecture 17 B-Trees and the Set Class Instructor: Zhigang Zhu Department of Computer Science.

Slides:



Advertisements
Similar presentations
Exam Review 3 Chapters 10 – 13, 15 CSC212 Section FG CS Dept, CCNY.
Advertisements

COL 106 Shweta Agrawal and Amit Kumar
Comp 122, Spring 2004 Binary Search Trees. btrees - 2 Comp 122, Spring 2004 Binary Trees  Recursive definition 1.An empty tree is a binary tree 2.A node.
B+-Trees (PART 1) What is a B+ tree? Why B+ trees? Searching a B+ tree
Binary Trees, Binary Search Trees CMPS 2133 Spring 2008.
AA Trees another alternative to AVL trees. Balanced Binary Search Trees A Binary Search Tree (BST) of N nodes is balanced if height is in O(log N) A balanced.
They’re not just binary anymore!
@ Zhigang Zhu, CSC212 Data Structure - Section FG Lecture 22 Recursive Sorting, Heapsort & STL Quicksort Instructor: Zhigang Zhu Department.
@ Zhigang Zhu, CSC212 Data Structure - Section FG Lecture 19 Searching Instructor: Zhigang Zhu Department of Computer Science City College.
Exam Review 3 Chapters 10 – 13, 15 CSC212 Section FG CS Dept, CCNY.
Trees II Kruse and Ryba Ch 10.1,10.2,10.4 and 11.3.
Binary Heaps CSE 373 Data Structures Lecture 11. 2/5/03Binary Heaps - Lecture 112 Readings Reading ›Sections
1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 19: B-Trees: Data Structures for Disk.
Btrees21 B-trees: The rest of the story. btrees22 Review of B-tree rules All nodes except root must have at least MINIMUM data entries No node may exceed.
@ Zhigang Zhu, CSC212 Data Structure - Section RS Lecture 18a Trees, Logs and Time Analysis Instructor: Zhigang Zhu Department of Computer.
CS 206 Introduction to Computer Science II 12 / 03 / 2008 Instructor: Michael Eckmann.
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Part B Part A:  Index Definition in SQL  Ordered Indices  Index Sequential.
CS 206 Introduction to Computer Science II 12 / 01 / 2008 Instructor: Michael Eckmann.
General Trees and Variants CPSC 335. General Trees and transformation to binary trees B-tree variants: B*, B+, prefix B+ 2-4, Horizontal-vertical, Red-black.
Balanced Trees. Binary Search tree with a balance condition Why? For every node in the tree, the height of its left and right subtrees must differ by.
1 BST Trees A binary search tree is a binary tree in which every node satisfies the following: the key of every node in the left subtree is.
1 Joe Meehean.  Important and common problem  Given a collection, determine whether value v is a member  Common variation given a collection of unique.
IntroductionIntroduction  Definition of B-trees  Properties  Specialization  Examples  2-3 trees  Insertion of B-tree  Remove items from B-tree.
B+ Tree What is a B+ Tree Searching Insertion Deletion.
B-Tree. B-Trees a specialized multi-way tree designed especially for use on disk In a B-tree each node may contain a large number of keys. The number.
Spring 2006 Copyright (c) All rights reserved Leonard Wesley0 B-Trees CMPE126 Data Structures.
B-trees (Balanced Trees) A B-tree is a special kind of tree, similar to a binary tree. However, It is not a binary search tree. It is not a binary tree.
TREES A tree's a tree. How many more do you need to look at? --Ronald Reagan.
CS Data Structures Chapter 15 Trees Mehmet H Gunes
INTRODUCTION TO BINARY TREES P SORTING  Review of Linear Search: –again, begin with first element and search through list until finding element,
A Binary Search Tree Binary Search Trees.
BINARY SEARCH TREE. Binary Trees A binary tree is a tree in which no node can have more than two children. In this case we can keep direct links to the.
Binary Trees, Binary Search Trees RIZWAN REHMAN CENTRE FOR COMPUTER STUDIES DIBRUGARH UNIVERSITY.
Data Structures Week 8 Further Data Structures The story so far  Saw some fundamental operations as well as advanced operations on arrays, stacks, and.
COSC 2007 Data Structures II Chapter 15 External Methods.
P p Chapter 10 has several programming projects, including a project that uses heaps. p p This presentation shows you what a heap is, and demonstrates.
CMSC 341 B- Trees D. Frey with apologies to Tom Anastasio.
Starting at Binary Trees
Binary Search Tree Traversal Methods. How are they different from Binary Trees?  In computer science, a binary tree is a tree data structure in which.
CS 206 Introduction to Computer Science II 02 / 13 / 2009 Instructor: Michael Eckmann.
CS 206 Introduction to Computer Science II 04 / 22 / 2009 Instructor: Michael Eckmann.
CSC211 Data Structures Lecture 18 Heaps and Priority Queues Instructor: Prof. Xiaoyan Li Department of Computer Science Mount Holyoke College.
1 CSC212 Data Structure - Section AB Lecture 22 Recursive Sorting, Heapsort & STL Quicksort Instructor: Edgardo Molina Department of Computer Science City.
Binary Search Trees (BSTs) 18 February Binary Search Tree (BST) An important special kind of binary tree is the BST Each node stores some information.
CSC212 Data Structure Lecture 16 Heaps and Priority Queues Instructor: George Wolberg Department of Computer Science City College of New York.
Binary Search Trees (BST)
Tree Data Structures. Heaps for searching Search in a heap? Search in a heap? Would have to look at root Would have to look at root If search item smaller.
(c) University of Washington20c-1 CSC 143 Binary Search Trees.
AA Trees.
Data Structures and Algorithms for Information Processing
CSC212 Data Structure - Section AB
BST Trees
B+ Trees What are B+ Trees used for What is a B Tree What is a B+ Tree
Binary Search Tree Chapter 10.
Lecture 22 Binary Search Trees Chapter 10 of textbook
Trees.
Revised based on textbook author’s notes.
Binary Search Trees Why this is a useful data structure. Terminology
CSC212 Data Structure - Section RS
CSC212 Data Structure - Section RS
B+ Trees What are B+ Trees used for What is a B Tree What is a B+ Tree
Balanced-Trees This presentation shows you the potential problem of unbalanced tree and show two way to fix it This lecture introduces heaps, which are.
B-Trees This presentation shows you the potential problem of unbalanced tree and show one way to fix it This lecture introduces heaps, which are used.
Balanced-Trees This presentation shows you the potential problem of unbalanced tree and show two way to fix it This lecture introduces heaps, which are.
CSC 143 Binary Search Trees.
CSC212 Data Structure - Section KL
Mark Redekopp David Kempe
B-Trees.
Trees.
B-Trees This presentation shows you the potential problem of unbalanced tree and show one way to fix it This lecture introduces heaps, which are used.
Presentation transcript:

@ Zhigang Zhu, CSC212 Data Structure - Section FG Lecture 17 B-Trees and the Set Class Instructor: Zhigang Zhu Department of Computer Science City College of New York

@ Zhigang Zhu, Topics p Why B-Tree p The problem of an unbalanced tree p The B-Tree Rules p The Set Class ADT with B-Trees p Search for an Item in a B-Tree p Insert an Item in a B-Tree (*) p Remove a Item from a B-Tree (*)

@ Zhigang Zhu, The problem of an unbalanced BST p Maximum depth of a BST with n entires: n-1 p An Example: Insert 1, 2, 3,4,5 in that order into a bag using a BST p Run BagTest!

@ Zhigang Zhu, Worst-Case Times for BSTs p Adding, deleting or searching for an entry in a BST with n entries is O(d) in the worst case, where d is the depth of the BST p Since d is no more than n-1, the operations in the worst case is (n-1). p Conclusion: the worst case time for the add, delete or search operation of a BST is O(n)

@ Zhigang Zhu, Solutions to the problem p Solution 1 p Periodically balance the search tree p Project 10.9, page 516 p Solution 2 p A particular kind of tree : B-Tree p proposed by Bayer & McCreight in 1972

@ Zhigang Zhu, The B-Tree Basics p Similar to a binary search tree (BST) p where the implementation requires the ability to compare two entries via a less-than operator (<) p But a B-tree is NOT a BST – in fact it is not even a binary tree p B-tree nodes have many (more than two) children p Another important property p each node contains more than just a single entry p Advantages: p Easy to search, and not too deep

@ Zhigang Zhu, Applications: bag and set p The Difference  two or more equal entries can occur many times in a bag, but not in a set p C++ STL: set and multiset (= bag) p The B-Tree Rules for a Set  We will look at a “ set formulation” of the B- Tree rules, but keep in mind that a “ bag formulation” is also possible

@ Zhigang Zhu, The B-Tree Rules p The entries in a B-tree node  B-tree Rule 1 : The root may have as few as one entry (or 0 entry if no children); every other node has at least MINIMUM entries  B-tree Rule 2 : The maximum number of entries in a node is 2* MINIMUM.  B-tree Rule 3 : The entries of each B-tree node are stored in a partially filled array, sorted from the smallest to the largest.

@ Zhigang Zhu, The B-Tree Rules (cont.) p The subtrees below a B-tree node  B-tree Rule 4 : The number of the subtrees below a non-leaf node with n entries is always n+1  B-tree Rule 5 : For any non-leaf node: p (a). An entry at index i is greater than all the entries in subtree number i of the node p (b) An entry at index i is less than all the entries in subtree number i+1 of the node

@ Zhigang Zhu, An Example of B-Tree 93 and 107 subtree number 0 subtree number 1 subtree number 2 [0] [1] each entry < 93 each entry  (93,107) each entry > 107 What kind traversal can print a sorted list?

@ Zhigang Zhu, The B-Tree Rules (cont.) p A B-tree is balanced  B-tree Rule 6 : Every leaf in a B-tree has the same depth p This rule ensures that a B-tree is balanced

@ Zhigang Zhu, Another Example, MINIMUM = 1 Can you verify that all 6 rules are satisfied? 2 and and

@ Zhigang Zhu, The set ADT with a B-Tree p Combine fixed size array with linked nodes p data[] p *subset[] p number of entries vary p data_count p up to 200! p number of children vary p child_count p = data_count+1? set.hset.h (p ) set.h template template class set class set { public: public: bool insert(const Item& entry); bool insert(const Item& entry); std::size_t erase(const Item& target); std::size_t erase(const Item& target); std::size_t count(const Item& target) const; std::size_t count(const Item& target) const; private: private: // MEMBER CONSTANTS // MEMBER CONSTANTS static const std::size_t MINIMUM = 200; static const std::size_t MINIMUM = 200; static const std::size_t MAXIMUM = 2 * MINIMUM; static const std::size_t MAXIMUM = 2 * MINIMUM; // MEMBER VARIABLES // MEMBER VARIABLES std::size_t data_count; std::size_t data_count; Item data[MAXIMUM+1]; // why +1? -for insert/erase Item data[MAXIMUM+1]; // why +1? -for insert/erase std::size_t child_count; std::size_t child_count; set *subset[MAXIMUM+2]; // why +2? - one more set *subset[MAXIMUM+2]; // why +2? - one more};

@ Zhigang Zhu, Invariant for the set Class p The entries of a set is stored in a B-tree, satisfying the six B-tree rules. p The number of entries in a node is stored in data_count, and the entries are stored in data[0] through data[data_count-1] p The number of subtrees of a node is stored in child_count, and the subtrees are pointed by set pointers subset[0] through subset[child_count-1]

@ Zhigang Zhu, Search for a Item in a B-Tree p Prototype: p std::size_t count(const Item& target) const; p Post-condition: p Returns the number of items equal to the target p (either 0 or 1 for a set).

@ Zhigang Zhu, Searching for an Item: count Start at the root. 1) 1)locate i so that !(data[i]<target) 2) 2)If (data[i] is target) return 1; else if (no children) return 0; else return subset[i]->count (target); 19 and 22 6 and 17 2 and search for 10: cout << count (10);

@ Zhigang Zhu, Searching for an Item: count Start at the root. 1) 1)locate i so that !(data[i]<target) 2) 2)If (data[i] is target) return 1; else if (no children) return 0; else return subset[i]->count (target); 19 and 22 6 and 17 2 and search for 10: cout << count (10); i = 1

@ Zhigang Zhu, Searching for an Item: count Start at the root. 1) 1)locate i so that !(data[i]<target) 2) 2)If (data[i] is target) return 1; else if (no children) return 0; else return subset[i]->count (target); 19 and 22 6 and 17 2 and search for 10: cout << count (10); i = 1 subset[1]

@ Zhigang Zhu, Searching for an Item: count Start at the root. 1) 1)locate i so that !(data[i]<target) 2) 2)If (data[i] is target) return 1; else if (no children) return 0; else return subset[i]->count (target); 19 and 22 6 and 17 2 and search for 10: cout << count (10); i = 0

@ Zhigang Zhu, Searching for an Item: count Start at the root. 1) 1)locate i so that !(data[i]<target) 2) 2)If (data[i] is target) return 1; else if (no children) return 0; else return subset[i]->count (target); 19 and 22 6 and 17 2 and search for 10: cout << count (10); i = 0 subset[0]

@ Zhigang Zhu, Searching for an Item: count Start at the root. 1) 1)locate i so that !(data[i]<target) 2) 2)If (data[i] is target) return 1; else if (no children) return 0; else return subset[i]->count (target); 19 and 22 6 and 17 2 and search for 10: cout << count (10); i = 0 data[i] is target !

@ Zhigang Zhu, Insert a Item into a B-Tree p Prototype: p bool insert(const Item& entry); p Post-condition: p If an equal entry was already in the set, the set is unchanged and the return value is false. p Otherwise, entry was added to the set and the return value is true.

@ Zhigang Zhu, Insert an Item in a B-Tree Start at the root. 1) 1)locate i so that !(data[i]<entry) 2) 2)If (data[i] is entry) return false; // no work! else if (no children) insert entry at i; return true; else return subset[i]->insert (entry); 19 and 22 6 and 17 2 and insert (11); i = 0 data[i] is target ! i = 0 i = 1

@ Zhigang Zhu, Insert an Item in a B-Tree Start at the root. 1) 1)locate i so that !(data[i]<entry) 2) 2)If (data[i] is entry) return false; // no work! else if (no children) insert entry at i; return true; else return subset[i]->insert (entry); 19 and 22 6 and 17 2 and insert (11); // MIN = 1 -> MAX = 2 i = 1 data[0] < entry ! i = 0 i = 1

@ Zhigang Zhu, Insert an Item in a B-Tree Start at the root. 1) 1)locate i so that !(data[i]<entry) 2) 2)If (data[i] is entry) return false; // no work! else if (no children) insert entry at i; return true; else return subset[i]->insert (entry); insert (11); // MIN = 1 -> MAX = 2 19 and 22 6 and 17 2 and & i = 1 put entry in data[1] i = 0 i = 1

@ Zhigang Zhu, Insert an Item in a B-Tree Start at the root. 1) 1)locate i so that !(data[i]<entry) 2) 2)If (data[i] is entry) return false; // no work! else if (no children) insert entry at i; return true; else return subset[i]->insert (entry); insert (1); // MIN = 1 -> MAX = 2 19 and 22 6 and 17 2 and & i = 0 => put entry in data[0] i = 0

@ Zhigang Zhu, Insert an Item in a B-Tree Start at the root. 1) 1)locate i so that !(data[i]<entry) 2) 2)If (data[i] is entry) return false; // no work! else if (no children) insert entry at i; return true; else return subset[i]->insert (entry); insert (1); // MIN = 1 -> MAX = 2 a node has MAX+1 = 3 entries! 19 and 22 6 and 17 1, 2 and & i = 0

@ Zhigang Zhu, Insert an Item in a B-Tree Fix the node with MAX+1 entries ¶ ¶split the node into two from the middle ¶ ¶move the middle entry up insert (1); // MIN = 1 -> MAX = 2 a node has MAX+1 = 3 entries! 19 and 22 6 and 17 1, 2 and &

@ Zhigang Zhu, Insert an Item in a B-Tree Fix the node with MAX+1 entries ¶ ¶split the node into two from the middle ¶ ¶move the middle entry up insert (1); // MIN = 1 -> MAX = 2 Note: This shall be done recursively... the recursive function returns the middle entry to the root of the subset. 19 and 22 6 and and &

@ Zhigang Zhu, Inserting an Item into a B-Tree p What if the node already have MAXIMUM number of items?  Solution – loose insertion (p 551 – 557) p A loose insert may results in MAX +1 entries in the root of a subset p Two steps to fix the problem: p fix it – but the problem may move to the root of the set p fix the root of the set

@ Zhigang Zhu, Erasing an Item from a B-Tree p Prototype: p std::size_t erase(const Item& target); p Post-Condition: p If target was in the set, then it has been removed from the set and the return value is 1. p Otherwise the set is unchanged and the return value is zero.

@ Zhigang Zhu, Erasing an Item from a B-Tree p Similarly, after “loose erase”, the root of a subset may just have MINIMUM –1 entries p Solution: (p557 – 562) p Fix the shortage of the subset root – but this may move the problem to the root of the entire set p Fix the root of the entire set (tree)

@ Zhigang Zhu, Summary p A B-tree is a tree for sorting entries following the six rules p B-Tree is balanced - every leaf in a B-tree has the same depth p Adding, erasing and searching an item in a B-tree have worst-case time O(log n), where n is the number of entries p However the implementation of adding and erasing an item in a B-tree is not a trivial task.