Data Structures and Algorithms (AT70.02) Comp. Sc. and Inf. Mgmt. Asian Institute of Technology Instructor: Prof. Sumanta Guha Slide Sources: CLRS “Intro.

Presentation transcript:

Data Structures and Algorithms (AT70.02) Comp. Sc. and Inf. Mgmt. Asian Institute of Technology Instructor: Prof. Sumanta Guha Slide Sources: CLRS “Intro. To Algorithms” book website (copyright McGraw Hill) adapted and supplemented

CLRS “Intro. To Algorithms” Ch. 18: B-Trees

Disc rotation speed: 7200 RPM, typical.
Arm movement: a few ms.
Total access time for data on disc: 3-9 ms, typical.
Versus access time in RAM: < 100 ns.

B-tree objective: to optimize disc access by using the full amount of information retrieved per disc access, i.e., one page. So: size of one node of the B-tree = one page.

Definition of B-trees

A B-tree T is a rooted tree (root = root[T]) having the following properties:

1. Every node x has the following fields:
   a. n[x], the number of keys currently stored in node x.
   b. The n[x] keys themselves, stored in non-decreasing order, so that $key_1[x] \le key_2[x] \le \dots \le key_{n[x]}[x]$.
   c. leaf[x], a boolean that is TRUE if x is a leaf and FALSE otherwise.
2. Each internal (= non-leaf) node x also contains n[x]+1 pointers $c_1[x], c_2[x], \dots, c_{n[x]+1}[x]$ to its children. Leaf nodes have no such ptrs.
3. The keys $key_i[x]$ separate the ranges of keys stored in each subtree: if $k_i$ is any key stored in the subtree with root $c_i[x]$, then $k_1 \le key_1[x] \le k_2 \le key_2[x] \le \dots \le key_{n[x]}[x] \le k_{n[x]+1}$.
4. All leaves have the same depth, which is the tree’s height h.
5. A fixed integer $t \ge 2$ is called the minimum degree of the B-tree:
   a. Every node other than the root must have at least t-1 keys. So, every internal node other than the root has at least t children. If the tree is non-empty, the root must have at least one key.
   b. Every node can contain at most 2t-1 keys. Therefore, an internal node can have at most 2t children. A node is said to be full if it contains 2t-1 keys.
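The properties above can be sketched as a small in-memory node type plus a checker. This is only an illustration: the names `BTreeNode` and `check` are hypothetical, indices are 0-based, keys are assumed to be integers, the tree is assumed non-empty, and real B-tree nodes would live in disc pages rather than Python objects.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BTreeNode:
    # n[x] is len(keys); leaf[x] is derived from the absence of children.
    keys: List[int] = field(default_factory=list)
    children: List["BTreeNode"] = field(default_factory=list)

    @property
    def leaf(self) -> bool:
        return not self.children


def check(x: BTreeNode, t: int, lo: Optional[int] = None,
          hi: Optional[int] = None, is_root: bool = True) -> int:
    """Return the height of the subtree rooted at x.

    Raises AssertionError if one of properties 1-5 is violated.
    lo and hi are the separating keys inherited from the ancestors.
    """
    n = len(x.keys)
    assert n <= 2 * t - 1                       # property 5b
    assert n >= (1 if is_root else t - 1)       # property 5a (non-empty tree)
    assert x.keys == sorted(x.keys)             # property 1b
    for k in x.keys:                            # property 3
        assert lo is None or lo <= k
        assert hi is None or k <= hi
    if x.leaf:
        return 0
    assert len(x.children) == n + 1             # property 2
    bounds = [lo] + x.keys + [hi]
    heights = {check(c, t, bounds[i], bounds[i + 1], False)
               for i, c in enumerate(x.children)}
    assert len(heights) == 1                    # property 4: leaves at same depth
    return heights.pop() + 1
```

For example, a root holding the key 10 with leaf children [3, 7] and [12, 15, 20] passes the check for t = 2 and has height 1.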

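Th. 18.1: If $n \ge 1$, then for any n-key B-tree of height h and min degree $t \ge 2$, $h \le \log_t \frac{n+1}{2}$.

Proof: The root has at least one key and every other node has at least t-1 keys. So there are at least 2 nodes at depth 1, at least 2t at depth 2, at least $2t^2$ at depth 3, …, and at least $2t^{h-1}$ at depth h. Therefore,

$$n \ge 1 + (t-1)\sum_{i=1}^{h} 2t^{i-1} = 1 + 2(t-1)\,\frac{t^h - 1}{t-1} = 2t^h - 1,$$

so $t^h \le (n+1)/2$, which gives $h \le \log_t \frac{n+1}{2}$.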
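To get a feel for the bound, here is a small numeric sketch. The helper names `min_keys` and `max_height` are hypothetical; integer arithmetic is used instead of a floating-point logarithm to avoid rounding problems.

```python
def min_keys(h: int, t: int) -> int:
    """Fewest keys a B-tree of height h and minimum degree t can hold: 2*t**h - 1."""
    return 2 * t ** h - 1


def max_height(n: int, t: int) -> int:
    """Largest h allowed by Th. 18.1 for n >= 1 keys, i.e. largest h with 2*t**h - 1 <= n."""
    h = 0
    while min_keys(h + 1, t) <= n:
        h += 1
    return h


# With minimum degree t = 1001, a tree holding one billion keys has height at most 2,
# so at most two levels lie below the root.
print(max_height(10**9, 1001))   # prints 2
print(max_height(10**9, 2))      # prints 28 (a 2-3-4 tree is much taller)
```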

Searching a B-tree
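A minimal sketch of search, following the idea of B-TREE-SEARCH in CLRS: find the first key in the node that is not smaller than k, stop if it equals k, otherwise descend into the corresponding child. The `Node` class and the function name are hypothetical, indices are 0-based, and `bisect_left` (binary search within a node) is used here in place of the linear scan in the textbook. No disc reads are modelled.

```python
from bisect import bisect_left


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                    # sorted list of keys
        self.children = children or []      # empty list means the node is a leaf


def btree_search(x, k):
    """Return (node, index) such that node.keys[index] == k, or None if k is absent."""
    i = bisect_left(x.keys, k)              # first position with x.keys[i] >= k
    if i < len(x.keys) and x.keys[i] == k:
        return x, i
    if not x.children:                      # reached a leaf without finding k
        return None
    return btree_search(x.children[i], k)   # k can only be in the i-th subtree
```

Each recursive call touches one node, so the number of nodes visited is at most the height of the tree plus one.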

Creating an empty B-tree

Splitting a Full Node

Assumption: the parent is not full. (This will be justified!)

Inserting into a B-tree

We always want to insert into a leaf, to avoid having to create new children for an internal node (remember: no. of children = 1 + no. of keys).
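The one-pass insertion described here (split any full node met on the way down, then insert into a non-full leaf) can be sketched as follows. This is an in-memory illustration in the spirit of B-TREE-SPLIT-CHILD and B-TREE-INSERT from CLRS; the names `Node`, `split_child`, `insert` and `inorder` are hypothetical, indices are 0-based, and disc reads and writes are not modelled.

```python
class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []       # empty list means the node is a leaf


def split_child(x, i, t):
    """Split the full child x.children[i] around its median key; x must not be full."""
    y = x.children[i]
    z = Node(y.keys[t:], y.children[t:])     # z takes the upper t-1 keys (and t children)
    median = y.keys[t - 1]
    y.keys, y.children = y.keys[:t - 1], y.children[:t]
    x.keys.insert(i, median)                 # the median moves up into the parent
    x.children.insert(i + 1, z)


def insert(root, k, t):
    """Insert k in a single downward pass and return the (possibly new) root."""
    if len(root.keys) == 2 * t - 1:          # full root: the tree grows at the top
        root = Node([], [root])
        split_child(root, 0, t)
    x = root
    while x.children:
        i = sum(1 for key in x.keys if key <= k)   # child whose range contains k
        if len(x.children[i].keys) == 2 * t - 1:
            split_child(x, i, t)             # x is non-full, so the split is legal
            if k > x.keys[i]:
                i += 1
        x = x.children[i]
    j = sum(1 for key in x.keys if key <= k)
    x.keys.insert(j, k)                      # x is a leaf and is not full
    return root


def inorder(x):
    """List all keys of the subtree rooted at x in sorted order."""
    if not x.children:
        return list(x.keys)
    out = []
    for i, child in enumerate(x.children):
        out += inorder(child)
        if i < len(x.keys):
            out.append(x.keys[i])
    return out
```

Because every node on the path is made non-full before the key moves into it, a split never has to propagate back up the tree.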

Deletion Strategy

The deletion strategy is a mirror image of insertion. Recall that in insertion, before the new key k moves to the next node (in a downward pass starting just before the root), it checks whether that node is full; if it is full, then the node is first split to make it non-full. This guarantees that when k finally reaches the target leaf node, it can simply be inserted, because that leaf will not be full.

For deletion, as we move downward from the root searching for the key k to be deleted, we check whether the next node is minimal (= opposite of full = has t – 1 keys). If it is minimal, then the node is first made non-minimal:

If there is a non-minimal adjacent sibling, then take a key and child ptr. from that sibling. (Case 3a)

[Figure, Case 3a: $c_i[x]$ is minimal and has a non-minimal sibling. α is the key in the parent that separates $c_i[x]$ from that sibling; β is the sibling's key nearest to $c_i[x]$. Many child ptrs. not drawn to avoid clutter.]
1. Move α down from the parent of $c_i[x]$ to $c_i[x]$.
2. Move β from the non-minimal sibling of $c_i[x]$ up to the parent to take the place of α.
3. Move the child ptr. next to β over to $c_i[x]$.

Deletion Strategy, cont.

If each sibling is minimal, then the node is merged with one sibling together with the separating key from the parent. (Case 3b)

[Figure, Case 3b: $c_i[x]$ is minimal, its left sibling is minimal and its right sibling is minimal; α is the key in the parent that separates $c_i[x]$ from the chosen sibling. Many child ptrs. not drawn to avoid clutter.]
1. Copy data (keys/ptrs) from one sibling (doesn’t matter which) to $c_i[x]$.
2. Move the separating key α from the parent down to the median position in $c_i[x]$ and delete the pointer to the chosen sibling.
3. Delete the chosen sibling.
The above merges $c_i[x]$ with its sibling and α.
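Cases 3a and 3b can be sketched as one helper that makes a minimal child non-minimal before the search descends into it. The names `Node` and `make_non_minimal` are hypothetical, indices are 0-based, and the sketch works on in-memory nodes only. If the parent is a root with a single key, the merge leaves it with no keys, and the caller is assumed to make the merged child the new root.

```python
class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []       # empty list means the node is a leaf


def make_non_minimal(x, i, t):
    """Give child i of x at least t keys; return the index where that child ends up."""
    c = x.children[i]
    if len(c.keys) >= t:
        return i
    left = x.children[i - 1] if i > 0 else None
    right = x.children[i + 1] if i + 1 < len(x.children) else None
    if left is not None and len(left.keys) >= t:      # Case 3a, borrow from the left
        c.keys.insert(0, x.keys[i - 1])               # alpha moves down
        x.keys[i - 1] = left.keys.pop()               # beta moves up
        if left.children:
            c.children.insert(0, left.children.pop()) # child ptr. next to beta
        return i
    if right is not None and len(right.keys) >= t:    # Case 3a, borrow from the right
        c.keys.append(x.keys[i])
        x.keys[i] = right.keys.pop(0)
        if right.children:
            c.children.append(right.children.pop(0))
        return i
    if left is not None:                              # Case 3b, merge with a sibling
        i -= 1                                        # merge children i-1 and i
    a, b = x.children[i], x.children[i + 1]
    a.keys += [x.keys.pop(i)] + b.keys                # alpha lands in the median position
    a.children += b.children
    del x.children[i + 1]
    return i
```

With t = 2, a parent [10, 20] over children [5], [12, 15, 17], [25] borrows for the first child and becomes [12, 20] over [5, 10], [15, 17], [25].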

Deletion Strategy, cont.

When we reach the node x which actually contains the key k to be deleted, there are two possibilities:

x is a leaf: simply delete k. By the previous step x is guaranteed to be non-minimal (unless x is the root, which may hold fewer keys), so k can indeed be deleted. (Case 1)

x is an internal node: we can’t simply delete k even if x is not minimal; we have to replace k with another key to separate the child of x before k and the child after k:

If the child y that precedes k is non-minimal, then the replacement k’ of k is chosen from the subtree rooted at y. In particular, k’ is the greatest element of the subtree rooted at y. This replacement k’ is recursively deleted from its original location and put in place of k. (Case 2a)

[Figure, Case 2a: the child y preceding k is non-minimal; k’ = largest key in the subtree rooted at y.]
1. Delete k’ from its original position.
2. Replace k with k’.

Deletion Strategy, cont.

Otherwise, if the child z that follows k is non-minimal, then the replacement k’ of k is chosen from the subtree rooted at z. In particular, k’ is the smallest element of the subtree rooted at z. This replacement k’ is recursively deleted from its original location and put in place of k. (Case 2b, symmetric to 2a)

[Figure, Case 2b: the child z following k is non-minimal; k’ = smallest key in the subtree rooted at z.]
1. Delete k’ from its original position.
2. Replace k with k’.

Deletion Strategy, cont.

Otherwise, if both the child y before k and the child z after k are minimal, then merge them with k into one node, so that x loses a child, and then recursively delete k from the new non-minimal node. (Case 2c)

[Figure, Case 2c: y is minimal and z is minimal; they are merged. Many child ptrs. not drawn to avoid clutter.]
1. Copy data (keys/ptrs) from z to y.
2. Move k from the parent down to the median position in y and delete the pointer to z.
3. Delete z.
The above merges y with z and k.
4. Delete k from the new y.
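Putting the cases together, the whole deletion strategy can be sketched as a single downward pass. This is an illustrative in-memory version, not the deck's own pseudocode: the names `Node`, `merge`, `fill`, `delete`, `btree_delete` and `inorder` are hypothetical, indices are 0-based, and keys are assumed to be distinct.

```python
class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []       # empty list means the node is a leaf

def merge(x, i):
    """Merge child i, key i of x, and child i+1 into child i (Cases 2c and 3b)."""
    y, z = x.children[i], x.children.pop(i + 1)
    y.keys += [x.keys.pop(i)] + z.keys
    y.children += z.children

def fill(x, i, t):
    """Make the minimal child i of x non-minimal; return its index afterwards."""
    c = x.children[i]
    if i > 0 and len(x.children[i - 1].keys) >= t:                      # 3a, from left
        s = x.children[i - 1]
        c.keys.insert(0, x.keys[i - 1])
        x.keys[i - 1] = s.keys.pop()
        if s.children:
            c.children.insert(0, s.children.pop())
    elif i + 1 < len(x.children) and len(x.children[i + 1].keys) >= t:  # 3a, from right
        s = x.children[i + 1]
        c.keys.append(x.keys[i])
        x.keys[i] = s.keys.pop(0)
        if s.children:
            c.children.append(s.children.pop(0))
    else:                                                               # 3b, merge
        if i == len(x.children) - 1:
            i -= 1
        merge(x, i)
    return i

def delete(x, k, t):
    """Delete k below x, where x is the root or is known to be non-minimal."""
    if k in x.keys:
        i = x.keys.index(k)
        if not x.children:                          # Case 1: x is a leaf
            x.keys.pop(i)
        elif len(x.children[i].keys) >= t:          # Case 2a: use the predecessor
            y = x.children[i]
            while y.children:
                y = y.children[-1]
            pred = y.keys[-1]
            delete(x.children[i], pred, t)
            x.keys[i] = pred
        elif len(x.children[i + 1].keys) >= t:      # Case 2b: use the successor
            z = x.children[i + 1]
            while z.children:
                z = z.children[0]
            succ = z.keys[0]
            delete(x.children[i + 1], succ, t)
            x.keys[i] = succ
        else:                                       # Case 2c: merge y, k and z
            merge(x, i)
            delete(x.children[i], k, t)
    elif x.children:
        i = sum(1 for key in x.keys if key < k)     # subtree that must contain k
        if len(x.children[i].keys) < t:
            i = fill(x, i, t)                       # Case 3a or 3b
        delete(x.children[i], k, t)

def btree_delete(root, k, t):
    """Delete k and return the root, which changes if the old root ran out of keys."""
    delete(root, k, t)
    if not root.keys and root.children:
        root = root.children[0]
    return root

def inorder(x):
    """List all keys of the subtree rooted at x in sorted order."""
    if not x.children:
        return list(x.keys)
    out = []
    for i, child in enumerate(x.children):
        out += inorder(child)
        if i < len(x.keys):
            out.append(x.keys[i])
    return out
```

The tree loses a level only in `btree_delete`, when a merge has taken the last key out of the root; this mirrors insertion, where the tree gains a level only when the root is split.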


Problems Ex Ex Ex Ex Ex Ex Ex Prob. 18-2