B-Trees.


1 B-Trees

2 But first, a little note about data structures
Not all data structures work well as file structures
Example: Binary Search Tree
[Figure: binary search tree with the keys Knight, Gibson, Sanders, Coleman, Hudson, Monroe]

3 Motivation for B-Trees
Index too large for memory
Search time better than binary search
Not just fast search, but also fast delete and fast insert
What's the "B" stand for?
  Bayer and McCreight
  Boeing
  balanced, bushy, broad

4 Indexed Files
Operations to trace: searching for Dannelly, deleting Duncan, adding Walters and Aardvark
Level One index (Key, IRRN):
  Adams, Davis 10, Foster 20, Ingram 30, Lambert 40, Norris 50, Randall 60, Tyler 70, West 80, Young 90
Level Two index (Key, DRRN):
  Adams 6, Barnes 2, Bell 18, Bishop 8, Camp 80, Carey, Conner 19, Critter, Crook 99, Dannelly 21, Davis 20, Dinkins, Duncan, ..., Faulk, Farrow, Foster, Fuller 98, ..., West, Wilks, Zane, Zinn
Data File (RRN, Name, Yadda yadda):
  Carey, 1 Foster, 2 Barnes, 3 Zinn, 4 Critter, 5 Faulk, 6 Adams, 7 Wilks, 8 Bishop, 9 Farrow, 10 Duncan, 11 Dinkins, 12 West, ..., 18 Bell, 19 Conner, 20 Davis, 21 Dannelly, ..., 80 Camp, 81 Zane, 98 Fuller, 99 Crook

5 B-Tree Informal Definition
Multi-level indexes
Nodes are indexes, indexes are nodes
"Order": maximum references in a node; minimum references is ½ the order
When a node fills, split it and move up the largest key
When a node is too empty, combine it with parent

6 Example Insertion
Insert these letters into an order-4 B-Tree: C S D T A M P I B W N G U R
After C, S, D, and T:
  [C D S T]
After A, the node splits and the largest keys move up:
  root:   [D T]
  leaves: [A C D] [S T]
M and P are added to the right node, but so is I:
  root:   [D P T]
  leaves: [A C D] [I M P] [S T]

7 C S D T A M P I B W N G U R
B, W, and N are no problem:
  root:   [D P W]
  leaves: [A B C D] [I M N P] [S T W]
Insertion of G causes another split, then U is no problem:
  root:   [D M P W]
  leaves: [A B C D] [G I M] [N P] [S T U W]
Inserting R causes the right node to split, then the root to split:
  root:    [P W]
  level 2: [D M P] [T W]
  leaves:  [A B C D] [G I M] [N P] [R S T] [U W]
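
The insertion walk-through on slides 6 and 7 can be replayed with a short Python sketch. It assumes the variant the slides appear to use: a node holds at most 4 keys, an index node stores the largest key of each child, and an overflowing node of 5 keys splits 3/2. The names Node, insert and levels are illustrative only, and duplicate keys and deletion are not handled.

```python
import bisect

ORDER = 4  # assumed: maximum number of keys (references) per node


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys          # sorted keys
        self.children = children  # None for a leaf; else children[i] holds keys <= keys[i]


def _insert(node, key):
    """Insert key below node; return a new right sibling if node split, else None."""
    if node.children is None:
        bisect.insort(node.keys, key)
    else:
        # Pick the first child whose largest key is >= key, or the last child.
        i = min(bisect.bisect_left(node.keys, key), len(node.keys) - 1)
        child = node.children[i]
        sibling = _insert(child, key)
        node.keys[i] = child.keys[-1]  # the child's largest key may have changed
        if sibling is not None:
            node.keys.insert(i + 1, sibling.keys[-1])
            node.children.insert(i + 1, sibling)
    if len(node.keys) <= ORDER:
        return None
    mid = (len(node.keys) + 1) // 2  # 5 keys split into 3 and 2
    right_children = None if node.children is None else node.children[mid:]
    right = Node(node.keys[mid:], right_children)
    node.keys = node.keys[:mid]
    if node.children is not None:
        node.children = node.children[:mid]
    return right


def insert(root, key):
    """Insert key and return the (possibly new) root."""
    sibling = _insert(root, key)
    if sibling is None:
        return root
    # The root split: create a new root holding the largest key of each half.
    return Node([root.keys[-1], sibling.keys[-1]], [root, sibling])


def levels(root):
    """Return the keys of every node, level by level, as joined strings."""
    out, level = [], [root]
    while level:
        out.append(["".join(n.keys) for n in level])
        level = [c for n in level if n.children for c in n.children]
    return out


root = Node([])
for letter in "CSDTAMPIBWNGUR":
    root = insert(root, letter)
print(levels(root))
```

Under these assumptions the printed levels are PW at the root, then DMP and TW, then the leaves ABCD, GIM, NP, RST and UW, which is the final tree of the example.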

8 Analysis
Order Size?
  match to disk cluster size and memory
Number of file accesses to Search
  depth of tree
  so the bigger the Order, the better
  Order = 100 and Levels = 4 gives 100 million records
Number of file accesses to Delete
  search downward to the leaf
  modify node
  if it was largest, adjust parent node
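
The "Order = 100 and Levels = 4" figure follows from each level multiplying the reach of the one above it by the order. A small Python sketch of that arithmetic, assuming every node is completely full (an upper bound, not a typical fill), with max_records and levels_needed as illustrative names:

```python
def max_records(order, levels):
    """Upper bound on records reachable when every node holds `order` references."""
    return order ** levels


def levels_needed(records, order):
    """Smallest number of levels whose upper bound covers `records`."""
    levels = 1
    while max_records(order, levels) < records:
        levels += 1
    return levels


print(max_records(100, 4))            # 100000000
print(levels_needed(1_000_000, 100))  # 3
```

Since a search reads one node per level, the second function is also a rough count of file accesses per search for a given order.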

9 Analysis
Number of file accesses to Add
  best case
    search downward
    adjust the node
  worst case (split)
    search downward to the leaf
    insert, overflow detect, split upward
    create new root node

10 Definition of B-Tree
In general, a B-Tree of Order N has the following properties:
  the root has at least two descendants, or is a leaf
  each node has no more than N descendants
  each node that is not the root or a leaf has at least N/2 descendants
  all leaf nodes are at the same level
  a nonleaf node with k descendants contains k-1 key values
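
The five properties can be read as a checklist. The Python sketch below tests them on an in-memory tree; the Node class and is_btree function are hypothetical names, and N/2 is rounded up for odd N, which is an assumption the slide does not state. Key ordering is not checked, only the structural properties listed.

```python
class Node:
    def __init__(self, keys, children=()):
        self.keys = list(keys)
        self.children = list(children)  # empty for a leaf


def is_btree(root, n):
    """Return True if the tree satisfies the listed properties for order n."""
    leaf_depths = set()
    min_children = -(-n // 2)  # N/2 rounded up (assumption for odd N)

    def visit(node, depth, is_root):
        k = len(node.children)
        if k == 0:  # leaf: only its depth matters here
            leaf_depths.add(depth)
            return True
        if k > n:  # no more than N descendants
            return False
        if is_root and k < 2:  # a non-leaf root has at least two
            return False
        if not is_root and k < min_children:  # interior nodes: at least N/2
            return False
        if len(node.keys) != k - 1:  # k descendants, k-1 key values
            return False
        return all(visit(c, depth + 1, False) for c in node.children)

    # All leaves must sit at one common depth.
    return visit(root, 0, True) and len(leaf_depths) == 1


good = Node(["M"], [Node(["C", "G"]), Node(["R", "T"])])
lopsided = Node(["M"], [Node(["C"]), Node(["T"], [Node(["P"]), Node(["X"])])])
print(is_btree(good, 4), is_btree(lopsided, 4))
```

Note that this formal definition keeps k-1 keys for k descendants, whereas the insertion example earlier keeps one key (the largest) per child, so the two are different conventions.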

11 B+ Trees
Since search time = depth of tree, we need to keep the tree short and wide
Uneven tree (some full nodes and some near empty nodes, or leans to one side) creates poor performance
Using a slightly smarter split method during add keeps the tree short and balanced
B+ Tree is the de facto standard for databases

12 Storing a B-Tree in Files
Data File
  order does not matter
Index File
  lists of indexes
  each index: key, RRN
When inserting a new node, does its placement in the index file matter?
[Table: index file with columns Key, IRN, RRN, holding the nodes of the tree below]
Tree stored in the index file:
  root:    [P W]
  level 2: [D M P] [T W]
  leaves:  [A B C D] [G I M] [N P] [R S T] [U W]
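
One way to picture an index file made of (key, RRN) entries is as fixed-length node records addressed by relative record number. The Python sketch below uses a hypothetical layout (1-byte key, 4-byte RRN, four entries per node, unused slots marked with RRN -1) and an in-memory io.BytesIO in place of a real file; none of these details come from the slides. New nodes are simply appended, which suggests that a node's position need not follow key order as long as its parent stores the right RRN.

```python
import io
import struct

ORDER = 4
ENTRY = struct.Struct("<1si")   # hypothetical layout: 1-byte key, 4-byte RRN
NODE_SIZE = ENTRY.size * ORDER  # every node occupies one fixed-length record
EMPTY = (b"-", -1)              # marker for an unused slot (assumption)


def write_node(f, rrn, entries):
    """Write up to ORDER (key, rrn) pairs as the fixed-length record at slot rrn."""
    padded = list(entries) + [EMPTY] * (ORDER - len(entries))
    f.seek(rrn * NODE_SIZE)
    f.write(b"".join(ENTRY.pack(key, ref) for key, ref in padded))


def read_node(f, rrn):
    """Read the record at slot rrn and return its used (key, rrn) pairs."""
    f.seek(rrn * NODE_SIZE)
    raw = f.read(NODE_SIZE)
    pairs = [ENTRY.unpack_from(raw, i * ENTRY.size) for i in range(ORDER)]
    return [(key, ref) for key, ref in pairs if ref != -1]


def append_node(f, entries):
    """Store a new node in the next free slot and return its RRN."""
    f.seek(0, io.SEEK_END)
    rrn = f.tell() // NODE_SIZE
    write_node(f, rrn, entries)
    return rrn


index_file = io.BytesIO()
# The child RRNs 10..14 are placeholders for nodes that are not written here.
left = append_node(index_file, [(b"D", 10), (b"M", 11), (b"P", 12)])
right = append_node(index_file, [(b"T", 13), (b"W", 14)])
top = append_node(index_file, [(b"P", left), (b"W", right)])
print(top, read_node(index_file, top))
```

Here the root ends up in the last slot written, and a search would start from its RRN and follow the stored record numbers downward.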

13 Next Class
Review of the entire semester
  Sorting and Binary Searching
  FAT v. NTFS v. Linux
  File Access Times
  Fragmented Files
  Which is the best storage method?
    Indexed
    B-Tree
    Hashed

