CPSC-608 Database Systems Fall 2010 Instructor: Jianer Chen Office: HRBB 315C Phone: Notes #7
B+Trees Support fast search Support range search Support dynamic changes Could be either dense or sparse A B+tree node: 2 pnpn knkn k2k2 p1p1 k1k1 p0p0 p2p2 ……
Root B+Tree Examplen=
Sample non-leaf to keysto keysto keys to keys < 5757 k<8181 k<95
Sample leaf node: From non-leaf node to next leaf in sequence To record with key 57 To record with key 81 To record with key 85 5
Size of nodes:n+1 pointers n keys FIXED 6
Don’t want nodes to be too empty Use at least Non-leaf: (n+1)/2 pointers Leaf: (n+1)/2 pointers to data 7
Full nodeMin. node Non-leaf Leaf n=
B+tree rulestree of order n (1) All leaves at same lowest level (balanced tree) (2) Pointers in leaves point to records except for “sequence pointer” 9
(3) Number of pointers/keys for B+tree Non-leaf (non-root) n+1n (n+1)/ 2 (n+1)/ 2 – 1 Leaf (non-root) n+1n Rootn+1n21 Max Max Min Min ptrs keys ptrs data keys (n+ 1) / 2 10 could be 1
Search in a B+tree Start from the root Search in a leaf block May not have to go to the data file 11
Pseudo Code for Search in a B+tree Search(ptr, k); \\ search a record of key value k in the subtree rooted at *ptr Case 1. *ptr is a leaf IF (k = k i ) for a key k i in *ptr THEN return(p i ); ELSE return(Null); Case 2. *ptr is not a leaf find a key k i in *ptr such that k i ≤ k < k i+1 ; return(Search(p i, k)); 12
Insert into B+tree (a) simple case –space available in leaf (b) leaf overflow (c) non-leaf overflow (d) new root 13
Pseudo Code for Insertion in a B+tree Insert(ptr, (k,p), (k',p')); \\ p is a pointer to a data record with key value k, which are inserted into the subtree rooted at \\ *ptr; p' is a pointer to a new brother of *ptr, if created, in which k' is the smallest key value; Case 1. *ptr is a leaf IF there is room in *ptr, THEN insert (k,p) into *ptr; return(k'=0, p'=Null); ELSE re-arrange the content in *ptr and (k,p) into (p 0, k 1, p 1,..., k n+1, p n+1 ); leave (p 0, k 1,..., p r-2, k r-1, p r-1 ) in *ptr; \\ r = (n+1)/2 create a new leaf q; put (p r, k r+1,..., p n, k n+1, p n+1 ) in *q; return( k'= k r, p' = q ); Case 2. *ptr is not a leaf find a key k i in *ptr such that k i ≤ k < k i+1 ; Insert(p i, (k,p), (k",p")); IF (p" = Null) THEN return(k'=0, p'=Null); ELSE IF there is room in *ptr, THEN insert (k",p") into *ptr; return(k'=0, p'=Null); ELSE re-arrange the content in *ptr and (k",p") into (p 0, k 1, p 1,..., k n+1, p n+1 ); leave (p 0, k 1,..., p r-2, k r-1, p r-1 ) in *ptr; \\ r = (n+1)/2 create a new leaf q; put (p r, k r+1,..., p n, k n+1, p n+1 ) in *q; return( k'= k r, p' = q ). 14
(a) Insert key = 32 n=
(a) Insert key = 32 n=
(b) Insert key = 7 n=
(b) Insert key = 7 n=
(c) Insert key = 160 n=
(c) Insert key = 160 n=
(d) New root, insert 45 n=
(d) New root, insert 45 n= new root
(a) Simple case (no example) (b) Coalesce with neighbor (sibling) (c) Re-distribute keys (d) Cases (b) or (c) at non-leaf Deletion from B+tree 23
(b) Coalesce with sibling –Delete n=4 24
(b) Coalesce with sibling –Delete n=
(c) Redistribute keys –Delete n=4 26
(c) Redistribute keys –Delete n=
(d) Non-leaf coalesce –Delete 37 n= new root 28
B+tree deletions in practice –Often, coalescing is not implemented –Too hard and not worth it! 29
Variation on B+tree: B-tree (no +) Idea: –Avoid duplicate keys –Have record pointers in non-leaf nodes 30
to record to record to record with K1 with K2 with K3 to keys to keys to keys to keys k3 K1 P1K2 P2K3 P3 31
B-tree examplen=
B-tree examplen= sequence pointers not useful now! 33
Tradeoffs: B-trees have faster lookup than B+trees in B-tree, non-leaf & leaf different sizes in B-tree, insertion and deletion more complicated B+trees preferred! 34