1 + Data Structures B-tree Jibrael Jos : Sep 2009

2 + Agenda: Introduction; Multiway Trees; B Tree; Application; Structure; Algo: Insert / Delete. Avoid Taking Printout : Use RTF Outline in case needed

3 + B Tree: Critic; Maths; Summation Series; Variations (B*, B+); Application; Industry

4 + Binary Search Tree: What happens if data is loaded into a binary search tree in this order: 23, 32, 45, 11, 43, 41? And in this order: 1, 2, 3, 4, 5, 6, 7, 8? What is an AVL tree?
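One way to explore the question on this slide is to load both key sequences into a plain, unbalanced binary search tree and compare the resulting heights. The sketch below is only an illustration; the names `Node`, `insert`, `height` and `build` are assumptions, not part of the presentation.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into a plain (unbalanced) BST and return the root."""
    if root is None:
        return Node(key)
    node = root
    while True:
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(key))
            return root
        node = child


def height(root):
    """Number of levels in the tree (0 for an empty tree)."""
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


mixed = build([23, 32, 45, 11, 43, 41])
sorted_input = build([1, 2, 3, 4, 5, 6, 7, 8])
print(height(mixed), height(sorted_input))  # prints: 5 8
```

The sorted sequence degenerates into a chain with one key per level. A self-balancing tree such as an AVL tree would keep the same eight keys within four levels.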

5 + Multiway Trees [Figure: a node with keys K1 and K2 and three subtrees: keys < K1; keys >= K1 and < K2; keys >= K2]

6 + m-way trees: Reduce the depth of the tree to O(log_m n) with m-way trees: m children, m-1 keys per node. m = 10: 10^6 keys in 6 levels vs 20 for a binary tree, but... [Figure: a root node with keys K1, K2, K3 above four child nodes, each with keys K1, K2, K3]
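The level counts on this slide can be checked with a few lines of arithmetic. The sketch below assumes completely full trees, where a tree with h levels holds m^h - 1 keys, so six levels at m = 10 hold 10^6 - 1 keys. The function name `levels_for` is a hypothetical helper.

```python
def levels_for(n_keys, m):
    """Fewest levels of a full m-way tree (m - 1 keys per node) that hold n_keys."""
    levels, capacity = 0, 0
    while capacity < n_keys:
        levels += 1
        capacity = m ** levels - 1  # keys in a full tree with `levels` levels
    return levels


n = 10 ** 6 - 1
print(levels_for(n, 10))  # prints: 6
print(levels_for(n, 2))   # prints: 20
```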

7 + m-way trees But you have to search through the m keys in each node! Reduces your gain from having fewer levels!
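The per-node search cost mentioned here can be kept small because the keys inside a node are sorted, so a binary search within the node is possible. The sketch below is one hedged way to do that with Python's standard `bisect` module; `search_node` is an assumed name, and the returned index follows the convention that subtree i holds the keys between key i-1 and key i.

```python
from bisect import bisect_left


def search_node(keys, target):
    """Search one node's sorted keys.

    Returns (found, index). If found, index is the key's position in the node;
    otherwise it is the index of the child subtree to descend into.
    """
    i = bisect_left(keys, target)
    found = i < len(keys) and keys[i] == target
    return found, i


print(search_node([10, 20, 30], 20))  # prints: (True, 1)
print(search_node([10, 20, 30], 25))  # prints: (False, 2)
```

With binary search, each node costs about log2(m) comparisons instead of up to m - 1, so the total work stays close to that of a balanced binary tree while the number of nodes visited drops.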

8 + m-way trees [Figure: example m-way tree; key values not reliably recoverable from the transcript]

9 + B-trees
    All leaves are on the same level
    All nodes except for the root and the leaves have at least m/2 (rounded up) children and at most m children
    Anand B
    Each node is at least half full of keys

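10 + BTREE [Figure: example B-tree; one leaf holds 74, 78, 85, 97 and another holds 11, 14; remaining node contents not reliably recoverable from the transcript]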

11 + Multiway Tree: M-ary tree. 3 levels: Cylinder, Track, Record. Index Seq (RDBMS). Tables with less change.

12 + BTree: If level is 3 and m = 199, then what is N? How many splits per insertion?
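A rough way to approach the first question is to bound N from both sides. The sketch below assumes that the root counts as level 1, that a full node has m children and m - 1 keys, that the root of the sparsest tree has two children, and that every other node has at least m/2 (rounded up) children. The function name `key_bounds` is hypothetical.

```python
def key_bounds(m, levels):
    """Return (min_keys, max_keys) for a B-tree of order m with `levels` levels."""
    max_keys = m ** levels - 1   # every node full: m children, m - 1 keys
    min_children = (m + 1) // 2  # m/2 rounded up
    # Sparsest tree: the root has 2 children, every other internal node has min_children.
    nodes_below_root = sum(2 * min_children ** i for i in range(levels - 1))
    min_keys = 1 + nodes_below_root * (min_children - 1)
    return min_keys, max_keys


print(key_bounds(199, 3))  # prints: (19999, 7880598)
```

On the second question, an insertion can split at most one node on each level of the path from the leaf to the root, so a three-level tree sees at most three splits for a single insert.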

13 + Multiway Trees : Application. NDPL, Delhi: Electricity Billing, 3 lakh consumers, table indexed as BTREE. UCO Bank, Jaipur: one DD takes 10 minutes to print; saviour: BTREE.

14 + B-trees - Insertion
    B-tree property: block is at least half-full of keys
    Insertion into block with m keys:
        block overflows
        split block
        promote one key
        split parent if necessary
        if root is split, tree becomes one level deeper

15 + Insert Node [Figure: the B-tree from slide 10 with the full leaf 74, 78, 85, 97; key to insert: 63]

16 + After Insert 63 [Figure: the full leaf has split into 63, 74 and 85, 97, and 78 has moved up into the parent]

17 + Insert Node [Figure: the same starting B-tree with the full leaf 74, 78, 85, 97; key to insert: 99]

18 + After Insert 99 [Figure: the full leaf has split into 74, 78 and 97, 99, and 85 has moved up into the parent]

19 + Split Node [Figure: full node 74, 78, 85, 97 (numEntries = 4) before the split; key being inserted: 63]

20 + Structure of Btree
    node
        firstPtr
        numEntries
        Entries[1 .. M-1]
    End
    Entry
        key
        rightPtr
    End Entry

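21 + Split Node : Final [Figure: inserting 63. The median entry 78 is promoted; the original node keeps 63, 74; the new node reached through rightPtr holds 85, 97 (numEntries = 2). Labels: median entry, toNdx, fromNdx]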

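22 + Split Node : Final [Figure: inserting 99. The median entry 85 is promoted; the original node keeps 74, 78; the new node reached through rightPtr holds 97, 99 (numEntries = 2). Labels: median entry, toNdx, fromNdx]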
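The two splits walked through on the Insert Node and Split Node slides can be reproduced with a small sketch. It handles only the leaf step: insert the key in sorted position and, if the leaf overflows, split around the median and hand the median to the caller for promotion. Propagating the split into the parent is left out, and the function name `insert_into_leaf` and the plain-list representation of a leaf are assumptions.

```python
def insert_into_leaf(keys, key, max_keys):
    """Insert key into a sorted leaf.

    Returns (left, median, right). If the leaf did not overflow,
    median and right are None and left holds all keys.
    """
    keys = sorted(keys + [key])
    if len(keys) <= max_keys:
        return keys, None, None
    mid = len(keys) // 2
    # The median moves up to the parent; the two halves become separate leaves.
    return keys[:mid], keys[mid], keys[mid + 1:]


full_leaf = [74, 78, 85, 97]
print(insert_into_leaf(full_leaf, 63, max_keys=4))  # prints: ([63, 74], 78, [85, 97])
print(insert_into_leaf(full_leaf, 99, max_keys=4))  # prints: ([74, 78], 85, [97, 99])
```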

23 + Traversal [Figure: B-tree with root 21, 78 and leaves 11, 14; 42, 45, 63, 74; 85, 95]
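The node layout from the Structure of Btree slide and an in-order traversal over it could be sketched as follows. The field names mirror the slide (firstPtr, numEntries, entries, key, rightPtr) in Python naming; the helper `leaf` and the function `traverse` are assumptions, and the sample tree is read from the Traversal figure.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Entry:
    key: int
    right_ptr: Optional["Node"] = None  # subtree with keys >= key


@dataclass
class Node:
    first_ptr: Optional["Node"] = None  # subtree with keys < entries[0].key
    entries: List[Entry] = field(default_factory=list)

    @property
    def num_entries(self) -> int:
        return len(self.entries)


def leaf(*keys: int) -> Node:
    """Hypothetical helper: build a leaf node from keys."""
    return Node(entries=[Entry(k) for k in keys])


def traverse(node: Optional[Node]) -> Iterator[int]:
    """Yield all keys in ascending order."""
    if node is None:
        return
    yield from traverse(node.first_ptr)
    for entry in node.entries:
        yield entry.key
        yield from traverse(entry.right_ptr)


root = Node(
    first_ptr=leaf(11, 14),
    entries=[Entry(21, leaf(42, 45, 63, 74)), Entry(78, leaf(85, 95))],
)
print(list(traverse(root)))
```

Visiting firstPtr first, then each key followed by its rightPtr subtree, yields the keys in sorted order.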

24 + Agenda Delete: Delete Walk Through; Reflow; Borrow Left; Borrow Right; Combine; Delete Mid

25 + Delete : For 78. Btree Delete; Delete(); Delete Mid(); Reflow(); if shorter, delete root. [Figure: nodes with key counts: 42 (1); 16, 21 (2); 57, 78 (2); 45, 52 (2); 63, 74 (2); 85, 97 (2)]

26 + Btree Delete
    If (root null)
        print (“Attempt to delete from null tree”)
    Else
        shorter = delete (root, target)
        if shorter
            delete root
    Return root
    [Figure: nodes with key counts: 42 (1); 16, 21 (2); 57, 78 (2); 45, 52 (2); 63, 74 (2); 85, 97 (2). Target = 78. Call marker: B]

27 + Delete(root, deleteKey)
    If (root null)
        data does not exist
    Else
        entryNdx = searchNode (root, deleteKey)
        if found entry to be deleted
            if leaf node
                underflow = deleteEntry()
            else
                underflow = deleteMid (left)
                if underflow
                    underflow = reflow()
    [Figure: nodes with key counts: 42 (1); 16, 21 (2); 57, 78 (2); 45, 52 (2); 63, 74 (2); 85, 97 (2). Target = 78. Call markers: B, D]

28 + Delete Else Part
    Else
        if deleteKey less than first entry
            subtree = firstPtr
        else
            subtree = rightPtr
        underflow = delete (subtree, deleteKey)
        if underflow
            underflow = reflow()
    Return underflow
    [Figure: nodes with key counts: 42 (1); 16, 21 (2); 57, 78 (2); 45, 52 (2); 63, 74 (2); 85, 97 (2). Target = 78. Call markers: B, D]

29 + Delete(root, deleteKey)
    If (root null)
        data does not exist
    Else
        entryNdx = searchNode (root, deleteKey)
        if found entry to be deleted
            if leaf node
                underflow = deleteEntry()
            else
                underflow = deleteMid (root, entryNdx, left)
                if underflow
                    underflow = reflow (root, entryNdx)
    [Figure: nodes with key counts: 42 (1); 16, 21 (2); 57, 78 (2); 45, 52 (2); 63, 74 (2); 85, 97 (2). Target = 78. Call markers: B, D, D, DM]

30 + Delete(root, deleteKey)
    If (root null)
        data does not exist
    Else
        entryNdx = searchNode (root, deleteKey)
        if found entry to be deleted
            if leaf node
                underflow = deleteEntry()
            else
                underflow = deleteMid (root, entryNdx, left)
                if underflow
                    underflow = reflow (root, entryNdx)
    [Figure: nodes with key counts: 42 (1); 16, 21 (2); 57, 74 (2); 45, 52 (2); 63 (1); 85, 97 (2). 74 replaces 78. Call markers: B, D, D]

31 + Delete(root, deleteKey)
    If (root null)
        data does not exist
    Else
        entryNdx = searchNode (root, deleteKey)
        if found entry to be deleted
            if leaf node
                underflow = deleteEntry()
            else
                underflow = deleteMid (root, entryNdx, left)
                if underflow
                    underflow = reflow (root, entryNdx)
    [Figure, after reflow: nodes with key counts: 42 (1); 16, 21 (2); 45, 52 (2); 57 (1); 63, 74, 85, 97 (4). Call markers: B, D, D]

32 + Delete Else Part
    Else
        if deleteKey less than first entry
            subtree = firstPtr
        else
            subtree = rightPtr
        underflow = delete (subtree, deleteKey)
        if underflow
            underflow = reflow (root, entryNdx)
    Return underflow
    [Figure, before reflow: nodes with key counts: 42 (1); 16, 21 (2); 45, 52 (2); 57 (1); 63, 74, 85, 97 (4). Call markers: B, D]

33 + Delete Else Part
    Else
        if deleteKey less than first entry
            subtree = firstPtr
        else
            subtree = rightPtr
        underflow = delete (subtree, deleteKey)
        if underflow
            underflow = reflow (root, entryNdx)
    Return underflow
    [Figure, after reflow: root with 0 entries; nodes with key counts: 45, 52 (2); 63, 74, 85, 97 (4); 16, 21, 42, 57 (4). Call markers: B, D]

34 + BTREE Delete
    If (root null)
        print (“Attempt to delete from null tree”)
    Else
        shorter = delete (root, target)
        if shorter
            delete root
    Return root
    [Figure: root with 0 entries; nodes with key counts: 45, 52 (2); 63, 74, 85, 97 (4); 16, 21, 42, 57 (4). Call marker: B]

35 + BTREE Delete
    If (root null)
        print (“Attempt to delete from null tree”)
    Else
        shorter = delete (root, target)
        if shorter
            delete root
    Return root
    [Figure, empty root removed: nodes with key counts: 45, 52 (2); 63, 74, 85, 97 (4); 16, 21, 42, 57 (4). Call marker: B]

36 + Templates [Figure: node drawing templates with key counts; contents not reliably recoverable from the transcript]

37 + Delete [Figure: nodes with key counts: 42 (1); 16, 21 (2); 57, 78 (2); 45, 52 (2); 63, 74 (2); 85, 97 (2)]

38 + Delete : For 78. Btree Delete; Delete(); Delete Mid(); Reflow(); if shorter, delete root. [Figure: nodes with key counts: 42 (1); 16, 21 (2); 57, 78 (2); 45, 52 (2); 63, 74 (2); 85, 97 (2)]

39 + Delete : Reflow. 1: Try to borrow from the right. 2: If 1 failed, try to borrow from the left. 3: Cannot borrow (1 and 2 failed): combine.

40 + Delete Reflow
    underflow = false
    If RT->no > min entries
        BorrowRight (root, entryNdx, LT, RT)
    Else If LT->no > min entries
        BorrowLeft (root, entryNdx, LT, RT)
    Else
        combine (root, entryNdx, LT, RT)
        if root->no < min entries
            underflow = true
    Return underflow
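The three reflow cases can be sketched in a simplified form. This is not the slide's exact routine: instead of firstPtr and rightPtr entries it assumes each node keeps a `keys` list and a `children` list, and it repairs the child at index `i` of `parent`. The names `Node` and `reflow` and the parameter `min_keys` are illustrative.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])  # empty for a leaf


def reflow(parent, i, min_keys):
    """Repair an underflow in parent.children[i]; return True if parent now underflows."""
    child = parent.children[i]
    right = parent.children[i + 1] if i + 1 < len(parent.children) else None
    left = parent.children[i - 1] if i > 0 else None

    if right is not None and len(right.keys) > min_keys:
        # Borrow right: the separator moves down, right's smallest key moves up.
        child.keys.append(parent.keys[i])
        parent.keys[i] = right.keys.pop(0)
        if right.children:
            child.children.append(right.children.pop(0))
    elif left is not None and len(left.keys) > min_keys:
        # Borrow left: the separator moves down, left's largest key moves up.
        child.keys.insert(0, parent.keys[i - 1])
        parent.keys[i - 1] = left.keys.pop()
        if left.children:
            child.children.insert(0, left.children.pop())
    else:
        # Combine: merge two siblings and the separator between them into one node.
        j = i if right is not None else i - 1
        a, b = parent.children[j], parent.children[j + 1]
        a.keys += [parent.keys.pop(j)] + b.keys
        a.children += b.children
        parent.children.pop(j + 1)
    return len(parent.keys) < min_keys


# State from the walk-through after 74 has replaced 78: the leaf [63] is underfull.
parent = Node([57, 74], [Node([45, 52]), Node([63]), Node([85, 97])])
print(reflow(parent, 1, min_keys=2))               # prints: True
print(parent.keys, parent.children[1].keys)        # prints: [57] [63, 74, 85, 97]
```

Neither sibling can spare a key in this example, so the nodes are combined and the parent itself drops below the minimum, which is why the underflow flag travels further up the tree.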

41 + Borrow Left [Figure: borrow-left example; subtree labels: Node >= 74 < 78 and Node >= 78 < 85; node contents not reliably recoverable from the transcript]

42 + Combine [Figure: nodes with key counts: 65, 71 (2); 63 (1); 21, 57, 78 (3); 42, 45 (2); 59, 61 (2)]

43 + Combine [Figure: nodes with key counts: 65, 71 (2); 63 (1); 21, 57, 78 (3); 59, 61 (2); 42, 45, 57 (3)]

44 + Combine [Figure: nodes with key counts: 65, 71 (2); 21, 57, 78 (3); 59, 61 (2); 42, 45, 57, 63 (4)]

45 + Combine [Figure: nodes with key counts: 65, 71 (2); 21, 78 (2); 59, 61 (2); 42, 45, 57, 63 (4)]

46 + Delete Mid
    If leaf
        exchange data and delete leaf entry
    Else
        traverse right to locate predecessor
        deleteMid (right)
        if underflow
            reflow
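The predecessor exchange described here can be sketched as a short recursive function. The sketch assumes nodes with plain `keys` and `children` lists rather than the slide's entry structure, and it takes the repair step as an optional `reflow` callable, so only the exchange itself is shown. All names are illustrative.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])  # empty for a leaf


def delete_mid(owner, k, subtree, min_keys, reflow=None):
    """Overwrite owner.keys[k] with its in-order predecessor found under subtree.

    subtree is the child immediately to the left of owner.keys[k].
    Returns True if the node at this level is left underfull.
    """
    if not subtree.children:
        # Leaf: its largest key is the predecessor; move it up and drop it here.
        owner.keys[k] = subtree.keys.pop()
        return len(subtree.keys) < min_keys
    last = len(subtree.children) - 1
    # Keep following the rightmost pointer until a leaf is reached.
    underflow = delete_mid(owner, k, subtree.children[last], min_keys, reflow)
    if underflow and reflow is not None:
        underflow = reflow(subtree, last, min_keys)
    return underflow


# Case 1 from the slides: deleting 78, whose left subtree is the leaf [63, 74].
owner = Node([57, 78], [Node([45, 52]), Node([63, 74]), Node([85, 97])])
print(delete_mid(owner, 1, owner.children[1], min_keys=2))  # prints: True
print(owner.keys, owner.children[1].keys)                   # prints: [57, 74] [63]
```

The returned flag signals that the leaf now holds fewer than the minimum number of keys, which is the situation the reflow step is meant to repair.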

47 + Delete Mid [Figure: nodes with key counts: 42 (1); 16, 21 (2); 57, 78 (2); 45, 52 (2); 63, 74 (2); 85, 97 (2)] Case 1: To delete 78 we replace it with 74.

48 + Delete Mid [Figure: nodes with key counts: 42 (1); 16, 21 (2); 57, 78 (2); 45, 52 (2); 63, 74 (2); 85, 97 (2); 75, 76 (2)] Case 2: To delete 78 we replace it with 76, hence the recursive call of Delete Mid to locate the predecessor.

49 + order
    Order   Min   Max
    3       2     3
    4       2     4
    5       3     5
    6       3     6
    ...     ...   ...
    m       m/2 (rounded up)   m

50 + Get the Order Right: Keys are 4, so max subtrees is 5, so the order is 5. Minimum = 3 (which is subtrees). Min keys is 2. [Figure: nodes with key counts: 45, 52 (2); 63, 74, 85, 97 (4); 16, 21, 42, 57 (4)]
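The relationship between order, subtrees and keys can be written down as a small helper. This is a sketch under the convention used on these slides, where the order m is the maximum number of subtrees and the minimum is m/2 rounded up; `order_limits` is a hypothetical name.

```python
def order_limits(m):
    """Subtree and key limits for a non-root node of a B-tree of order m."""
    min_subtrees = (m + 1) // 2  # m/2 rounded up
    return {
        "min_subtrees": min_subtrees,
        "max_subtrees": m,
        "min_keys": min_subtrees - 1,  # always one key fewer than subtrees
        "max_keys": m - 1,
    }


for m in (3, 4, 5, 6):
    limits = order_limits(m)
    print(m, limits["min_subtrees"], limits["max_subtrees"])
print(order_limits(5))
```

For order 5 this gives a minimum of 3 subtrees and 2 keys, and a maximum of 5 subtrees and 4 keys.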

51 + 2-3 Tree: Order 3... So how many keys in a node? This rule is valid for non root leaf. Root can have 0, 2, 3 subtrees.

52 + 2-3 Tree [Figure: example 2-3 tree; node contents not reliably recoverable from the transcript]

53 + 2-3-4 Tree: Order 4... So how many keys in a node? This rule is valid for non root leaf. Root can have 0, 2, 3, 4 subtrees.

54 + Structure of B + tree
    Non leaf node
        firstPtr
        numEntries
        Entries[1 .. M-1]
    End
    Entry
        key
        rightPtr
    End Entry
    Leaf node
        firstPtr
        numEntries
        Entries[1 .. M-1]
        Next Leaf Node
    End

55 + B + Tree [Figure: nodes with key counts: 42 (1); 57, 78 (2); 45, 52 (2); 63, 74 (2); 85, 97 (2); annotation: implies there are more nodes]

56 + B * Tree: Space usage. BTREE nodes can be 50% empty (1/2), so the rule is modified to two-thirds (2/3). Also, when a node overflows, instead of being split immediately its entries are redistributed with siblings. And even when a split happens, the entries are distributed equally among all siblings (pg 462).

57 + B+-trees: All the keys in the non-leaf nodes are dummies; only the keys in the leaves point to “real” data. Linking the leaves gives the ability to scan the collection in order without passing through the higher nodes.
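The benefit of linking the leaves can be illustrated with a minimal sketch. It models only the leaf level of a B+ tree; the class `Leaf`, its `next_leaf` field (standing in for the Next Leaf Node pointer) and the function `scan` are assumptions made for the example.

```python
class Leaf:
    def __init__(self, keys, next_leaf=None):
        self.keys = list(keys)
        self.next_leaf = next_leaf  # link to the leaf holding the next larger keys


def scan(first_leaf, low=None, high=None):
    """Yield keys in order by walking the leaf chain, optionally limited to [low, high]."""
    leaf = first_leaf
    while leaf is not None:
        for key in leaf.keys:
            if high is not None and key > high:
                return
            if low is None or key >= low:
                yield key
        leaf = leaf.next_leaf


# Build the chain right to left so each leaf can point at its successor.
third = Leaf([85, 97])
second = Leaf([63, 74], third)
first = Leaf([45, 52], second)

print(list(scan(first)))                    # prints: [45, 52, 63, 74, 85, 97]
print(list(scan(first, low=50, high=80)))   # prints: [52, 63, 74]
```

A full scan or a range query only follows the chain of leaves and never goes back up through the index nodes.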

