CS 255: Database System Principles slides: B-trees


1 CS 255: Database System Principles slides: B-trees
By:- Arunesh Joshi Id:

2 Agenda
Features and functionalities of B-trees as an index structure
The structure of B-trees
Applications of B-trees
Lookup in B-trees
Range queries
Insertion into B-trees
Deletion from a B-tree
Efficiency of B-trees

3 B-Trees A B-tree organizes its blocks into a tree. The tree is balanced, meaning that all paths from the root to a leaf have the same length. Typically, there are three layers in a B-tree: the root, an intermediate layer, and the leaves, but any number of layers is possible.

4 Functionalities of B-Trees
B-trees automatically maintain as many levels of index as is appropriate for the size of the file being indexed. B-trees manage the space on the blocks they use so that every block is between half used and completely full. No overflow blocks are needed.

5 Structure of B-Trees A B-tree typically has three layers: the root, an intermediate layer, and the leaves. In a B-tree, each block has space for n search-key values and n+1 pointers. [The next slide shows the structure of a B-tree.]

6 B-Tree Example n=3
[Figure: example B-tree with n=3.
Root: 100
Non-leaf nodes: 30 | 120 150 180
Leaves: 3 5 11 | 30 35 | 100 101 110 | 120 130 | 150 156 179 | 180 200]

7 Sample non-leaf
Keys: 57 81 95
Pointers, left to right: to keys k < 57; to keys 57 ≤ k < 81; to keys 81 ≤ k < 95; to keys k ≥ 95

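8 Sample leaf node
Keys: 57 81 95
The node is reached from a non-leaf node, and its last pointer leads to the next leaf in sequence.
Pointers, left to right: to record with key 57; to record with key 81; to record with key 95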

9 In textbook’s notation n=3
[Figure: a leaf (keys 30 35) and a non-leaf (key 30) drawn in the textbook’s notation.]

10 Size of nodes (fixed): n+1 pointers, n keys

11 Don’t want nodes to be too empty
Use at least:
Non-leaf: ⌈(n+1)/2⌉ pointers
Leaf: ⌊(n+1)/2⌋ pointers to data

12 Full node vs. minimum node, n=3
Non-leaf: full node 120 150 180; minimum node 30
Leaf: full node 3 5 11; minimum node 30 35
(A pointer counts even if null.)

13 B-tree rules (tree of order n)
(1) All leaves are at the same lowest level (balanced tree).
(2) Pointers in leaves point to records, except for the “sequence pointer”.

14 Number of pointers/keys for B+tree
                      Max ptrs   Max keys   Min ptrs→data   Min keys
Non-leaf (non-root)   n+1        n          ⌈(n+1)/2⌉       ⌈(n+1)/2⌉ - 1
Leaf (non-root)       n+1        n          ⌊(n+1)/2⌋       ⌊(n+1)/2⌋
Root                  n+1        n          1               1
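The occupancy limits can be checked with a few lines of Python. This is a minimal sketch, assuming the minimums are ⌈(n+1)/2⌉ pointers for a non-leaf and ⌊(n+1)/2⌋ pointers to data for a leaf; the function name `occupancy_bounds` is hypothetical and not part of the slides.

```python
def occupancy_bounds(n):
    """Return (max_ptrs, max_keys, min_ptrs, min_keys) per node kind for a tree of order n."""
    ceil_half = (n + 2) // 2   # integer form of ceil((n + 1) / 2)
    floor_half = (n + 1) // 2  # integer form of floor((n + 1) / 2)
    return {
        "non_leaf": (n + 1, n, ceil_half, ceil_half - 1),
        "leaf": (n + 1, n, floor_half, floor_half),  # min_ptrs counts pointers to data
        "root": (n + 1, n, 1, 1),
    }


if __name__ == "__main__":
    for kind, bounds in occupancy_bounds(3).items():
        print(kind, bounds)
```

For n=3 both minimums come out as 2 pointers, which matches the minimum nodes shown on the earlier slide; the two formulas only differ when n is even.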

15 Applications of B-trees
1. The search key of the B-tree is the primary key for the data file, and the index is dense. That is, there is one key-pointer pair in a leaf for every record of the data file. The data file may or may not be sorted by primary key.
2. The data file is sorted by its primary key, and the B-tree is a sparse index with one key-pointer pair at a leaf for each block of the data file.
3. The data file is sorted by an attribute that is not a key, and this attribute is the search key for the B-tree. For each key value K that appears in the data file there is one key-pointer pair at a leaf. That pointer goes to the first of the records that have K as their sort-key value.

16 Lookup in B-Trees Suppose we want to find a record with search key 40.
We start at the root. The root key is 13, and 40 is not less than 13, so we follow the pointer to the right. We repeat the same comparison at each non-leaf node until we reach a leaf, and then look for 40 among the keys of that leaf.

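17 Looking for key 40 (not present)
[Figure: B-tree with root key 13 over the keys 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47; the search for 40 ends at a leaf that does not contain it.]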
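The lookup procedure can be sketched in Python as follows. The `Node` class and the exact tree shape are assumptions made for illustration (only the root key 13 and the set of keys come from the slides), and the function returns a boolean rather than a record because records are not modelled here.

```python
from bisect import bisect_right


class Node:
    """Hypothetical node: a leaf has children=None, a non-leaf has len(keys)+1 children."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children


def lookup(node, key):
    """Descend from the root to a leaf, then report whether the key is stored there."""
    while node.children is not None:
        # Child i covers keys[i-1] <= k < keys[i]; bisect_right picks that index.
        node = node.children[bisect_right(node.keys, key)]
    return key in node.keys


# Assumed example tree with n=3 and root key 13.
leaves = [Node([2, 3, 5]), Node([7, 11]), Node([13, 17, 19]),
          Node([23, 29]), Node([31, 37, 41]), Node([43, 47])]
root = Node([13], [Node([7], leaves[0:2]), Node([23, 31, 43], leaves[2:6])])

if __name__ == "__main__":
    print(lookup(root, 40))
    print(lookup(root, 41))
```

In this assumed tree the search for 40 goes right at the root, then to the leaf holding 31 37 41, where the key is not found.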

18 Range Queries B-trees also support queries that ask for a range of values, for example: SELECT * FROM R WHERE R.k >= 10 AND R.k <= 25;
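One common way to answer such a query is to look up the lower bound and then walk the leaves through their sequence pointers until a key exceeds the upper bound. The sketch below assumes unique keys; the classes `Leaf` and `Inner`, the function `range_query`, and the example tree are hypothetical names and data chosen for illustration.

```python
from bisect import bisect_left, bisect_right


class Leaf:
    def __init__(self, keys):
        self.keys = keys
        self.next = None  # sequence pointer to the next leaf


class Inner:
    def __init__(self, keys, children):
        self.keys = keys
        self.children = children


def range_query(root, lo, hi):
    """Return all keys k with lo <= k <= hi, in sorted order."""
    node = root
    while isinstance(node, Inner):
        node = node.children[bisect_right(node.keys, lo)]
    result = []
    while node is not None:
        for k in node.keys[bisect_left(node.keys, lo):]:
            if k > hi:
                return result
            result.append(k)
        node = node.next  # follow the sequence pointer
    return result


# Assumed example tree; leaves are chained left to right.
leaves = [Leaf(ks) for ks in ([2, 3, 5], [7, 11], [13, 17, 19],
                              [23, 29], [31, 37, 41], [43, 47])]
for a, b in zip(leaves, leaves[1:]):
    a.next = b
root = Inner([13], [Inner([7], leaves[0:2]), Inner([23, 31, 43], leaves[2:6])])

if __name__ == "__main__":
    print(range_query(root, 10, 25))
```

Only one root-to-leaf descent is needed; the rest of the scan touches leaves only.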

19 Insert into B-tree
(a) simple case: space available in leaf
(b) leaf overflow
(c) non-leaf overflow
(d) new root

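20 (a) Insert key = 32 n=3
[Figure: the leaf 30 31 has free space, so 32 is simply added, giving 30 31 32. The non-leaf keys 100 and 30 and the leaf 3 5 11 are unchanged.]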

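21 (b) Insert key = 7 n=3
[Figure: the leaf 3 5 11 is full, so inserting 7 splits it, and the key 7 is added to the parent node next to 30. The leaf 30 31 and the key 100 are unchanged.]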
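The simple case and the leaf-overflow case can be sketched together. This is an illustration only: it assumes a leaf is represented as a sorted list of keys, that a split leaves the first ⌈(n+1)/2⌉ keys in the old leaf, and that the smallest key of the new leaf is passed up to the parent. The function name `insert_into_leaf` is hypothetical.

```python
from bisect import insort


def insert_into_leaf(keys, key, n):
    """Insert key into a sorted leaf of capacity n.

    Returns (left, right, separator). Without a split, right and separator are None.
    """
    keys = list(keys)  # do not modify the caller's list
    insort(keys, key)
    if len(keys) <= n:
        return keys, None, None  # simple case: space available in leaf
    mid = (len(keys) + 1) // 2   # left leaf keeps ceil((n + 1) / 2) keys
    left, right = keys[:mid], keys[mid:]
    return left, right, right[0]  # separator key goes to the parent


if __name__ == "__main__":
    print(insert_into_leaf([30, 31], 32, 3))
    print(insert_into_leaf([3, 5, 11], 7, 3))
```

If the parent is itself full when it receives the separator, the same idea is applied one level up, which is the non-leaf overflow case.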

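22 (c) Insert key = 160 n=3
[Figure: the leaf 150 156 179 is full and splits into 150 156 and 160 179. Its parent 120 150 180 is also full, so it splits as well and the key 160 moves up to the root, which becomes 100 160. The leaf 180 200 is unchanged.]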

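23 (d) New root, insert 45 n=3
[Figure: before the insert, the root is 10 20 30 with leaves 1 2 3 | 10 12 | 20 25 | 30 32 40. Inserting 45 splits the last leaf into 30 32 and 40 45. The old root then overflows and splits, and a new root with key 30 is created above it.]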

24 Deletion from B-tree
(a) Simple case (no example)
(b) Coalesce with neighbor (sibling)
(c) Re-distribute keys
(d) Cases (b) or (c) at non-leaf
(CS 245 Notes 4)

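25 (b) Coalesce with sibling
Delete 50 n=4
[Figure: the parent holds 10 40 100 and the leaves are 10 20 30 | 40 50. After 50 is deleted, the remaining key 40 is merged into the sibling, giving 10 20 30 40.]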

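26 (c) Redistribute keys Delete 50 n=4
[Figure: the parent holds 10 40 100 and the leaves are 10 20 30 35 | 40 50. After 50 is deleted, the key 35 moves from the full sibling into the underfull leaf, and the parent key 40 is replaced by 35.]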
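The choice between coalescing and redistributing at a leaf can be sketched as below. This follows the two examples above, where the leaves are merged when all remaining keys fit in one node and a key is borrowed otherwise; that policy, the list representation of leaves, and the name `delete_from_leaf` are assumptions for illustration. Other presentations prefer to redistribute first.

```python
def delete_from_leaf(left, leaf, key, n):
    """Delete key from `leaf`, whose left sibling is `left`, in a tree of order n.

    Returns (left, leaf, action); leaf is None when the two leaves were coalesced.
    """
    min_keys = (n + 1) // 2  # minimum number of keys in a non-root leaf
    leaf = [k for k in leaf if k != key]
    if len(leaf) >= min_keys:
        return left, leaf, "simple"
    if len(left) + len(leaf) <= n:
        # Everything fits in one node: merge into the left sibling.
        return left + leaf, None, "coalesce"
    # Otherwise borrow the largest key of the left sibling; the parent's
    # separator key must then be updated to the new first key of `leaf`.
    return left[:-1], [left[-1]] + leaf, "redistribute"


if __name__ == "__main__":
    print(delete_from_leaf([10, 20, 30], [40, 50], 50, 4))
    print(delete_from_leaf([10, 20, 30, 35], [40, 50], 50, 4))
```

After a coalesce the parent loses a key and a pointer, so the same check has to be repeated at the parent, which is case (d).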

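27 (d) Non-leaf coalesce Delete 37
[Figure: the root holds 25, the non-leaf nodes hold 10 20 and 30 40, and the leaves include 10 14 | 20 22 | 25 26 | 30 37 | 40 45. Deleting 37 leaves its leaf underfull, so it is coalesced with a sibling. The non-leaf node above then becomes underfull too, so the non-leaf nodes are coalesced and the key 25 moves into the new root.]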

28 B-tree deletions in practice
Often, coalescing is not implemented. It is too hard and not worth it!

29 Why do we take 3 as the number of levels of a B-tree?
Suppose our blocks are 4096 bytes. Also let keys be integers of 4 bytes and let pointers be 8 bytes. If there is no header information kept on the blocks, then we want to find the largest integer value of n such that 4n + 8(n + 1) ≤ 4096. That value is n = 340, so 340 keys and 341 pointers fit in one block for our example data. Suppose that the average block has an occupancy midway between the minimum and maximum, i.e., a typical block has 255 pointers. With a root, 255 children, and 255 × 255 = 65,025 leaves, we shall have among those leaves 255³, or about 16.6 million, pointers to records. That is, files with up to 16.6 million records can be accommodated by a 3-level B-tree.
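The arithmetic can be reproduced in a few lines. This is a sketch under the slide's assumptions (4096-byte blocks, 4-byte keys, 8-byte pointers, no block header); the function name `max_keys_per_block` is hypothetical, and the "typical" occupancy is taken as the midpoint between the leaf minimum ⌊(n+1)/2⌋ and the maximum n+1, rounded down.

```python
def max_keys_per_block(block_bytes, key_bytes, ptr_bytes):
    """Largest n with key_bytes * n + ptr_bytes * (n + 1) <= block_bytes."""
    return (block_bytes - ptr_bytes) // (key_bytes + ptr_bytes)


n = max_keys_per_block(4096, 4, 8)
min_ptrs = (n + 1) // 2                 # half-full block
max_ptrs = n + 1                        # completely full block
typical_ptrs = (min_ptrs + max_ptrs) // 2

leaves = typical_ptrs ** 2              # root -> children -> leaves
record_ptrs = typical_ptrs ** 3         # pointers to records in a 3-level tree

if __name__ == "__main__":
    print(n, typical_ptrs, leaves, record_ptrs)
```

With these inputs the script gives n = 340, 255 pointers per typical block, 65,025 leaves, and 16,581,375 pointers to records.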

30 Thank you for bearing with me.

