EECS 647: Introduction to Database Systems

Instructor: Luke Huan Spring 2007

Luke Huan Univ. of Kansas
Administrative
- Homework 4 was assigned; it is due April 9th.
- The programming part of homework 4 is a team project with team size 2.
  - I need a volunteer to do the homework alone.
  - Or, I may create a three-member team.
- The book Database Management Systems has been reserved at the engineering library.
  - Additional source for discussions about index structure in database systems.
5/6/2019 Luke Huan Univ. of Kansas

B+-tree Organization
[Figure: B+-tree organization, showing internal nodes and leaf nodes]

Review B+-tree Insert
- Find the correct leaf L.
- Put the data entry onto L.
  - If L has enough space, done!
  - Else, must split L (into L and a new node L2).
    - Distribute entries evenly, copy up the middle key.
    - Insert an index entry pointing to L2 into the parent of L.
- This can happen recursively.
- Tree growth: gets wider and (sometimes) one level taller at top.
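The leaf-level step of the insert can be sketched as follows. This is a minimal illustration, not a full B+-tree: a leaf is modeled as a sorted Python list of keys, and the function name `insert_into_leaf` and the `capacity` parameter are assumptions made for the example.

```python
import bisect

def insert_into_leaf(leaf, key, capacity):
    """Insert key into a sorted leaf (a list of keys).

    Returns (left, right, copied_up_key). right and copied_up_key are
    None when the leaf had room and no split was needed.
    """
    bisect.insort(leaf, key)
    if len(leaf) <= capacity:
        return leaf, None, None
    # Overflow: distribute the entries evenly over two leaves.
    mid = len(leaf) // 2
    left, right = leaf[:mid], leaf[mid:]
    # Copy up: the separator key also stays in the new right leaf.
    return left, right, right[0]

# Inserting 8* into the full leaf [2* 3* 5* 7*] (capacity 4 assumed).
left, right, separator = insert_into_leaf([2, 3, 5, 7], 8, capacity=4)
print(left, right, separator)
```

The caller would then insert `separator`, together with a pointer to the new right leaf, into the parent, which may overflow and split in turn.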

Review B+-tree Delete
- Start at root, find leaf L where the entry belongs.
- Remove the entry.
  - If L is at least half-full, done!
  - If L has only d-1 entries,
    - Try to redistribute, borrowing from a sibling (adjacent node with the same parent as L).
    - If re-distribution fails, merge L and the sibling.
- If a merge occurred, must delete the entry (pointing to L or the sibling) from the parent of L.
- Tree shrink: gets narrower and (sometimes) one level lower at top.
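The leaf-level decision during a delete can be sketched like this. It is a simplified illustration under stated assumptions: leaves are sorted lists of keys, only the right sibling is considered, and the function name `delete_from_leaf` and its return convention are made up for the example.

```python
def delete_from_leaf(leaf, sibling, key, d):
    """Delete key from leaf; sibling is the adjacent right sibling.

    d is the minimum number of entries per leaf. Returns
    (action, leaf, sibling) where action is "done", "redistribute",
    or "merge".
    """
    leaf = [k for k in leaf if k != key]
    if len(leaf) >= d:
        return "done", leaf, sibling
    if len(sibling) > d:
        # Borrow the smallest entry of the right sibling. The parent's
        # separator must then become the sibling's new first key.
        return "redistribute", leaf + [sibling[0]], sibling[1:]
    # The sibling cannot spare an entry: merge the two leaves. The
    # parent then loses the entry that pointed to the sibling.
    return "merge", leaf + sibling, None

print(delete_from_leaf([19, 20, 22], [24, 27, 29], 19, d=2))
print(delete_from_leaf([20, 22], [24, 27, 29], 20, d=2))
print(delete_from_leaf([22, 24], [27, 29], 24, d=2))
```

The three calls correspond to an easy delete, a delete fixed by redistribution, and a delete that forces a merge.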

Example B+ Tree - Inserting 8*
Before: Root [13 17 24 30]; leaves: [2* 3* 5* 7*] [14* 16*] [19* 20* 22*] [24* 27* 29*] [33* 34* 38* 39*]
After: Root [17]; index nodes [5 13] and [24 30]; leaves: [2* 3*] [5* 7* 8*] [14* 16*] [19* 20* 22*] [24* 27* 29*] [33* 34* 38* 39*]
- Notice that the root was split, leading to an increase in height.
- In this example, we can avoid the split by re-distributing entries; however, this is usually not done in practice.

Example Tree (including 8*) Delete 19* and 20* ...
Before: Root [17]; index nodes [5 13] and [24 30]; leaves: [2* 3*] [5* 7* 8*] [14* 16*] [19* 20* 22*] [24* 27* 29*] [33* 34* 38* 39*]
After: Root [17]; index nodes [5 13] and [27 30]; leaves: [2* 3*] [5* 7* 8*] [14* 16*] [22* 24*] [27* 29*] [33* 34* 38* 39*]
- Deleting 19* is easy.
- Deleting 20* is done with re-distribution. Notice how the middle key is copied up.

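... And Then Deleting 24*
- Must merge.
- Observe the 'toss' of an index entry (key 27, on right), and the 'pull down' of an index entry (below).
After merging the leaves (right subtree): index node [30]; leaves: [22* 27* 29*] [33* 34* 38* 39*]
After pulling down the index entry: Root [5 13 17 30]; leaves: [2* 3*] [5* 7* 8*] [14* 16*] [22* 27* 29*] [33* 34* 38* 39*]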

Today's Topic
- Recap of B+-tree
- File and Index organization
- External sorting

Performance analysis
- How many I/O's are required for each operation?
  - h, the height of the tree (more or less)
  - Plus one or two to manipulate actual records
  - Plus O(h) for reorganization (should be very rare if f is large)
  - Minus one if we cache the root in memory
- How big is h?
  - Roughly log_fan-out(N), where N is the number of records
  - B+-tree properties guarantee that fan-out is at least f/2 for all non-root nodes
  - Fan-out is typically large (in the hundreds): many keys and pointers can fit into one block
  - A 4-level B+-tree is enough for typical tables
    - Sounds suspiciously similar to the quote famously attributed to Bill Gates: 640K of memory ought to be enough
    - But you can work out the calculation: a fan-out of 200 gives 200^4 = 1.6 billion records
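The capacity claim can be checked with a few lines of arithmetic. The sketch below assumes every node has exactly the given fan-out and that each leaf-level slot addresses one record; the helper names `max_records` and `levels_needed` are made up for the example.

```python
def max_records(fan_out, levels):
    """Records reachable through `levels` levels of the given fan-out."""
    return fan_out ** levels

def levels_needed(num_records, fan_out):
    """Smallest number of levels whose capacity covers num_records."""
    levels = 1
    while fan_out ** levels < num_records:
        levels += 1
    return levels

print(max_records(200, 4))                 # 4 levels at fan-out 200
print(levels_needed(1_600_000_000, 200))   # typical fan-out
print(levels_needed(1_600_000_000, 100))   # nodes only half full (f/2)
```

Even when every node is only half full, the same table needs just one extra level, which is why h stays small in practice.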

B+-tree in practice
- Complex reorganization for deletion is often not implemented (e.g., Oracle, Informix)
  - Leave nodes less than half full and periodically reorganize
  - There is actually a deeper reason for this: "now it's empty, now it's full"
    - Suppose two leaves are exactly at the 1/2 threshold
    - Delete -> merge into a full node
    - Insert -> split into two half-full nodes
    - Repeat!
- Most commercial DBMS use B+-tree instead of hashing-based indexes because B+-tree handles range queries

The Halloween Problem
- Story from the early days of System R…
  UPDATE Payroll SET salary = salary * 1.1 WHERE salary >= ;
  - There is a B+-tree index on Payroll(salary)
  - The update never stopped (why?)
- Solutions?
  - Scan index in reverse
    - Scanning in reverse doesn't work if you are decreasing salary
    - If I am really malicious, I could use salary * rand()
  - Before update, scan index to create a complete "to-do" list
  - During update, maintain a "done" list
  - Tag every row with transaction/statement id
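A small simulation may help show why the update never stops and why the "to-do" list fixes it. This is only a sketch: a sorted Python list stands in for the leaf level of the index on salary, `THRESHOLD` is a hypothetical cutoff (the value in the WHERE clause is missing from the slide), and `max_steps` is an artificial cap so the demonstration terminates.

```python
import bisect

THRESHOLD = 100   # hypothetical cutoff; the slide's value is missing
RAISE = 1.1

def naive_index_scan_update(salaries, max_steps=1000):
    """Update entries while scanning the salary index in ascending order.

    Each raised salary is re-inserted at or after the cursor, so the
    scan meets it again. Returns the number of updates performed before
    the step cap stops the otherwise endless loop.
    """
    index = sorted(salaries)  # stands in for the B+-tree leaf level
    pos = bisect.bisect_left(index, THRESHOLD)
    updates = 0
    while pos < len(index) and updates < max_steps:
        old = index.pop(pos)
        bisect.insort(index, old * RAISE)  # lands at or after pos
        updates += 1
        # pos is not advanced: whatever now sits at pos is visited next.
    return updates

def todo_list_update(salaries):
    """Collect the qualifying rows first, then update each exactly once."""
    todo = [i for i, s in enumerate(salaries) if s >= THRESHOLD]
    for i in todo:
        salaries[i] *= RAISE
    return len(todo)

print(naive_index_scan_update([50, 120, 200]))  # hits the step cap
rows = [50, 120, 200]
print(todo_list_update(rows), rows)             # two rows, each raised once
```

The naive scan keeps finding rows it has already raised, while the to-do list fixes the set of rows before any update changes the index.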

B+-tree versus B-tree
- B-tree: why not store records (or record pointers) in non-leaf nodes?
  - These records can be accessed with fewer I/O's
- Problems?
  - Storing more data in a node decreases fan-out and increases h
  - Records in leaves require more I/O's to access
  - Vast majority of the records live in leaves!

Beyond ISAM, B-, and B+-trees
- Other tree-based indexes: R-trees and variants, GiST, etc.
- Hashing-based indexes: extensible hashing, linear hashing, etc.
- Text indexes: inverted-list index, suffix arrays, etc.
- Other tricks: bitmap index, bit-sliced index, etc.
- How about indexing subgraph search?

Index and Data File Organization
Heap file vs. B+-tree index file
[Figure: heap file organization, with a header page linked to data pages with free space and to full pages]

Alternatives for Data Entry in Index
- Three alternatives:
  - Primary indexes
  - Secondary indexes
  - Clustering indexes
- Can have multiple (different) indexes per file.
  - E.g. Employee = (EID, name, age, salary)
  - Primary key is EID, name is unique, age and salary are non-key attributes
  - We may (or may not) sort the file by EID, with a B+-tree index on EID (primary), name (secondary) and age (secondary), and a hash index on salary.

Primary indexes
- Index for primary key: <k, bid of sorted data records>
- File is sorted according to the primary key.
- The first record in each block is called the anchor record and is used to build the index.
- Sometimes it goes the other way around (called B+-tree based sorting).
[Figure: CLUSTERED index over a data file with records such as (123, Susan, 30, 50K) and (124, John, 40, 80K)]

Secondary Index
Alternative 2: <k, rid of matching data record>
- Easier to maintain (do not need to sort the data file).
- k must be a candidate key of the relation, e.g. a unique attribute.
- Data pointer is the RID of the record indexed by k, e.g. (Block#, slot#).
[Figure: UNCLUSTERED index; leaf-node entries point into a data file with records such as (123, Susan, 30, 50K) and (124, John, 40, 80K)]

Clustering Index
Alternative 3: <k, list of rids of matching data records>
- Also need to sort the data file.
- k must be a non-key attribute of the relation.
- Data pointer is the BID of the first RID that is indexed by k.
[Figure: CLUSTERED index; leaf node with data pointers (Block 2, slot 1), (Block 1, slot 1), (Block 1, slot 2), ...]

Index Classification
- Clustered vs. unclustered: if the order of data records is the same as, or 'close to', the order of index data entries, then it is called a clustered index.
  - A file can be clustered on at most one search key.
  - Cost of retrieving data records through an index varies greatly based on whether the index is clustered or not!

Clustered vs. Unclustered Index
- Suppose that a clustering index is used for data entries, and that the data records are stored in a Heap file.
  - To build a clustering index, first sort the Heap file (with some free space on each block for future inserts).
  - Overflow blocks may be needed for inserts. (Thus, the order of data records is 'close to', but not identical to, the sort order.)
[Figure: index entries direct the search for data entries (index file), which point to data records (data file); shown for the CLUSTERED and UNCLUSTERED cases]

Comparison
Index        Dense (D) / Sparse (S)   Clustered (C) or not (N)   Sorting required
Primary      S                        C                          Yes
Secondary    D                        N                          No
Clustering

Cost of Operations
- B: The number of data pages
- R: Number of records per page
- D: (Average) time to read or write a disk page
Operation          Heap File     Sorted File               Clustered File
Scan all records   BD            BD                        1.5 BD
Equality Search    0.5 BD        (log2 B) * D              (logF 1.5B) * D
Range Search       BD            search + #match pg * D    search + #match pg * D
Insert             2D            search + BD               search + D
Delete             search + D    search + BD               search + D
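The cost formulas can be turned into a small calculator for comparing the three organizations. This is a rough sketch under stated assumptions: F is taken to be the index fan-out (the slide uses it without defining it), "search" means the equality-search cost of the same file organization, and the sorted-file scan cost of BD is assumed from reading all B pages once. The function name `operation_costs` and its parameters are made up for the example.

```python
import math

def operation_costs(B, D, F, match_pages):
    """Approximate I/O time per operation for three file organizations.

    B: number of data pages, D: time per page I/O,
    F: index fan-out (assumed meaning of the F in log_F),
    match_pages: pages holding the matches of a range search.
    """
    heap_search = 0.5 * B * D
    sorted_search = math.log2(B) * D
    clustered_search = math.log(1.5 * B, F) * D
    return {
        "heap": {
            "scan": B * D,
            "equality": heap_search,
            "insert": 2 * D,
            "delete": heap_search + D,
        },
        "sorted": {
            "scan": B * D,  # assumed: read every page once
            "equality": sorted_search,
            "range": sorted_search + match_pages * D,
            "insert": sorted_search + B * D,
        },
        "clustered": {
            "scan": 1.5 * B * D,
            "equality": clustered_search,
            "range": clustered_search + match_pages * D,
            "insert": clustered_search + D,
        },
    }

costs = operation_costs(B=1024, D=1, F=100, match_pages=10)
for organization, ops in costs.items():
    print(organization, ops)
```

With D set to 1 the numbers read directly as page I/Os, which makes the gap between a heap-file equality search and an index-based one easy to see.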

Why Sort?
- A classic problem in computer science!
- Data requested in sorted order
  - e.g., find students in increasing gpa order
- Sorting is the first step in bulk loading a B+ tree index.
- Sorting is useful for eliminating duplicate copies in a collection of records (Why?)
- Sorting is useful for summarizing related groups of tuples.
- Sort-merge join algorithm involves sorting.
- Problem: sort 100 GB of data with 1 GB of RAM.
  - Why not virtual memory?

2-Way Sort: Requires 3 Buffers
- Pass 0: Read a page, sort it, write it.
  - Only one buffer page is used (as in previous slide)
- Pass 1, 2, 3, …, etc.: requires 3 buffer pages
  - Merge pairs of runs into runs twice as long
  - Three buffer pages used.
[Figure: runs are read from disk into two input buffers (INPUT 1, INPUT 2) in main memory, merged into one OUTPUT buffer, and written back to disk]

Two-Way External Merge Sort
- Each pass we read + write each page in file.
- N pages in the file => the number of passes = ceil(log2 N) + 1
- So total cost is: 2N * (ceil(log2 N) + 1)
- Idea: Divide and conquer: sort subfiles and merge
Example (7-page file, two keys per page):
  Input file:           3,4 | 6,2 | 9,4 | 8,7 | 5,6 | 3,1 | 2
  PASS 0 (1-page runs): 3,4 | 2,6 | 4,9 | 7,8 | 5,6 | 1,3 | 2
  PASS 1 (2-page runs): 2,3 4,6 | 4,7 8,9 | 1,3 5,6 | 2
  PASS 2 (4-page runs): 2,3 4,4 6,7 8,9 | 1,2 3,5 6
  PASS 3 (8-page runs): 1,2 2,3 3,4 4,5 6,6 7,8 9
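The pass structure can be reproduced with a short in-memory simulation. This is a sketch only: pages are Python lists instead of disk blocks, the function name `two_way_external_sort` is made up, and the pass count is compared against the standard two-way formula ceil(log2 N) + 1.

```python
import heapq
import math

def two_way_external_sort(pages):
    """Simulate two-way external merge sort on a list of pages.

    Each page is a list of keys. Returns (sorted_pages, passes).
    """
    page_size = max(len(p) for p in pages)
    # Pass 0: sort each page on its own, producing 1-page runs.
    runs = [[sorted(p)] for p in pages]
    passes = 1
    while len(runs) > 1:
        merged_runs = []
        for i in range(0, len(runs), 2):
            pair = runs[i:i + 2]  # the last run may have no partner
            streams = [[k for page in run for k in page] for run in pair]
            keys = list(heapq.merge(*streams))
            # Cut the merged stream back into pages.
            merged_runs.append(
                [keys[j:j + page_size] for j in range(0, len(keys), page_size)]
            )
        runs = merged_runs
        passes += 1
    return runs[0], passes

pages = [[3, 4], [6, 2], [9, 4], [8, 7], [5, 6], [3, 1], [2]]
result, passes = two_way_external_sort(pages)
n = len(pages)
print(result)
print(passes, math.ceil(math.log2(n)) + 1)       # simulated vs. formula
print(2 * n * (math.ceil(math.log2(n)) + 1))     # total page I/Os
```

For the 7-page input the simulation takes 4 passes, matching the formula, and the total cost works out to 2 * 7 * 4 = 56 page I/Os.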

Using B+ Trees for Sorting
- Scenario: Table to be sorted has a B+ tree index on the sorting column(s).
- Idea: Can retrieve records in order by traversing leaf pages.
- Is this a good idea?
- Cases to consider:
  - B+ tree is clustered: Good idea!
  - B+ tree is not clustered: Could be a very bad idea!

Clustered B+ Tree Used for Sorting
- Cost: root to the left-most leaf, then retrieve all leaf pages (primary index)
- If a clustering index is used?
  - Additional cost of retrieving data records: each page fetched just once.
- Always better than external sorting!
[Figure: index (directs search), data entries ("sequence set"), and data records stored in the same order as the data entries]

Unclustered B+ Tree Used for Sorting
- Unclustered index for data entries; each data entry contains the rid of a data record.
- In general, one I/O per data record!
[Figure: index (directs search), data entries ("sequence set"), and data records stored in an order unrelated to the data entries]
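A back-of-the-envelope comparison shows how large the gap can be. The sketch below counts only data-page I/Os and ignores the cost of reading the index leaf pages; the function names are made up, the page and record counts are illustrative values rather than measurements, and the external-sort figure uses the standard two-way cost 2N * (ceil(log2 N) + 1).

```python
import math

def index_sort_io(num_data_pages, records_per_page, clustered):
    """Data-page I/Os to read all records in index order."""
    if clustered:
        # Each data page is fetched just once.
        return num_data_pages
    # In general, one I/O per data record.
    return num_data_pages * records_per_page

def two_way_sort_io(num_data_pages):
    """Page I/Os of two-way external merge sort on the same file."""
    passes = math.ceil(math.log2(num_data_pages)) + 1
    return 2 * num_data_pages * passes

pages, per_page = 1000, 100  # illustrative sizes
print(index_sort_io(pages, per_page, clustered=True))
print(two_way_sort_io(pages))
print(index_sort_io(pages, per_page, clustered=False))
```

With these sizes the clustered index is far cheaper than sorting, while the unclustered index is several times more expensive than even the basic two-way sort.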

Summary
- Index can be used to organize files (clustered file).
- External sorting is an important tool for many applications, including building indexes and answering queries (next time).
- External merge sort minimizes disk I/O cost:
  - Pass 0: Produces sorted runs of size B (# buffer pages).
  - Later passes: merge runs.
  - # of runs merged at a time depends on B and block size.
  - Larger block size means less I/O cost per page.
  - Larger block size means smaller # of runs merged.
  - In practice, # of passes rarely more than 2 or 3.
