Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Overview of Storage and Indexing Yanlei Diao UMass Amherst Feb 13, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke.

Similar presentations

Presentation on theme: "1 Overview of Storage and Indexing Yanlei Diao UMass Amherst Feb 13, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke."— Presentation transcript:

1 1 Overview of Storage and Indexing Yanlei Diao UMass Amherst Feb 13, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke

2 2 DBMS Architecture Disk Space Manager DB Access Methods Buffer Manager Query Parser Query Rewriter Query Optimizer Query Executor Lock Manager Log Manager

3 3 Data on External Storage  Disks: Can retrieve random pages at (almost) fixed cost  But reading several consecutive pages is much faster than reading them in random order.  Tapes: Can only read pages in sequence  Slower but cheaper than disks; used for archival storage  Page: Unit of information read from or written to disk  Size of page: DBMS parameter, 4KB, 8KB  Disk space manager:  Abstraction: a collection of pages.  Allocate/de-allocate a page.  Read/write a page.  Page I/O:  Pages read from disk and pages written to disk  Dominant cost of database operations

4 4 Access Methods  Access methods: routines to manage various disk-based data structures.  Files of records  Various kinds of indexes  File of records:  Important abstraction of external storage in a DBMS!  Record id (rid) is sufficient to physically locate a record  Indexes:  Auxiliary data structures  Given a value in the index search key field(s), find the record ids of records with this value

5 5 Buffer Management  Architecture:  Data is read into memory for processing  Data is written to disk for persistent storage  Buffer manager stages pages between external storage and main memory buffer pool.  Access method layer makes calls to the buffer manager.

6 6 Access Methods Many alternatives exist, each ideal for some situations, and not so good for others:  Heap (unorder) files: Suitable when typical access is a file scan retrieving all records.  Sorted Files: Best if records are retrieved in order of the sorting attribute(s), or only a `range’ of records is needed.  Indexes: Data structures to organize records via trees or hashing. Like sorted files, they speed up searches for a subset of records, based on values in certain (“search key”) fields Updates are much faster than in sorted files.

7 7 Indexes  An index on a file speeds up selections on the search key field(s) for the index.  Any subset of the fields of a relation can be the search key for an index on the relation.  Search key is not the same as key (minimal set of fields that uniquely identify a record in a relation).  An index contains a collection of data entries, and supports efficient retrieval of all data entries k* with a given key value k.  Data entry (in an index) versus data record (in a file).  Given a data entry k*, we can find record(s) with the key k in at most one disk I/O. (Details soon …)

8 8 B+ Tree Indexes  Leaf pages contain data entries : Leaf pages are chained using prev & next pointers Data entries are sorted by the search key value Non-leaf Pages (Sorted by search key) Leaf

9 9 B+ Tree Indexes  Non-leaf pages have index entries; only used to direct searches P 0 K 1 P 1 K 2 P 2 K m P m index entry Non-leaf Pages (Sorted by search key) Leaf

10 10 Example B+ Tree  Equality selection: find 28*? 29*?  Range selection: find all > 15* and < 30*  Insert/delete: Find the data entry in a leaf, then change it.  Need to adjust the parent sometimes.  And the change sometimes bubbles up the tree. 2*3* Root 17 30 14*16* 33*34* 38* 39* 135 7*5*8*22*24* 27 27*29* Entries <= 17Entries > 17 Data entries in leaf nodes are sorted.

11 11 Hash-Based Indexes  Good for equality selections.  A hash index is a collection of buckets.  Bucket = primary page plus zero or more overflow pages.  Buckets contain data entries. h 1 2 3 … … … … … … N-1 Search key value k  Hashing function h : h ( k ) = bucket of data entries that may contain the search key value k.  No need for “index entries” in this scheme.

12 12 Alternatives for Data Entry k* in Index  In a data entry k* we can store:  Alt1: Data record with key value k  Alt2:  Alt3:  Choice of an alternative for data entries is orthogonal to an indexing technique used.  Indexing techniques: B+ tree, hashing  Every indexing technique can be combined with any alternative.

13 13 Alternative 1 for Data Entries  Alternative 1:  Index structure itself is a file organization for data records (instead of a Heap or sorted file).  At most one index on a given collection of data records can use Alternative 1. Otherwise, data records are duplicated, leading to redundant storage and potential inconsistency.  If data records are very large, # of pages containing data entries is high. Size of the index structure to direct searches (e.g., non-leaf nodes in B+ tree) can also be large.

14 14 Alternatives 2, 3 for Data Entries  Alternatives 2 and 3:  Data entries,, typically much smaller than data records. Index structure used to direct search is much smaller than with Alternative 1. So, better than Alternative 1 with large data records, especially if search keys are small.  Alternative 3 more compact than Alternative 2 In the presence of multiple K* entries!  Alternative 3 leads to variable sized data entries, even if search keys are of fixed length. Variable sizes records/data entries are hard to manage.

15 15 Index Classification  Primary index vs. secondary index:  If search key contains the primary key, then called primary index.  Other indexes are called secondary indexes.  Unique index: Search key contains a candidate key.  No data entries can have the same search key value.  Recall: Key? Primary key? Candidate key?

16 16 Index Classification (Contd.)  Clustered index: order of data records is the same as, or `close to’, order of (sorted) data entries.  Alternative 1 implies clustered  Alternatives 2 and 3 are clustered only if data records are sorted on the search key field.  A data file can be clustered on at most one search key.  Retrieving data records: fast, with one or a few I/Os. Why?  Unclustered index:  Multiple unclustered indexes on a data file.  Retrieving data records: can be slow, toughing many pages.

17 17 Clustered vs. Unclustered Indexes  Suppose that Alternative (2) is used for data entries, and data records are stored in a Heap file.  To build a clustered index, first sort the Heap file (with some free space on each page for future inserts).  Overflow pages may be needed for inserts. (Thus, order of data records is `close to’, but not identical to, the sort order.)

18 18 Index Classification (Contd.)  Dense index : contains (at least) one data entry for every search key value that appears in a record in the indexed file  Alternatives 1, 2, 3  Sparse index : contains one entry for each page of records in the indexed file.  Sparse index has to be clustered!  Alternative 1?  Alternative 2?  Alternative 3?

19 19 Cost Model for Our Analysis We ignore CPU costs, for simplicity:  B: The number of data pages  R: Number of records per page  D: (Average) time to read or write disk page  I/O cost: Measuring # of page I/O’s ignores pre-fetching of a sequence of pages; thus, approximate I/O cost.  Average-case analysis, based on simplistic assumptions. * Good enough to show the overall trends!

20 20 Comparing File Organizations  Heap files (random order; insert at eof)  Sorted files, sorted on  Clustered B+ tree file, Alternative (1), search key  Unclustered B + tree index on a heap file, search key  Unclustered hash index on a heap file, search key

21 21 Operations to Compare  Scan: Fetch all records from disk  Equality search: e.g. (age, sal) = (24, 50K)  Range selection: e.g. (age, sal)  [(22, 20K), (30, 200K)]  Insert a record  Delete a record

22 22 Assumptions in Our Analysis  Heap Files:  Equality selection on key; exactly one match.  Sorted Files:  Files compacted after deletions.  Indexes:  Alt (2), (3): size of data entry = 10% size of data record  Hash: No overflow chains. 80% page occupancy => File size = 1.25 data size  B+Tree: 67% occupancy (typical): implies file size = 1.5 data size Balanced with fanout F (133 typical) at each non-leaf level

23 23 Assumptions (contd.)  Scans:  Leaf nodes of a tree-index are chained.  Index data entries plus actual file scanned for unclustered indexes.  Range searches:  We use tree indexes to restrict the set of data records fetched, but ignore hash indexes.

24 24 Cost of Operations * Several assumptions underlie these (rough) estimates!

25 25 Cost of Operations * Several assumptions underlie these (rough) estimates! # of leaf pages: B*0.1/67%=0.15B; Cost of index I/O: 0.15BD # of data entries on a leaf page: R*10*0.67=6.7R; Cost of data I/O: 0.15B*6.7R*D=BDR Key: unclustered B+tree, one I/O per data entry! # of data entry pages: B*0.1/80%=0.125B; Cost of index I/O: 0.125BD # of data entries on a hash page: R*10*0.80=8R; Cost of data I/O: 0.125B*8R*D=BDR Key: unclustered hash index, one I/O per data entry!

26 26 Summary  Many alternative access methods exist, each appropriate in some situation.  Index is a collection of data entries plus a way to quickly find entries with given key values.  If selection queries are frequent, building an index is important.  Hash-based indexes only good for equality search.  Tree-based indexes best for range search; also good for equality search.

27 27 Summary (Contd.)  Data entries can be actual data records, pairs, or pairs.  Choice orthogonal to indexing technique used to locate data entries with a given key value.  Can have several indexes on a given file of data records, each with a different search key.  Indexes can be classified as clustered vs. unclustered, primary vs. secondary, etc. Differences have important consequences for utility/performance.

28 28 Questions

Download ppt "1 Overview of Storage and Indexing Yanlei Diao UMass Amherst Feb 13, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke."

Similar presentations

Ads by Google