Making B+-Trees Cache Conscious in Main Memory

Presentation transcript:

Making B+-Trees Cache Conscious in Main Memory
Authors: Jun Rao, Kenneth A. Ross
Members: Iris Zhang, Grace Yung, Kara Kwon, Jessica Wong

Outline
1. Introduction
2. Related Work
3. Cache Sensitive B+-Trees
4. Conclusion

Motivation
1. A significant portion of execution time is spent on second-level data cache misses and first-level instruction cache misses.
(Figure: System Hierarchy)

Motivation (Cont’d)
2. CPU speeds have been increasing at a much faster rate than memory speeds.
Conclusion: improving cache behavior is becoming imperative for main-memory data processing.
Resolution: use cache-conscious index structures in main memory.

Cache Memories
Cache memories are small, fast static RAM memories that improve performance by holding recently referenced data.
Parameters: capacity, block size (cache line), associativity.
Memory reference outcomes: hit, miss.

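Cache Optimization on Index Structures: B+-Trees
- Height-balanced tree.
- Minimum 50% occupancy (except for the root): each node contains m entries, where d <= m <= 2d. The parameter d is called the order of the tree, so a node holds at most n = 2d entries.
- Each node occupies one cache line (cache-line based).
- Full pointers: every node stores a pointer to each of its children.
(Figure: full-pointer B+-Tree, n = 2)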

Cache Optimization on Index Structures: CSS-Trees
- Similar to a B+-Tree, but with child pointers eliminated.
- Nodes are stored in a fixed-size array, numbered and laid out level by level, left to right.
- The position of a child node is calculated arithmetically from the position of its parent.
(Figure: CSS-Tree with no pointers)
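The arithmetic child lookup can be illustrated with a short sketch. It assumes the tree is a complete (m + 1)-ary tree whose nodes are numbered from 0 in level order, so the children of node b are nodes b*(m+1) + 1 through b*(m+1) + (m+1). The function names `css_children` and `css_child` are hypothetical and not taken from the slides.

```python
def css_children(node: int, keys_per_node: int) -> range:
    """Array indices of all children of `node` in a level-order layout.

    A node with m keys has m + 1 children, so the structure is a complete
    (m + 1)-ary tree and no child pointers need to be stored.
    """
    fanout = keys_per_node + 1
    first = node * fanout + 1
    return range(first, first + fanout)


def css_child(node: int, branch: int, keys_per_node: int) -> int:
    """Array index of the `branch`-th child (0-based) of `node`."""
    return node * (keys_per_node + 1) + 1 + branch


# With 3 keys per node (12-byte cache line, 4-byte keys) the fan-out is 4.
print(list(css_children(0, 3)))   # children of the root
print(css_child(1, 2, 3))         # third child of node 1
```

Because the index is computed rather than stored, the whole cache line is available for keys, which is what makes the CSS-Tree compact but also hard to update in place.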

Comparison between B+-Trees and CSS-Trees
Setup: cache line size = 12 bytes, key size = pointer size = 4 bytes, search key = 3.
(Figure: B+-Tree and CSS-Tree)

Comparison between B+-Trees and CSS-Trees (cont’d)
B+-Tree:
- full pointers
- more cache accesses and more cache misses
- efficient for update operations such as insertion and deletion
CSS-Tree:
- no pointers
- fewer cache accesses and fewer cache misses
- acceptable for static data that is updated in batches
Pointer elimination is important for cache optimization, but removing pointers completely introduces restrictions on updates.
Conclusion: use partial pointer elimination.

Cache Sensitive B+-Trees
- Cache Sensitive B+-Trees with One Child Pointer
- Segmented CSB+-Trees
- Full CSB+-Trees

Cache Sensitive B+-Trees with One Pointer
- Similar to a B+-Tree.
- All the child nodes of any given node are put into a node group, and the parent keeps a single pointer to that group.
- Nodes within a node group are stored contiguously and can be accessed using an offset from the first node in the group.

Cache Sensitive B+-Trees with One Pointer (cont’d)
- Cache misses are reduced because a cache line can hold more keys than a B+-Tree node of the same size, so one line can resolve roughly one more level of comparisons.
- A CSB+-Tree can support incremental updates in a way similar to a B+-Tree.
Example: cache line size = 64 bytes, key size = pointer size = 4 bytes.
- B+-Tree: 7 keys per node
- CSB+-Tree: 14 keys per node
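The 7 versus 14 figures can be reproduced with simple arithmetic. The sketch below assumes that each node also reserves a 4-byte key counter; the slides do not state this, so `COUNT_FIELD` is an assumption chosen because it makes the numbers come out exactly. The function names are hypothetical.

```python
CACHE_LINE = 64    # bytes per node (one cache line), from the slide
KEY = 4            # bytes per key, from the slide
POINTER = 4        # bytes per pointer, from the slide
COUNT_FIELD = 4    # ASSUMPTION: a per-node key counter, not stated on the slide


def bplus_keys_per_node(line: int = CACHE_LINE) -> int:
    """Full-pointer B+-Tree node: k keys need k + 1 child pointers."""
    # COUNT_FIELD + k*KEY + (k + 1)*POINTER <= line
    return (line - COUNT_FIELD - POINTER) // (KEY + POINTER)


def csb_keys_per_node(line: int = CACHE_LINE) -> int:
    """CSB+-Tree node: k keys plus a single first-child pointer."""
    # COUNT_FIELD + POINTER + k*KEY <= line
    return (line - COUNT_FIELD - POINTER) // KEY


print(bplus_keys_per_node(), csb_keys_per_node())
```

Under these assumptions the space freed by dropping all but one child pointer is what doubles the number of keys per cache line.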

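Operations on CSB+-Tree: Bulkload
(Figure: bulkloaded CSB+-Tree, listed level by level; "|" separates the key slots within a node)
Root: 22|
Level 2: 7| 30|
Level 3: 3| 13|19 25| 33|
Leaves: 2|3 5|7 12|13 16|19 20|22 24|25 27|30 31|33 36|39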

Operations on CSB+-Tree: Insertion
1. Search for the leaf node n into which the new entry should be inserted.
2. If n is not full, insert the new entry in the appropriate place.
3. Otherwise, split n. Let p be n's parent node, f the first-child pointer in p, and g the node group pointed to by f.
   a. If p is not full, copy g into a new node group g' in which n is split into two nodes, and let f point to g'.
   b. If p is full, copy half of g into a new node group g' for the new node that results from splitting p, and let that node's first-child pointer point to g'. Then split p within its own node group according to step a.
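Step 3a can be sketched as follows. This is a minimal illustration, not the authors' implementation: leaves are plain sorted lists of keys, a node group is a list of leaves, and each parent key is taken to be the largest key of the child to its left, which matches the example tree in these slides. The function name `insert_into_group` and its parameters are hypothetical, and the parent-full case (step 3b) is deliberately left out.

```python
from typing import List, Tuple

Leaf = List[int]


def insert_into_group(group: List[Leaf], parent_keys: List[int],
                      index: int, new_key: int,
                      max_keys: int) -> Tuple[List[Leaf], List[int]]:
    """Insert new_key into leaf group[index]; return (new group, new parent keys).

    The input group is never modified: a new group is built, mirroring the
    "copy g to g'" step, and the caller is expected to repoint the parent's
    first-child pointer to the returned group.
    """
    keys = sorted(group[index] + [new_key])
    if len(keys) <= max_keys:
        # Step 2: the leaf has room, so nothing splits.
        return group[:index] + [keys] + group[index + 1:], list(parent_keys)
    if len(parent_keys) >= max_keys:
        # Step 3b is outside the scope of this sketch.
        raise NotImplementedError("parent is full: the node group must be split")
    # Step 3a: the new group has one more node than the old one.
    mid = (len(keys) + 1) // 2
    left, right = keys[:mid], keys[mid:]
    new_group = group[:index] + [left, right] + group[index + 1:]
    # The largest key of the left half becomes the new separator in the parent.
    new_parent = parent_keys[:index] + [left[-1]] + parent_keys[index:]
    return new_group, new_parent


# Inserting 34 into the rightmost node group of the example tree (order 1).
print(insert_into_group([[31, 33], [36, 39]], [33], 1, 34, max_keys=2))
```

Copying the whole group on every split is what makes CSB+-Tree insertion more expensive than B+-Tree insertion, and it is the cost that the segmented and full variants try to reduce.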

Operations on CSB+-Tree: Insertion (cont’d)
(Figure: a CSB+-Tree of order 1 before inserting key = 34, listed level by level)
Root: 22|
Level 2: 7| 30|
Level 3: 3| 13|19 25| 33|
Leaves: 2|3 5|7 12|13 16|19 20|22 24|25 27|30 31|33 36|39

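Operations on CSB+-Tree: Insertion (cont’d)
(Figure: the same tree after inserting key = 34; leaf 36|39 has been split into 34|36 and 39|, and key 36 has been added to the parent)
Root: 22|
Level 2: 7| 30|
Level 3: 3| 13|19 25| 33|36
Leaves: 2|3 5|7 12|13 16|19 20|22 24|25 27|30 31|33 34|36 39|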

Operations on CSB+-Tree: Search
1. Determine the rightmost key K in the node that is smaller than the search key.
2. Get the address of the child node by adding the offset of K to the first-child pointer.
3. Repeat from step 1 until the search key is found or there are no more nodes to check.
Search methods within a node:
- basic approach
- uniform approach
- variable approach
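The search loop can be sketched on the example tree from the bulkload slide. This is an illustration under assumptions: a node group is modeled as a Python list (standing in for contiguous memory), the "offset" is the number of keys smaller than the search key, and `bisect_left` stands in for whichever in-node search method is used. The names `Node`, `leaf`, `search`, and `ROOT` are hypothetical.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    keys: List[int]
    # The single "first-child pointer": the whole child node group.
    # None marks a leaf node.
    group: Optional[List["Node"]] = None


def leaf(*keys: int) -> Node:
    return Node(list(keys))


def search(root: Node, key: int) -> bool:
    """Return True if key is stored in a leaf of the tree."""
    node = root
    while node.group is not None:
        # Number of keys smaller than the search key = offset into the group.
        offset = bisect_left(node.keys, key)
        node = node.group[offset]
    i = bisect_left(node.keys, key)
    return i < len(node.keys) and node.keys[i] == key


# The order-1 example tree from the bulkload slide.
ROOT = Node([22], [
    Node([7], [
        Node([3], [leaf(2, 3), leaf(5, 7)]),
        Node([13, 19], [leaf(12, 13), leaf(16, 19), leaf(20, 22)]),
    ]),
    Node([30], [
        Node([25], [leaf(24, 25), leaf(27, 30)]),
        Node([33], [leaf(31, 33), leaf(36, 39)]),
    ]),
])

print(search(ROOT, 19), search(ROOT, 34))
```

In a real implementation the list index would become pointer arithmetic on the first-child address, so each step touches one cache line for the node and none for a pointer array.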

Segmented Cache Sensitive B+-Trees
Problem: splitting a node group is time-consuming.
Resolution: the SCSB+-Tree.
Method: divide each node group into two segments, with one child pointer per segment.
Result: better split performance, but worse search performance.

Full CSB+-Tree
Motivation: reduce the split cost.
Method: pre-allocate space for a full node group, and shift part of the node group along by one node when a node splits.
Result: lower split cost, but higher space consumption.

Conclusion
- CSB+-Trees are more cache conscious than B+-Trees because of partial pointer elimination.
- CSB+-Trees support efficient incremental updates, but CSS-Trees do not.
- Partial pointer elimination is a general technique that can be applied to other in-memory structures.