ITIS 5160 Indexing. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of.

Slides:



Advertisements
Similar presentations
Tuning: overview Rewrite SQL (Leccotech)Leccotech Create Index Redefine Main memory structures (SGA in Oracle) Change the Block Size Materialized Views,
Advertisements

Bitmap Index Design and Evaluation Ariel Noy Data representation and retrieval seminar By: Chee-Yong Chan Yannis E.Ioannidis.
CSE544 Database Statistics Tuesday, February 15 th, 2011 Dan Suciu , Winter
Multidimensional Data. Many applications of databases are "geographic" = 2­dimensional data. Others involve large numbers of dimensions. Example: data.
Multidimensional Data Rtrees Bitmap indexes. R-Trees For “regions” (typically rectangles) but can represent points. Supports NN, “where­am­I” queries.
A Ternary Unification Framework for Optimizing TCAM-Based Packet Classification Systems Author: Eric Norige, Alex X. Liu, and Eric Torng Publisher: ANCS.
Advanced Databases: Lecture 2 Query Optimization (I) 1 Query Optimization (introduction to query processing) Advanced Databases By Dr. Akhtar Ali.
BTrees & Bitmap Indexes
Multiple-key indexes Index on one attribute provides pointer to an index on the other. If V is a value of the first attribute, then the index we reach.
CS263 Lecture 19 Query Optimisation.  Motivation for Query Optimisation  Phases of Query Processing  Query Trees  RA Transformation Rules  Heuristic.
Chapter 8 File organization and Indices.
Quick Review of Apr 17 material Multiple-Key Access –There are good and bad ways to run queries on multiple single keys Indices on Multiple Attributes.
1 Advanced Database Technology Anna Östlin Pagh and Rasmus Pagh IT University of Copenhagen Spring 2004 February 19, 2004 INDEXING I Lecture based on [GUW,
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Part A Part A:  Index Definition in SQL  Ordered Indices  Index Sequential.
ACS-4902 Ron McFadyen Chapter 15 Algorithms for Query Processing and Optimization.
Quick Review of Apr 15 material Overflow –definition, why it happens –solutions: chaining, double hashing Hash file performance –loading factor –search.
1 Query Optimization In Compressed Database Systems Zhiyuan Chen and Johannes Gehrke Cornell University Flip Korn AT&T Labs.
Advanced Querying OLAP Part 2. Context OLAP systems for supporting decision making. Components: –Dimensions with hierarchies, –Measures, –Aggregation.
Query Optimization 3 Cost Estimation R&G, Chapters 12, 13, 14 Lecture 15.
1 Indexing Structures for Files. 2 Basic Concepts  Indexing mechanisms used to speed up access to desired data without having to scan entire.
CSCI 5708: Query Processing I Pusheng Zhang University of Minnesota Feb 3, 2004.
1 Database Tuning Rasmus Pagh and S. Srinivasa Rao IT University of Copenhagen Spring 2007 February 8, 2007 Tree Indexes Lecture based on [RG, Chapter.
Chapter 8 Physical Database Design. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Overview of Physical Database.
1 CS143: Index. 2 Topics to Learn Important concepts –Dense index vs. sparse index –Primary index vs. secondary index (= clustering index vs. non-clustering.
1 CS 728 Advanced Database Systems Chapter 17 Database File Indexing Techniques, B- Trees, and B + -Trees.
CS4432: Database Systems II
Access Path Selection in a Relation Database Management System (summarized in section 2)
Chapter 61 Chapter 6 Index Structures for Files. Chapter 62 Indexes Indexes are additional auxiliary access structures with typically provide either faster.
Cloud Computing Lecture Column Store – alternative organization for big relational data.
Database Management 8. course. Query types Equality query – Each field has to be equal to a constant Range query – Not all the fields have to be equal.
CS 345: Topics in Data Warehousing Thursday, October 21, 2004.
CS 345: Topics in Data Warehousing Tuesday, October 19, 2004.
Database Management 9. course. Execution of queries.
1 CPS216: Advanced Database Systems Notes 04: Operators for Data Access Shivnath Babu.
Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004.
ITCS 6163 Lecture 5. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of.
Chapter 11 Indexing & Hashing. 2 n Sophisticated database access methods n Basic concerns: access/insertion/deletion time, space overhead n Indexing 
Bitmap Indices for Data Warehouse Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY.
Query Processing. Steps in Query Processing Validate and translate the query –Good syntax. –All referenced relations exist. –Translate the SQL to relational.
Physical Database Design I, Ch. Eick 1 Physical Database Design I About 25% of Chapter 20 Simple queries:= no joins, no complex aggregate functions Focus.
C-Store: How Different are Column-Stores and Row-Stores? Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY May. 8, 2009.
Indexing Structures for Files
Using Bitmap Index to Speed up Analyses of High-Energy Physics Data John Wu, Arie Shoshani, Alex Sim, Junmin Gu, Art Poskanzer Lawrence Berkeley National.
Prof. Bayer, DWH, Ch.5, SS Chapter 5. Indexing for DWH D1Facts D2.
BITMAPS & Starjoins. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of.
Variant Indexes. Specialized Indexes? Data warehouses are large databases with data integrated from many independent sources. Queries are often complex.
Chapter 8 Physical Database Design. Outline Overview of Physical Database Design Inputs of Physical Database Design File Structures Query Optimization.
File Organizations and Indexing
March, 2002 Efficient Bitmap Indexing Techniques for Very Large Datasets Kesheng John Wu Ekow Otoo Arie Shoshani.
Indexing OLAP Data Sunita Sarawagi Monowar Hossain York University.
Spring 2004 ECE569 Lecture 05.1 ECE 569 Database System Engineering Spring 2004 Yanyong Zhang
Indexing and B+-Trees By Kenneth Cheung CS 157B TR 07:30-08:45 Professor Lee.
Improved Query Performance With Variant Indexes Patrick O’Neil, Dallan Quass Presented by Bo Han.
1 Chapter 12: Indexing and Hashing Indexing Indexing Basic Concepts Basic Concepts Ordered Indices Ordered Indices B+-Tree Index Files B+-Tree Index Files.
Lecture 15: Query Optimization. Very Big Picture Usually, there are many possible query execution plans. The optimizer is trying to chose a good one.
5 Trends in the Data Warehousing Space Source: TDWI Report – Next Generation DW.
CPS216: Data-intensive Computing Systems
Indexing Structures for Files and Physical Database Design
CS 540 Database Management Systems
ITIS 5160 Indexing.
COMP 430 Intro. to Database Systems
Chapter 15 QUERY EXECUTION.
File organization and Indexing
Query Processing B.Ramamurthy Chapter 12 11/27/2018 B.Ramamurthy.
Realtime Analytics OLAP & OLTP in the mix
Dual Bitmap Index: Space-Time Efficient Bitmap
Database Management Systems (CS 564)
BITMAP INDEXES E0 261 Jayant Haritsa Computer Science and Automation
Overview of Query Evaluation
MIS 451 Building Business Intelligence Systems
Presentation transcript:

ITIS 5160 Indexing

Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of indexed keys. Dynamic, stable and exhibit good performance under updates. (But OLAP is not about updates….) Bitmaps: Space efficient Difficult to update (but we don’t care in DW). Can effectively prune searches before looking at data.

Bitmaps R = (…., A,….., M)  R (A) B 8 B 7 B 6 B 5 B 4 B 3 B 2 B 1 B

Query optimization Consider a high-selectivity-factor query with predicates on two attributes. Query optimizer: builds plans (P1) Full relation scan (filter as you go). (P2) Index scan on the predicate with lower selectivity factor, followed by temporary relation scan, to filter out non- qualifying tuples, using the other predicate. (Works well if data is clustered on the first index key). (P3) Index scan for each predicate (separately), followed by merge of RID.

Query optimization (continued) (P2) Blocks of data Pred. 2 answer t1 tn Index Pred1 (P3) t1 tn Index Pred2 Tuple list1 Tuple list2 Merged list

Query optimization (continued) When using bitmap indexes (P3) can be an easy winner! CPU operations in bitmaps (AND, OR, XOR, etc.) are more efficient than regular RID merges: just apply the binary operations to the bitmaps (In B-trees, you would have to scan the two lists and select tuples in both -- merge operation--) Of course, you can build B-trees on the compound key, but we would need one for every compound predicate (exponential number of trees…).

Bitmaps and predicates A = a1 AND B = b2 Bitmap for a1Bitmap for b2 AND = Bitmap for a1 and b2

Tradeoffs Dimension cardinality small dense bitmaps Dimension cardinality large sparse bitmaps Compression (decompression)

Star-Joins Select F.S, D1.A1, D2.A2, …. Dn.An from F,D1,D2,Dn where F.A1 = D1.A1 F.A2 = D2.A2 … F.An = Dn.An and D1.B1 = ‘c1’ D2.B2 = ‘p2’ …. Likely strategy: For each Di find suitable values of Ai such that Di.Bi = ‘xi’ (unless you have a bitmap index for Bi). Use bitmap index on Ai’ values to form a bitmap for related rows of F (OR-ing the bitmaps). At this stage, you have n such bitmaps, the result can be found AND-ing them.

Bitmaps R = (…., A,….., M) value-list index  R (A) B 8 B 7 B 6 B 5 B 4 B 3 B 2 B 1 B

Example sequence value-list index (equality)  R (A) B 2 2 B 1 2 B 0 2 B 2 1 B 1 1 B (1x3+0)

Encoding scheme Equality encoding: all bits to 0 except the one that corresponds to the value Range Encoding: the vi rightmost bits to 0, the remaining to 1

Range encoding single component, base-9  R (A) B 8 B 7 B 6 B 5 B 4 B 3 B 2 B 1 B

RangeEval Evaluates each range predicate by computing two bitmaps: BEQ bitmap and either BGT or BLT RangeEval-Opt uses only <= A < v is the same as A <= v-1 A > v is the same as Not( A <= v) A >= v is the same as Not (A <= v-1)

Example (revisited) sequence value-list index(Equality)  R (A) B 2 2 B 1 2 B 0 2 B 2 1 B 1 1 B (1x3+0)

Example sequence range-encoded index  R (A) B 1 2 B 0 2 B 1 1 B

RangeEval-OPT