Bitmap Index Design and Evaluation. Ariel Noy, Data representation and retrieval seminar. By: Chee-Yong Chan, Yannis E. Ioannidis
Introduction. Query performance issues. On-Line Transaction Processing (OLTP): read-write databases. Decision Support Systems (DSS): read-mostly environments, with high selectivity factor.
Bitmap in Simple Form. Every value has its own column, i.e. its own bitmap (Value-List Index).
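A minimal sketch of the simple form described above, assuming each bitmap is held in a Python integer whose bit r corresponds to row r. The function names and the sample column are made up for illustration.

```python
from collections import defaultdict

def build_value_list_index(column):
    """Return {value: bitmap}; bit r of a bitmap is 1 iff row r holds that value."""
    index = defaultdict(int)
    for row, value in enumerate(column):
        index[value] |= 1 << row
    return dict(index)

def rows_of(bitmap):
    """List the row numbers whose bit is set."""
    return [r for r in range(bitmap.bit_length()) if (bitmap >> r) & 1]

column = [3, 1, 3, 0, 2, 1]          # made-up attribute values, one per row
index = build_value_list_index(column)

eq_3 = index[3]                      # A = 3
in_1_or_2 = index[1] | index[2]      # A = 1 OR A = 2, a single bitwise OR

print(rows_of(eq_3))                 # [0, 2]
print(rows_of(in_1_or_2))            # [1, 4, 5]
```

One bitmap per distinct value is what makes equality lookups a single bitmap read and lets combined conditions use plain bitwise operations.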
Advantages. Compact size. Efficient hardware support for bitmap operations (AND, OR, XOR, NOT). Fast search. Multiple different bitmap indexes for different kinds of queries.
Selection queries. Queries of the form “A op v”, where A refers to the indexed attribute, v is a value, and op is a comparison operator. Range predicates (<, <=, >, >=). Equality predicates (=, !=).
Space-time tradeoff of bitmap indexes for selection queries. Space-optimal bitmap index. Time-optimal bitmap index under a given space constraint. Bitmap index with optimal space-time tradeoff. Time-optimal bitmap index.
Attribute Value Decomposition.
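As an illustration of attribute value decomposition, the sketch below splits a value into digits under a mixed-radix base <b_n, ..., b_1>, so that each digit can be indexed by its own component. The helper names `decompose` and `compose` are hypothetical, and the bases are listed most-significant first.

```python
def decompose(value, bases):
    """Split value into digits <v_n, ..., v_1> for bases <b_n, ..., b_1>."""
    capacity = 1
    for b in bases:
        capacity *= b
    if not 0 <= value < capacity:
        raise ValueError("value does not fit in the given bases")
    digits = []
    for b in reversed(bases):        # peel off the least-significant digit first
        value, digit = divmod(value, b)
        digits.append(digit)
    return digits[::-1]              # most-significant digit first

def compose(digits, bases):
    """Inverse of decompose: rebuild the value from its digits."""
    value = 0
    for d, b in zip(digits, bases):
        value = value * b + d
    return value

print(decompose(864, [10, 10, 10]))  # [8, 6, 4]
print(decompose(5, [3, 2]))          # [2, 1]
```

With a single component of base C the decomposition is the value itself, which corresponds to the simple one-bitmap-per-value index.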
Bitmap Encoding Scheme. Equality encoding: b_i bits, one for each possible value; all bits are 0 except the bit for v_i, which is 1. Range encoding: the v_i rightmost bits are 0, the rest are 1.
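The two encodings can be sketched for a single digit as follows, assuming bit 0 is the rightmost bit and that a digit of base b is encoded in b bits. The function names are illustrative only.

```python
def equality_bits(digit, base):
    """base bits, all 0 except bit `digit` (bit 0 is the rightmost)."""
    return 1 << digit

def range_bits(digit, base):
    """base bits: the `digit` rightmost bits are 0, the rest are 1."""
    return ((1 << base) - 1) & ~((1 << digit) - 1)

def show(bits, base):
    """Render the bits as a fixed-width binary string."""
    return format(bits, "0{}b".format(base))

print(show(equality_bits(3, 5), 5))  # 01000
print(show(range_bits(3, 5), 5))     # 11000
```

Under range encoding, bit j is 1 exactly when the digit is <= j, so the leftmost bit is 1 for every digit and carries no information.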
Evaluation Algorithm for Range-Encoded Bitmap Indexes.
RangeEval (O’Neil and Quass)
RangeEval-Opt:
– 50% fewer bitmap operations
– fewer bitmap scans for range predicate evaluation
– calculates only the requested bitmap
– avoids the intermediate equality predicate evaluation by evaluating each range query in terms of <= only, based on:
   A < v == A <= v-1
   A > v == !(A <= v)
   A >= v == !(A <= v-1)
– works with only one bitmap B, vs. working with at least two [Beq and (Blt or Bge)]
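The reduction of every range predicate to a single <= test can be sketched as a small rewrite step. This assumes integer attribute values; the function name and the (negate, bound) return convention are made up for illustration.

```python
def rewrite_to_le(op, v):
    """Rewrite `A op v` as (negate, w), meaning: A op v == [NOT] (A <= w)."""
    if op == "<=":
        return (False, v)
    if op == "<":
        return (False, v - 1)        # A < v   ==  A <= v-1
    if op == ">":
        return (True, v)             # A > v   ==  NOT (A <= v)
    if op == ">=":
        return (True, v - 1)         # A >= v  ==  NOT (A <= v-1)
    raise ValueError("not a range operator: " + op)

def holds(a, op, v):
    """Check `a op v` using only a <= comparison and an optional negation."""
    negate, w = rewrite_to_le(op, v)
    return (a <= w) != negate
```

After this step an evaluator only needs one routine, for A <= w, plus at most one final NOT on the resulting bitmap.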
Example: A<=864 using a 3-component base-10 index. RangeEval-Opt: 4 operations, 5 scans. RangeEval: 10 operations, 6 scans.
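The operation and scan counts for the <=-only approach can be reproduced with a small sketch. It is written in the spirit of RangeEval-Opt and is not the authors' pseudocode: bitmaps are Python integers, `index[i][j]` is assumed to mark rows whose i-th digit is <= j, and the all-ones bitmap of each component is assumed not to be stored. All names and the random test column are illustrative.

```python
import random

def digits_of(value, bases):
    """Digits <v_n, ..., v_1> of value for bases <b_n, ..., b_1>."""
    out = []
    for b in reversed(bases):
        value, d = divmod(value, b)
        out.append(d)
    return out[::-1]

def build_range_index(column, bases):
    """index[i][j]: bit r is 1 iff digit i of row r is <= j, for j < b_i - 1."""
    index = [[0] * (b - 1) for b in bases]
    for row, value in enumerate(column):
        for i, (d, b) in enumerate(zip(digits_of(value, bases), bases)):
            for j in range(d, b - 1):
                index[i][j] |= 1 << row
    return index

def eval_le(index, bases, v, n_rows):
    """Evaluate A <= v; returns (bitmap, operations, scans)."""
    vd = digits_of(v, bases)
    ops = scans = 0
    last = len(bases) - 1            # least-significant component
    if vd[last] == bases[last] - 1:
        result = (1 << n_rows) - 1   # every digit qualifies, nothing to read
    else:
        result = index[last][vd[last]]
        scans += 1
    for i in range(last - 1, -1, -1):
        if vd[i] != bases[i] - 1:    # (digit <= v_i) AND lower-order result
            result &= index[i][vd[i]]
            ops += 1
            scans += 1
        if vd[i] != 0:               # OR (digit <= v_i - 1)
            result |= index[i][vd[i] - 1]
            ops += 1
            scans += 1
    return result, ops, scans

random.seed(0)
bases = [10, 10, 10]
column = [random.randrange(1000) for _ in range(200)]
index = build_range_index(column, bases)
bitmap, ops, scans = eval_le(index, bases, 864, len(column))
print(ops, scans)                    # 4 5
```

For 864 the sketch reads one bitmap for the lowest digit and two for each of the other digits, which gives the 4 operations and 5 scans quoted in the example.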
Cost Model for Space-Time Tradeoff Analysis. Space(I): the space metric is in terms of the number of bitmaps stored. Time(I): the time metric is in terms of the expected number of bitmap scans for a selection query evaluation.
Comparison of Bitmap Encoding Schemes. Equality encoded: S(I) ~ C, T(I) ~ n*b/2. Range encoded: S(I) ~ C-n, T(I) ~ 2n.
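A rough way to compare the space side of the two encodings is to count stored bitmaps per component. The sketch assumes an equality-encoded component of base b stores b bitmaps and a range-encoded one stores b - 1 (its all-ones bitmap is dropped). These counting rules and the function names are assumptions for illustration, not figures taken from the slide.

```python
def space_equality(bases):
    """Stored bitmaps, assuming one bitmap per digit value in every component."""
    return sum(bases)

def space_range(bases):
    """Stored bitmaps, assuming each component drops its all-ones bitmap."""
    return sum(b - 1 for b in bases)

# Cardinality C = 1000, as one component and as three base-10 components.
print(space_equality([1000]), space_range([1000]))              # 1000 999
print(space_equality([10, 10, 10]), space_range([10, 10, 10]))  # 30 27
```

For a single component (n = 1, b = C) this gives C and C - 1, and in general the range-encoded count is lower by exactly n.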
Space Optimal:
– number of bitmaps in an n-component space-optimal index = n(b-2) b~
– space efficiency is a non-decreasing function of the number of components.
– the ultimate optimum is at n=log(C)
Time Optimal:
– the base of the time-optimal n-component index is <2,2,…,2,⌈C/2^(n-1)⌉>
– time efficiency is a non-increasing function of the number of components.
– the ultimate optimum is at n=1
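The time-optimal base can be generated with a short sketch. It assumes n - 1 components of base 2 and a last component of ⌈C/2^(n-1)⌉, chosen so that the product of the bases still covers all C values. The function names are hypothetical.

```python
def time_optimal_base(C, n):
    """Base <2, ..., 2, ceil(C / 2**(n-1))> for cardinality C and n components."""
    big = -(-C // 2 ** (n - 1))      # integer ceiling of C / 2**(n-1)
    if big < 2:
        raise ValueError("too many components for this cardinality")
    return [2] * (n - 1) + [big]

def capacity(bases):
    """Number of distinct values the bases can represent."""
    total = 1
    for b in bases:
        total *= b
    return total

print(time_optimal_base(1000, 1))    # [1000]
print(time_optimal_base(1000, 3))    # [2, 2, 250]
```

With n = 1 the base collapses to <C>, the single-component index named as the overall time optimum.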
Optimal Space-Time Tradeoff (knee). Based on experiments, guesswork and gut feeling. 2-component index. The base of the most time-efficient 2-component space-optimal index is given by:
Time Optimal Bitmap Index Under Space Constraint
Bitmap Index Storage Schemes. Bitmap-Level Storage (BS): each bitmap has its own file. Component-Level Storage (CS): each index component has its own file. Index-Level Storage (IS): all bitmaps together in one file.
Compression of each file. CS has the best Space(I) tradeoff after compression. BS has the best Time(I) tradeoff after compression.