C-Store: Self-Organizing Tuple Reconstruction Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY Apr. 17, 2009
Review of Tuple Reconstruction: Stitch together the separate column values of the same logical tuple by joining on tuple IDs/positions. Two strategies: early materialization and late materialization.
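As a rough illustration of the two strategies, the sketch below runs the same range predicate both ways over a made-up relation R(a, b). The column contents and function names are assumptions for illustration only; a real column-store works on compressed blocks and position lists rather than Python lists.

```python
# Hypothetical relation R(a, b): two columns stored separately, in the same tuple order.
col_a = [5, 12, 7, 3]
col_b = ["w", "x", "y", "z"]

def early_materialization(lo, hi):
    """Stitch full (a, b) tuples first, then apply the predicate on a."""
    tuples = list(zip(col_a, col_b))
    return [t for t in tuples if lo <= t[0] <= hi]

def late_materialization(lo, hi):
    """Filter column a alone, keep positions, and fetch b only at those positions."""
    positions = [pos for pos, a in enumerate(col_a) if lo <= a <= hi]
    return [(col_a[pos], col_b[pos]) for pos in positions]

print(early_materialization(5, 10))
print(late_materialization(5, 10))
```

Both functions return the same tuples; the difference is how much of column b has to be touched before the predicate has been applied.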
Motivation: Tuple reconstruction is easy if columns are sorted in the same order. However, this prerequisite cannot always be preserved. During query processing, many operators (joins, group by, order by, etc.) are not tuple-order-preserving.
The Ultimate Access Pattern: For each relation R, keep one copy of R for each attribute in R, with each copy pre-sorted on the corresponding attribute. Any tuple reconstruction initiated by a restriction on an attribute R.a can then be done using the copy that is sorted on R.a. Limitations: space constraints and the idle time needed for the pre-sorting.
The Proposed Solution: Partial Sideways Cracking. Uses auxiliary self-organizing data structures to materialize mappings between pairs of attributes used together for tuple reconstruction. Background: MonetDB; Selection-based Cracking.
MonetDB: http://monetdb.cwi.nl/ Every relational table is represented as a collection of Binary Association Tables (BATs). Each BAT is a set of two columns. For a relation R of k attributes, there exist k BATs. Each BAT stores (key, attr) pairs. In each BAT, keys are system-generated tuple IDs. For a base BAT, the key column is typically virtual, like the STORAGE KEY in the Read Store of C-Store.
MonetDB's Basic Operators (1): select(A, v1, v2). Searches all (key, attr) pairs in base column A for attribute values between v1 and v2. Output: a list of keys/positions. In the output, the tuple order is usually preserved.
MonetDB's Basic Operators (2): join(j1, j2). Performs a join between attr1 of j1 and attr2 of j2. Output: a list of (key1, key2) pairs. In the output, the tuple order is mainly preserved for outer join.
Outer Join: An outer join does not require each record in the two joined tables to have a matching record. The joined table retains each record, even if no matching record exists. Variants: left outer join, right outer join, full outer join.
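To make the difference between a plain join and an outer join concrete, here is a minimal sketch over lists of (key, attr) pairs. The function names, the use of a hash index, and the use of None for a missing partner are assumptions of this sketch and are not MonetDB's actual implementation.

```python
def build_index(j2):
    """Map each attr value of j2 to the list of keys that carry it."""
    index = {}
    for key2, attr2 in j2:
        index.setdefault(attr2, []).append(key2)
    return index

def join(j1, j2):
    """Inner join on attr: returns (key1, key2) pairs, in the order of j1."""
    index = build_index(j2)
    return [(key1, key2) for key1, attr1 in j1 for key2 in index.get(attr1, [])]

def left_outer_join(j1, j2):
    """Like join, but every record of j1 is retained; None marks a missing partner."""
    index = build_index(j2)
    out = []
    for key1, attr1 in j1:
        matches = index.get(attr1, [])
        if matches:
            out.extend((key1, key2) for key2 in matches)
        else:
            out.append((key1, None))
    return out

j1 = [(0, "x"), (1, "y"), (2, "z")]
j2 = [(0, "z"), (1, "x"), (2, "x")]
print(join(j1, j2))
print(left_outer_join(j1, j2))
```

Note that the key1 values come out in the order of j1, while the key2 values do not follow the order of j2, which is why a join is generally not order-preserving for both inputs.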
MonetDB's Basic Operators (3): reconstruct(A, r). Output: all (key, attr) pairs of base column A at the positions specified by r.
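The select and reconstruct operators can be sketched together as follows. This is only an illustration: base columns are modeled as Python lists whose index plays the role of the virtual key, the range is assumed to be inclusive on both ends, and the relation contents are invented.

```python
def select(A, v1, v2):
    """Return the keys (positions) of base column A whose value lies in [v1, v2]."""
    return [key for key, value in enumerate(A) if v1 <= value <= v2]

def reconstruct(A, r):
    """Return the (key, attr) pairs of base column A at the positions listed in r."""
    return [(key, A[key]) for key in r]

# Hypothetical relation R(a, b), one list per attribute.
R_a = [13, 4, 9, 2, 12]
R_b = ["p", "q", "r", "s", "t"]

keys = select(R_a, 4, 12)           # positions that qualify on a
b_values = reconstruct(R_b, keys)   # fetch b for exactly those positions
print(keys, b_values)
```

Because select scans the base column in order, the returned keys are ascending, so the subsequent reconstruct reads column b sequentially.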
Selection-Based Cracking: Cracker column. The first time an attribute A is required by a query, a copy of column A is created, called the cracker column C_A of A. Each selection operator on A triggers a range-based physical reorganization of C_A. Each cracker column has a cracker index (an AVL-tree) to maintain partitioning information. Future queries benefit from the physically clustered data and do not need to access the whole column.
AVL-Tree An AVL tree is a self-balancing binary search tree. In an AVL tree, the heights of the two child subtrees of any node differ by at most one.
Order for Tuple Reconstruction: The order in which tuples are inserted is used for tuple reconstruction. Physical reorganization happens only on cracker columns.
The crackers.select Operator: crackers.select(A, v1, v2). First, it creates C_A if it does not exist. It searches the index of C_A for the area where v1 and v2 fall. If the bounds do not exist, i.e., no query used them in the past, then C_A is physically reorganized to cluster all qualifying tuples into a contiguous area. Output: a list of keys/positions.
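A minimal sketch of a cracker column and its select operator is given below. Several things here are assumptions rather than the MonetDB implementation: the class and method names are invented, a pair of sorted Python lists searched with bisect stands in for the AVL-tree cracker index, the range is taken as half-open (v1 <= value < v2), and each piece is re-partitioned by building new lists instead of swapping in place.

```python
from bisect import bisect_left

class CrackerColumn:
    """Copy of a base column that is reorganized as a side effect of selections."""

    def __init__(self, base):
        # (value, key) pairs; key is the position in the untouched base column.
        self.data = [(value, key) for key, value in enumerate(base)]
        # Cracker index: for each i, data[:starts[i]] holds values < bounds[i]
        # and data[starts[i]:] holds values >= bounds[i].
        self.bounds = []
        self.starts = []

    def crack(self, v):
        """Make v a piece boundary; return the position where values >= v begin."""
        i = bisect_left(self.bounds, v)
        if i < len(self.bounds) and self.bounds[i] == v:
            return self.starts[i]             # bound already known: no reorganization
        lo = self.starts[i - 1] if i > 0 else 0
        hi = self.starts[i] if i < len(self.starts) else len(self.data)
        piece = self.data[lo:hi]              # only the piece containing v is touched
        smaller = [pair for pair in piece if pair[0] < v]
        larger = [pair for pair in piece if pair[0] >= v]
        self.data[lo:hi] = smaller + larger
        self.bounds.insert(i, v)
        self.starts.insert(i, lo + len(smaller))
        return lo + len(smaller)

    def select(self, v1, v2):
        """Return the keys of all tuples with v1 <= value < v2."""
        start = self.crack(v1)
        end = self.crack(v2)
        return [key for _, key in self.data[start:end]]

column = CrackerColumn([13, 16, 4, 9, 2, 12, 7, 1, 19, 3])
print(sorted(column.select(5, 13)))   # first query cracks the whole column
print(sorted(column.select(1, 4)))    # second query only touches the piece below 5
```

The second query reorganizes only the piece of values below 5, which is the sense in which later queries benefit from earlier ones.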
Cracker Map: A cracker map M_AB is defined as a two-column table over two attributes A and B of a relation R. Values of A are stored in the left column, called head. Values of B are stored in the right column, called tail. Values of A and B in the same position of M_AB belong to the same tuple.
Maps Are Created on Demand Only: When a query q needs access to attribute B based on a restriction on attribute A and M_AB does not exist, then q creates M_AB by performing a scan over base columns A and B. For each cracker map M_AB, there is a cracker index (an AVL-tree) that maintains information about how A values are distributed over M_AB.
Queries Trigger Cracking: Query style: access B based on A. Each such query triggers cracking (physical reorganization) of M_AB based on the restriction applied to A. Cracking: all tuples with values of A that satisfy the restriction end up in a contiguous area in M_AB. This is realized by splitting a piece of M_AB into two or three new pieces.
The sideways.select(A, v1, v2, B) Operator: Returns tuples of attribute B of relation R based on a predicate on attribute A of R as follows: (1) If there is no cracker map M_AB, then create one. (2) Search the index of M_AB to find the contiguous area w of the pieces related to the restriction σ on A. (3) If σ does not match existing piece boundaries, physically reorganize w to move false hits out of the contiguous area of qualifying tuples. (4) Update the cracker index of M_AB accordingly. (5) Return a non-materialized view of the tail of w.
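The five steps above can be sketched as follows. The names maps, crack, and sideways_select are hypothetical, a dictionary stands in for the system catalog, two sorted lists stand in for the AVL-tree index, and the range is assumed to be half-open (v1 <= A < v2). Unlike step (5), this sketch returns a copied list rather than a non-materialized view.

```python
from bisect import bisect_left

maps = {}  # (head attribute, tail attribute) -> cracker map

def crack(m, v):
    """Make v a piece boundary of map m; return where head values >= v begin."""
    i = bisect_left(m["bounds"], v)
    if i < len(m["bounds"]) and m["bounds"][i] == v:
        return m["starts"][i]                      # boundary exists: index lookup only
    lo = m["starts"][i - 1] if i > 0 else 0
    hi = m["starts"][i] if i < len(m["starts"]) else len(m["pairs"])
    piece = m["pairs"][lo:hi]
    below = [p for p in piece if p[0] < v]
    m["pairs"][lo:hi] = below + [p for p in piece if p[0] >= v]   # step (3)
    m["bounds"].insert(i, v)                                      # step (4)
    m["starts"].insert(i, lo + len(below))
    return lo + len(below)

def sideways_select(R, A, v1, v2, B):
    """Return the B values of all tuples in R with v1 <= A < v2."""
    if (A, B) not in maps:                         # step (1): create M_AB on demand
        maps[(A, B)] = {"pairs": list(zip(R[A], R[B])), "bounds": [], "starts": []}
    m = maps[(A, B)]
    start = crack(m, v1)                           # steps (2) to (4) for each bound
    end = crack(m, v2)
    return [tail for _, tail in m["pairs"][start:end]]   # step (5), copied here

# Hypothetical relation with two attributes.
R = {"A": [13, 16, 4, 9, 2, 12, 7, 1, 19, 3], "B": list("abcdefghij")}
print(sorted(sideways_select(R, "A", 5, 13, "B")))
```

Head and tail values move together during cracking, so the tail slice can be returned without looking anything up in the base column B.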
Multi-Projection Queries: A single-selection query q that projects n attributes requires n maps, one for each attribute to be projected. Example: Select B, C From R Where A < 4; for this query, we need two maps, M_AB and M_AC. All maps that have been created using A as head are collected in the map set S_A.
Adaptive Alignment: The problem: naïve use of the sideways.select operator may lead to non-aligned cracker maps. The solution: extend the sideways.select operator with an alignment step that keeps the maps aligned. The basic idea is to apply all physical reorganizations, due to selections on an attribute A, in the same order to all maps in the map set S_A.
Cracker Tape: For each map set S_A, introduce a cracker tape T_A. T_A logs (in order of occurrence) all selections on attribute A that trigger cracking of any map in S_A. Each map M_Ax is equipped with a cursor pointing to the entry in T_A that represents the last crack applied to M_Ax. Given a tape T_A, a map M_Ax is aligned (synchronized) by successively forwarding its cursor towards the end of T_A and incrementally cracking M_Ax according to all selections it passes on the way. All maps whose cursors point to the same position in T_A are physically aligned.
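The tape mechanism can be sketched as below. The classes CrackerMap and MapSet and their methods are hypothetical names. Two simplifications are assumed: every selection is logged on the tape (replaying a selection whose bounds already exist is a no-op, so this does not change the result), and a newly created map starts with its cursor at the beginning of the tape and replays all of it.

```python
from bisect import bisect_left

class CrackerMap:
    def __init__(self, head, tail):
        self.pairs = list(zip(head, tail))
        self.bounds, self.starts = [], []
        self.cursor = 0                      # number of tape entries already applied

    def crack(self, v):
        """Make v a piece boundary; return the position where head values >= v begin."""
        i = bisect_left(self.bounds, v)
        if i < len(self.bounds) and self.bounds[i] == v:
            return self.starts[i]
        lo = self.starts[i - 1] if i > 0 else 0
        hi = self.starts[i] if i < len(self.starts) else len(self.pairs)
        piece = self.pairs[lo:hi]
        below = [p for p in piece if p[0] < v]
        self.pairs[lo:hi] = below + [p for p in piece if p[0] >= v]
        self.bounds.insert(i, v)
        self.starts.insert(i, lo + len(below))
        return lo + len(below)

class MapSet:
    """All maps with head attribute A (the set S_A) plus the cracker tape T_A."""

    def __init__(self, relation, head_attr):
        self.relation, self.head_attr = relation, head_attr
        self.tape = []                       # logged selections (v1, v2), in order
        self.maps = {}                       # tail attribute -> CrackerMap

    def align(self, tail_attr):
        """Forward the map's cursor to the end of the tape, cracking on the way."""
        m = self.maps[tail_attr]
        while m.cursor < len(self.tape):
            v1, v2 = self.tape[m.cursor]
            m.crack(v1)
            m.crack(v2)
            m.cursor += 1
        return m

    def select(self, v1, v2, tail_attrs):
        """Return one tuple of tail values per row with v1 <= head < v2."""
        self.tape.append((v1, v2))
        columns = []
        for attr in tail_attrs:
            if attr not in self.maps:
                head = self.relation[self.head_attr]
                self.maps[attr] = CrackerMap(head, self.relation[attr])
            m = self.align(attr)
            start, end = m.crack(v1), m.crack(v2)   # bounds exist now: lookups only
            columns.append([tail for _, tail in m.pairs[start:end]])
        return list(zip(*columns))           # positional stitching works when aligned

A = [13, 16, 4, 9, 2, 12, 7, 1, 19, 3]
R = {"A": A, "B": [a * 10 for a in A], "C": [a * 100 for a in A]}
s = MapSet(R, "A")
s.select(5, 13, ["B"])                       # cracks only M_AB
s.select(2, 8, ["C"])                        # creates M_AC, which first replays (5, 13)
print(sorted(s.select(3, 10, ["B", "C"])))   # M_AB catches up before the result is stitched
```

Because both maps start from the same tuple order and receive the same cracks in the same order, their head columns end up identical, so B and C values at the same position belong to the same tuple.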
Map Set Choice: Self-organizing Histograms. Following the “cracking philosophy”: in an unpredictable environment with no idle system time, always perform the minimum investment. Accordingly, for a query q, the map set S_A is chosen such that the restriction on A is the most selective in q, yielding a minimal bit vector. The most selective restriction can be found using the cracker indices.
Complex Queries: No relational operator other than tuple reconstruction depends on tuple insertion order (joins, aggregations, groupings, etc.). Potentially, many operators can exploit the clustering information in the maps; for example, a MAX operator needs to consider only the last piece of a map. Such directions are left for future work.
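As a small illustration of the MAX idea, the sketch below scans only the last non-empty piece of an already cracked map. The map contents, the list starts (the positions at which pieces begin, as recorded by a cracker index), and the function name max_head are all invented for this example; the slides leave such operators to future work.

```python
# Hypothetical state of a cracker map M_AB after a few queries:
# pieces are [0, 3), [3, 4), [4, 7), [7, 10), with head values increasing piece by piece.
pairs = [(2, "e"), (1, "h"), (3, "j"), (4, "c"), (9, "d"),
         (12, "f"), (7, "g"), (13, "a"), (16, "b"), (19, "i")]
starts = [3, 4, 7]   # piece boundaries taken from the cracker index

def max_head(pairs, starts):
    """Return MAX(A) by scanning only the last non-empty piece of the map."""
    edges = [0] + starts + [len(pairs)]
    # Walk the pieces from right to left; skip empty ones.
    for lo, hi in zip(reversed(edges[:-1]), reversed(edges[1:])):
        if lo < hi:
            return max(a for a, _ in pairs[lo:hi])
    raise ValueError("MAX of an empty map is undefined")

print(max_head(pairs, starts))
```

Here only three of the ten entries are inspected, since every value in an earlier piece is known to be smaller than every value in the last one.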
Experimental Analysis: Compares the implementations of selection cracking and sideways cracking on top of MonetDB against the latest non-cracking version of MonetDB and against MonetDB on presorted data. Results: Sideways cracking achieves performance similar to presorted data, but without the heavy initial cost and the restrictions on updates and workload prediction.
Partial Sideways Cracking: Takes storage restrictions into account. Partial maps: maps are only partially materialized, driven by the workload. A map consists of several chunks. Each chunk is a separate two-column table that contains a given value range of the head attribute of the map. Each chunk is cracked separately.
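A much simplified sketch of a partial map follows. It assumes integer head values and fixed-width chunk ranges, both of which are choices made for this illustration and are not taken from the slides. It also leaves out the per-chunk cracking and simply filters inside each chunk, so it only shows the workload-driven materialization.

```python
class PartialMap:
    """Map M_AB materialized chunk by chunk. Chunk i covers head values in
    [i * width, (i + 1) * width); fixed-width ranges are an assumption."""

    def __init__(self, head, tail, width):
        self.head, self.tail, self.width = head, tail, width
        self.chunks = {}   # chunk id -> list of (a, b) pairs, a separate two-column table

    def _chunk(self, cid):
        if cid not in self.chunks:
            # Materialize on first use by scanning the base columns.
            lo, hi = cid * self.width, (cid + 1) * self.width
            self.chunks[cid] = [(a, b) for a, b in zip(self.head, self.tail)
                                if lo <= a < hi]
        return self.chunks[cid]

    def select(self, v1, v2):
        """Return B values with v1 <= A < v2, touching only overlapping chunks."""
        if v1 >= v2:
            return []
        out = []
        for cid in range(v1 // self.width, (v2 - 1) // self.width + 1):
            out.extend(b for a, b in self._chunk(cid) if v1 <= a < v2)
        return out

pm = PartialMap([13, 16, 4, 9, 2, 12, 7, 1, 19, 3], list("abcdefghij"), width=10)
print(pm.select(11, 14), sorted(pm.chunks))   # only the chunk for [10, 20) exists so far
```

Chunks that no query has asked for are never built, which is how the storage footprint follows the workload.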
A Research Direction: Improving performance by compression. C-Store uses compression heavily. Can we integrate compression with cracking?
References: S. Idreos, M. L. Kersten, S. Manegold. Self-organizing Tuple Reconstruction in Column-stores. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Providence, RI, USA, June 2009 (accepted for publication). Daniel J. Abadi, Daniel S. Myers, David J. DeWitt, and Samuel R. Madden. Materialization Strategies in a Column-Oriented DBMS. In Proceedings of ICDE, Istanbul, Turkey, April 2007.