Download presentation
Presentation is loading. Please wait.
Published byIrma Morgan Modified over 8 years ago
1
1 C-Store: A Column-oriented DBMS By New England Database Group
2
2 M.I.T Current DBMS Gold Standard Current DBMS Gold Standard Store fields in one record contiguously on disk Use B-tree indexing Use small (e.g. 4K) disk blocks Align fields on byte or word boundaries Conventional (row-oriented) query optimizer and executor (technology from 1979) Aries-style transactions Store fields in one record contiguously on disk Use B-tree indexing Use small (e.g. 4K) disk blocks Align fields on byte or word boundaries Conventional (row-oriented) query optimizer and executor (technology from 1979) Aries-style transactions
3
3 M.I.T Terminology -- “Row Store” Record 2 Record 4 Record 1 Record 3 E.g. DB2, Oracle, Sybase, SQLServer, …
4
4 M.I.T Row Stores are Write Optimized Row Stores are Write Optimized Can insert and delete a record in one physical write Good for on-line transaction processing (OLTP) But not for read mostly applications Data warehouses CRM Can insert and delete a record in one physical write Good for on-line transaction processing (OLTP) But not for read mostly applications Data warehouses CRM
5
5 M.I.T Elephants Have Extended Row Stores With Bitmap indices Better sequential read Integration of “datacube” products Materialized views With Bitmap indices Better sequential read Integration of “datacube” products Materialized views But there may be a better idea…….
6
6 M.I.T Column Stores
7
7 M.I.T At 100K Feet…. Ad-hoc queries read 2 columns out of 20 In a very large warehouse, Fact table is rarely clustered correctly Column store reads 10% of what a row store reads Ad-hoc queries read 2 columns out of 20 In a very large warehouse, Fact table is rarely clustered correctly Column store reads 10% of what a row store reads
8
8 M.I.T C-Store (Column Store) Project Brandeis/Brown/MIT/UMass-Boston project Usual suspects participating Enough coded to get performance numbers for some queries Complete status later Brandeis/Brown/MIT/UMass-Boston project Usual suspects participating Enough coded to get performance numbers for some queries Complete status later
9
9 M.I.T We Build on Previous Pioneering Work…. Sybase IQ (early ’90s) Monet (see CIDR ’05 for the most recent description) Sybase IQ (early ’90s) Monet (see CIDR ’05 for the most recent description)
10
10 M.I.T C-Store Technical Ideas Code the columns to save space No alignment Big disk blocks Only materialized views (perhaps many) Focus on Sorting not indexing Automatic physical DBMS design Code the columns to save space No alignment Big disk blocks Only materialized views (perhaps many) Focus on Sorting not indexing Automatic physical DBMS design
11
11 M.I.T C-store (Column Store) Technical Ideas Optimize for grid computing Innovative redundancy Xacts – but no need for Mohan Data ordered on anything, Not just time Column optimizer and executor Optimize for grid computing Innovative redundancy Xacts – but no need for Mohan Data ordered on anything, Not just time Column optimizer and executor
12
12 M.I.T How to Evaluate This Paper…. None of the ideas in isolation merit publication Judge the complete system by its (hopefully intelligent) choice of Small collection of inter-related powerful ideas That together put performance in a new sandbox None of the ideas in isolation merit publication Judge the complete system by its (hopefully intelligent) choice of Small collection of inter-related powerful ideas That together put performance in a new sandbox
13
13 M.I.T Code the Columns Work hard to shrink space Use extra space for multiple orders Fundamentally easier than in a row store E.g. RLE works well Work hard to shrink space Use extra space for multiple orders Fundamentally easier than in a row store E.g. RLE works well
14
14 M.I.T No Alignment Densepack columns E.g. a 5 bit field takes 5 bits Current CPU speed going up faster than disk bandwidth Faster to shift data in CPU than to waste disk bandwidth Densepack columns E.g. a 5 bit field takes 5 bits Current CPU speed going up faster than disk bandwidth Faster to shift data in CPU than to waste disk bandwidth
15
15 M.I.T Big Disk Blocks Tunable Big (minimum size is 64K) Tunable Big (minimum size is 64K)
16
16 M.I.T Only Materialized Views Projection (materialized view) is some number of columns from a fact table Plus columns in a dimension table – with a 1-n join between Fact and Dimension table Stored in order of a storage key(s) Several may be stored!!!!! With a permutation, if necessary, to map between them Projection (materialized view) is some number of columns from a fact table Plus columns in a dimension table – with a 1-n join between Fact and Dimension table Stored in order of a storage key(s) Several may be stored!!!!! With a permutation, if necessary, to map between them
17
17 M.I.T Only Materialized Views Table (as the user specified it and sees it) is not stored! No secondary indexes (they are a one column sorted MV plus a permutation, if you really want one) Table (as the user specified it and sees it) is not stored! No secondary indexes (they are a one column sorted MV plus a permutation, if you really want one)
18
18 M.I.T Example User view: EMP (name, age, salary, dept) Dept (dname, floor) Possible set of MVs: MV-1 (name, dept, floor) in floor order MV-2 (salary, age) in age order MV-3 (dname, salary, name) in salary order
19
19 M.I.T Different Indexing Few valuesMany values Sequential RLE encoded Conventional B-tree at the value level Delta encoded Conventional B-tree at the block level Non sequentialBitmap per value Conventional Gzip Conventional B-tree at the block level
20
20 M.I.T Automatic Physical DBMS Design Not enough 4-star wizards to go around Accept a “training set” of queries and a space budget Choose the MVs auto-magically Re-optimize periodically based on a log of the interactions Not enough 4-star wizards to go around Accept a “training set” of queries and a space budget Choose the MVs auto-magically Re-optimize periodically based on a log of the interactions
21
21 M.I.T Optimize for Grid Computing I.e. shared-nothing Dewitt (Gamma) was right Horizontal partitioning and intra-query parallelism as in Gamma I.e. shared-nothing Dewitt (Gamma) was right Horizontal partitioning and intra-query parallelism as in Gamma
22
22 M.I.T Innovative Redundancy Hardly any warehouse is recovered by a redo from the log Takes too long! Store enough MVs at enough places to ensure K-safety Rebuild dead objects from elsewhere in the network K-safety is a DBMS-design problem! Hardly any warehouse is recovered by a redo from the log Takes too long! Store enough MVs at enough places to ensure K-safety Rebuild dead objects from elsewhere in the network K-safety is a DBMS-design problem!
23
23 M.I.T XACTS – No Mohan Undo from a log (that does not need to be persistent) Redo by rebuild from elsewhere in the network Undo from a log (that does not need to be persistent) Redo by rebuild from elsewhere in the network
24
24 M.I.T XACTS – No Mohan Snapshot isolation (run queries as of a tunable time in the recent past) To solve read-write conflicts Distributed Xacts Without a prepare message (no 2 phase commit) Snapshot isolation (run queries as of a tunable time in the recent past) To solve read-write conflicts Distributed Xacts Without a prepare message (no 2 phase commit)
25
25 M.I.T Storage (sort) Key(s) is not Necessarily Time That would be too limiting So how to do fast updates to densepack column storage that is not in entry sequence? That would be too limiting So how to do fast updates to densepack column storage that is not in entry sequence?
26
26 M.I.T Solution – a Hybrid Store Read-optimized Column store Write-optimized Column store Tuple mover (Much like Monet) (What we have been talking about so far) (Batch rebuilder)
27
27 M.I.T Column Executor Column operations – not row operations Columns remain coded – if possible Late materialization of columns Column operations – not row operations Columns remain coded – if possible Late materialization of columns
28
28 M.I.T Column Optimizer Chooses MVs on which to run the query Most important task Build in snowflake schemas Which are simple to optimize without exhaustive search Looking at extensions Chooses MVs on which to run the query Most important task Build in snowflake schemas Which are simple to optimize without exhaustive search Looking at extensions
29
29 M.I.T Current Performance 100X popular row store in 40% of the space 10X popular column store in 70% of the space 7X popular row store in 1/6 th of the space Code available with BSD license 100X popular row store in 40% of the space 10X popular column store in 70% of the space 7X popular row store in 1/6 th of the space Code available with BSD license
30
30 M.I.T Structure Going Forward Vertica Very well financed start-up to commercialize C-store Doing the heavy lifting University Research Funded by Vertica Vertica Very well financed start-up to commercialize C-store Doing the heavy lifting University Research Funded by Vertica
31
31 M.I.T Vertica Complete alpha system in December ‘05 Everything, including DBMS designer With current performance! Looking for early customers to work with (see me if you are interested) Complete alpha system in December ‘05 Everything, including DBMS designer With current performance! Looking for early customers to work with (see me if you are interested)
32
32 M.I.T University Research Extension of algorithms to non-snowflake schemas Study of L2 cache performance Study of coding strategies Study of executor options Study of recovery tactics Non-cursor interface Study of optimizer primitives Extension of algorithms to non-snowflake schemas Study of L2 cache performance Study of coding strategies Study of executor options Study of recovery tactics Non-cursor interface Study of optimizer primitives
Similar presentations
© 2024 SlidePlayer.com Inc.
All rights reserved.