Presentation is loading. Please wait.

Presentation is loading. Please wait.

M.Kersten 2008 1 The MonetDB Architecture Martin Kersten CWI Amsterdam.

Similar presentations


Presentation on theme: "M.Kersten 2008 1 The MonetDB Architecture Martin Kersten CWI Amsterdam."— Presentation transcript:

1 M.Kersten 2008 1 The MonetDB Architecture Martin Kersten CWI Amsterdam

2 M.Kersten 2008 2 Execution Paradigm Database Structures Query optimizer Try to keep things simple DBMS Architecture

3 M.Kersten Sep 2008 MonetDB quickstep MonetDB kernel MAPI protocol JDBC C-mapi lib Perl End-user application ODBC PHP Python SQL XQuery RoR

4 M.Kersten Sep 2008 The MonetDB Software Stack XQuery MonetDB 4 MonetDB 5 MonetDB kernel SQL 03 Optimizers SQL/XML SOAP Open-GIS An advanced column-oriented DBMS X100 Compile?

5 M.Kersten 2008 5 SQL MonetDB Server MonetDB Kernel MAL The MonetDB Assembly Language glues the layers together. - binary relational model operators - imperative constructs - functional abstractions Original: P. A. Boncz, M. L. Kersten. MIL Primitives for Querying a Fragmented World. The VLDB Journal, 8(2):101-119, October 1999. MonetDB quickstep

6 SQL MonetDB Server Tactical Optimizers MonetDB Kernel MAL function user.s3_1():void; X1:bat[:oid,:lng] := sql.bind("sys","photoobjall","objid",0); X6:bat[:oid,:lng] := sql.bind("sys","photoobjall","objid",1); X9:bat[:oid,:lng] := sql.bind("sys","photoobjall","objid",2); X13:bat[:oid,:oid] := sql.bind_dbat("sys","photoobjall",1); X8 := algebra.kunion(X1,X6); X11 := algebra.kdifference(X8,X9); X12 := algebra.kunion(X11,X9); X14 := bat.reverse(X13); X15 := algebra.kdifference(X12,X14); X18 := algebra.markT(X15,0@0); X19 := bat.reverse(X18); X20 := aggr.count(X19); sql.exportValue(1,"sys.","count_","int",32,0,6,X20,""); end s3_1; select count(*) from photoobjall; 0 base table 1 insert 2 update delete

7 M.Kersten 2008 7 SQL MonetDB Server MAL optimizers MonetDB Kernel MAL Strategic optimizer: – Exploit the lanuage the language – Rely on heuristics Operational optimizer: – Exploit everything you know at runtime – Re-organize if necessary Tactical MAL optimizer: –Modular optimizer framework –Focused on coarse grain resource optimization MonetDB quickstep

8 SQL MonetDB Server Tactical Optimizers MonetDB Kernel MAL function user.s3_1():void; X1:bat[:oid,:lng] := sql.bind("sys","photoobjall","objid",0); X20 := aggr.count(X1); sql.exportValue(1,"sys.","count_","int",32,0,6,X20,""); end s3_1; select count(*) from photoobjall; Optimizer pipelines. sql> select optimizer; inline,remap,evaluate,costModel,coercions,emptySet,aliases,mergetable,deadcode,c onstants,commonTerms,joinPath,deadcod e,reduce,garbageCollector,dataflow,history,replication,multiplex

9 MonetDB quickstep SQL MonetDB Server Tactical Optimizers MonetDB Kernel MAL function user.s3_1():void; X1:bat[:oid,:lng] := sql.bind("sys","photoobjall","objid",0); X20 := aggr.count(X1); sql.exportValue(1,"sys.","count_","int",32,0,6,X20,""); end s3_1; select count(*) from photoobjall; Kernel execution paradigms Tuple-at-a-time pipelined Operator-at-a-time

10 M.Kersten 2008 10 MonetDB storage Database Structures N-ary stores PAX stores Column stores Try to keep things simple

11 M.Kersten 2008 11 John32Houston OK Early 80s: tuple storage structures for PCs were simple Mary31Houston OK Easy to access at the cost of wasted space Try to keep things simple

12 M.Kersten 2008 12 Slotted pages Logical pages equated physical pages 32John Houston 31 Mary Houston Try to keep things simple

13 M.Kersten 2008 13 Slotted pages Logical pages equated multiple physical pages 32John Houston 31 Mary Houston Try to keep things simple

14 M.Kersten 2008 14 Not all attributes are equally important Avoid things you don’t always need

15 M.Kersten 2008 15 A column orientation is as simple and acts like an array Attributes of a tuple are correlated by offset Avoid moving too much around

16 M.Kersten 2008 16 MonetDB Binary Association Tables Try to keep things simple

17 M.Kersten 2008 17 Physical data organization Binary Association Tables Bat Unit fixed size Dense sequence Memory mapped files Try to avoid doing things twice

18 M.Kersten 2008 18 Binary Association Tables accelerators Hash-based access Try to avoid doing things twice Column properties: key-ness non-null dense ordered

19 M.Kersten 2008 19 Binary Association Tables storage control A BAT can be used as an encoding table A VID datatype can be used to represent dense enumerations Type remappings are used to squeeze space 100 Try to avoid doing things twice

20 M.Kersten 2008 20 Column orientation benefits datawarehousing Brings a much tighter packaging and improves transport through the memory hierarchy Each column can be more easily optimized for storage using compression schemes Each column can be replicated for read-only access Mantra: Try to keep things simple

21 M.Kersten 2008 21 Execution Paradigm Database Structures Query optimizer DBMS Architecture Try to maximize performance

22 M.Kersten 2008 22 Execution Paradigm Volcano model Materialize All Model Vectorized model Try to maximize performance

23 M.Kersten 2008 23 Volcano Engines Query SELECT name, salary*.19 AS tax FROM employee WHERE age > 25

24 M.Kersten 2008 24 Operators Iterator interface -open() -next(): tuple -close() Volcano Engines

25 M.Kersten 2008 25 Primitives Provide computational functionality All arithmetic allowed in expressions, e.g. multiplication mult(int,int)  int Volcano Engines

26 M.Kersten 2008 26 The Volcano model is based on a simple pull- based iterator model for programming relational operators. The Volcano model minimizes the amount of intermediate store The Volcano model is CPU intensive and can be inefficient Try to maximize performance Volcano paradigm

27 M.Kersten 2008 27 MonetDB paradigm The MonetDB kernel is a programmable relational algebra machine Relational operators operate on ‘array’-like structures Try to use simple a software pattern

28 M.Kersten 2008 28 Operator implementation All algebraic operators materialize their result GOOD: small code footprints GOOD: potential for re-use BAD : extra storage for intermediates BAD: cpu cost for retaining it Local optimization decisions Sortedness, uniqueness, hash index Sampling to determine sizes Parallelism options Properties that affect the algorithms Try to use simple a software pattern

29 M.Kersten 2008 29 Operator implementation All algebraic operators materialize their result Local optimization decisions Heavy use of code expansion to reduce cost 55 selection routines 149 unary operations 335 join/group operations 134 multi-join operations 72 aggregate operations Try to use simple a software pattern

30 M.Kersten 2008 30 Execution Paradigm Database Structures Query optimizer DBMS Architecture Try to avoid the search space trap

31 Query optimization Alternative ways of evaluating a given query Equivalent expressions Different algorithms for each operation (Chapter 13) Cost difference between a good and a bad way of evaluating a query can be enormous Example: performing a r X s followed by a selection r.A = s.B is much slower than performing a join on the same condition Need to estimate the cost of operations Depends critically on statistical information about relations which the database must maintain Need to estimate statistics for intermediate results to compute cost of complex expressions

32 Introduction (Cont.) Relations generated by two equivalent expressions have the same set of attributes and contain the same set of tuples, although their attributes may be ordered differently.

33 Introduction (Cont.) Generation of query-evaluation plans for an expression involves several steps: 1.Generating logically equivalent expressions Use equivalence rules to transform an expression into an equivalent one. 2.Annotating resultant expressions to get alternative query plans 3.Choosing the cheapest plan based on estimated cost The overall process is called cost based optimization.

34 Equivalence Rules 1.Conjunctive selection operations can be deconstructed into a sequence of individual selections. 2.2.Selection operations are commutative. 3.Only the last in a sequence of projection operations is needed, the others can be omitted. 4.Selections can be combined with Cartesian products and theta joins. a.  (E 1 X E 2 ) = E 1  E 2 b. 1 (E 1 2 E 2 ) = E 1 1 2 E 2

35 Equivalence Rules (Cont.) 5.Theta-join operations (and natural joins) are commutative. E 1  E 2 = E 2  E 1 6.(a) Natural join operations are associative: (E 1 E 2 ) E 3 = E 1 (E 2 E 3 ) (b) Theta joins are associative in the following manner: (E 1 1 E 2 ) 2  3 E 3 = E 1 2 3 (E 2 2 E 3 ) where  2 involves attributes from only E 2 and E 3.

36 Pictorial Depiction of Equivalence Rules

37 Equivalence Rules (Cont.) 7.The selection operation distributes over the theta join operation under the following two conditions: (a) When all the attributes in  0 involve only the attributes of one of the expressions (E 1 ) being joined.  0 E 1  E 2 ) = ( 0 (E 1 ))  E 2 (b) When  1 involves only the attributes of E 1 and  2 involves only the attributes of E 2.  1   E 1  E 2 ) = ( 1 (E 1 ))  (  (E 2 ))

38 Equivalence Rules (Cont.) 8.The projections operation distributes over the theta join operation as follows: (a) if it involves only attributes from L 1  L 2 : (b) Consider a join E 1  E 2. Let L 1 and L 2 be sets of attributes from E 1 and E 2, respectively. Let L 3 be attributes of E 1 that are involved in join condition , but are not in L 1  L 2, and let L 4 be attributes of E 2 that are involved in join condition , but are not in L 1  L 2.

39 Equivalence Rules (Cont.) 9.The set operations union and intersection are commutative E 1  E 2 = E 2  E 1 E 1  E 2 = E 2  E 1 n(set difference is not commutative). 10.Set union and intersection are associative. (E 1  E 2 )  E 3 = E 1  (E 2  E 3 ) (E 1  E 2 )  E 3 = E 1  (E 2  E 3 ) 11.The selection operation distributes over ,  and –.   (E 1 – E 2 ) =   (E 1 ) –   (E 2 ) and similarly for  and  in place of – Also:   (E 1 – E 2 ) =   (E 1 ) – E 2 and similarly for  in place of –, but not for  12.The projection operation distributes over union  L (E 1  E 2 ) = ( L (E 1 ))  ( L (E 2 ))

40 Multiple Transformations (Cont.)

41 Optimizer strategies Heuristic Apply the transformation rules in a specific order such that the cost converges to a minimum Cost based Simulated annealing Randomized generation of candidate QEP Problem, how to guarantee randomness

42 Memoization Techniques How to generate alternative Query Evaluation Plans? Early generation systems centred around a tree representation of the plan Hardwired tree rewriting rules are deployed to enumerate part of the space of possible QEP For each alternative the total cost is determined The best (alternatives) are retained for execution Problems: very large space to explore, duplicate plans, local maxima, expensive query cost evaluation. SQL Server optimizer contains about 300 rules to be deployed.

43 Memoization Techniques How to generate alternative Query Evaluation Plans? Keep a memo of partial QEPs and their cost. Use the heuristic rules to generate alternatives to built more complex QEPs r 1 r 2 r 3 r 4 r 1 r 2 r 2 r 3 r 3 r 4 r 1 r 4 x Level 1 plans r3 r3 r3 r3 Level 2 plans Level n plans r4 r4 r 2 r 1

44 M.Kersten 2008 44 Ditching the optimizers Applications have different characteristics Platforms have different characteristics The actual state of computation is crucial A generic all-encompassing optimizer cost- model does not work

45 M.Kersten 2008 45 Code Inliner. Constant Expression Evaluator. Accumulator Evaluations. Strength Reduction. Common Term Optimizer. Join Path Optimizer. Ranges Propagation. Operator Cost Reduction. Foreign Key handling. Aggregate Groups. Code Parallizer. Replication Manager. Result Recycler. MAL Compiler. Dynamic Query Scheduler. Memo-based Execution. Vector Execution. Alias Removal. Dead Code Removal. Garbage Collector. Try to disambiguate decisions

46 M.Kersten 2008 46 Execution Paradigm Database Structures Query optimizer DBMS Architecture No data from persistent store to the memory trash

47 M.Kersten 2008 47 Execution paradigms The MonetDB kernel is set up to accommodate different execution engines The MonetDB assembler program is Interpreted in the order presented Interpreted in a dataflow driven manner Compiled into a C program Vectorised processing X100 project No data from persistent store to the memory trash

48 M.Kersten 2008 48 MonetDB/x100 Combine Volcano model with vector processing. All vectors together should fit the CPU cache Vectors are compressed Optimizer should tune this, given the query characteristics. ColumnBM (buffer manager) X100 query engine CPU cache networked ColumnBM-s RAM

49 M.Kersten 2008 49 Varying the vector size on TPC-H query 1 mysql, oracle, db2 X100 MonetDB low IPC, overhead RAM bandwidth bound No data from persistent store to the memory trash

50 Query evaluation strategy Pipe-line query evaluation strategy Called Volcano query processing model Standard in commercial systems and MySQL Basic algorithm: Demand-driven evaluation of query tree. Operators exchange data in units such as records Each operator supports the following interfaces:– open, next, close open() at top of tree results in cascade of opens down the tree. An operator getting a next() call may recursively make next() calls from within to produce its next answer. close() at top of tree results in cascade of close down the tree

51 Query evaluation strategy Pipe-line query evaluation strategy Evaluation: Oriented towards OLTP applications Granule size of data interchange Items produced one at a time No temporary files Choice of intermediate buffer size allocations Query executed as one process Generic interface, sufficient to add the iterator primitives for the new containers. CPU intensive Amenable to parallelization

52 Query evaluation strategy Materialized evaluation strategy Used in MonetDB Basic algorithm: for each relational operator produce the complete intermediate result using materialized operands Evaluation: Oriented towards decision support queries Limited internal administration and dependencies Basis for multi-query optimization strategy Memory intensive Amendable for distributed/parallel processing

53 M.Kersten 2008 53 Outline Execution Paradigm Database Structures Indexing

54 M.Kersten 2008 54 Materialized Views Cracking Indexing Traditional B-tree, Hash Find a trusted fortune teller

55 M.Kersten 2008 55 Indices in database systems focus on: All tuples are equally important for fast retrieval There are ample resources to maintain indices MonetDB does not rely on B-trees MonetDB cracks the database into pieces Find a trusted fortune teller

56 M.Kersten 2008 56 Database Cracking Cracking in a database kernel? Every database scan is a physical reorganization This is crazy… Reorganization is utterly expensive… Better to have many (update) users pay less then one (query) user a lot It does not fit the Volcano- style query processor.. It just doesn’t work that way……. D1 Find a trusted fortune teller

57 M.Kersten 2008 57 Cracking in a database kernel? Every database scan is a physical reorganization select * from D1 where ccd=‘j5’ D1 Find a trusted fortune teller

58 M.Kersten 2008 58 Cracking in a database kernel? Every database scan is a physical reorganization D1 create view V as select * from D1 where ccd=‘j5’ D1 select * from D1 where not ccd=‘j5’ Find a trusted fortune teller

59 M.Kersten 2008 59 Cracking in a database kernel? Every database scan is a physical reorganization D1 select * from D1 where ccd=‘j5’ select * from D1 where ccd=‘j7’ D1 select * from D1 where not ( ccd=‘j5’or ccd= ‘j7’) select * from D1 where not ( ccd=‘j5’or ccd= ‘j7’) Find a trusted fortune teller

60 M.Kersten 2008 60 Cracking in a database kernel? Every database scan is a physical reorganization D1 select * from D1 where ccd=‘j5’ select * from D1 where ccd=‘j7’ D1 select * from D1 where not ( ccd=‘j5’or ccd= ‘j7’) Find a trusted fortune teller

61 M.Kersten 2008 61 Cracking in a database kernel? Every database scan is a physical reorganization Don’t pay the complete sort cost during update Incremental predicate index maintenance The first user pays ! D1 Try to avoid useless investments

62 M.Kersten 2008 A simple range query Try to avoid useless investments

63 M.Kersten 2008 TPC-H query 6 Try to avoid useless investments

64 M.Kersten 2008 64 Cracking is easy in a column store and is part of the critical execution path Cracking works under high volume updates Try to avoid useless investments

65 M.Kersten 2008 65 Cstore SQLserver DB2 PostgreSQL MySQL Whoa MonetDB ! Speed lines !


Download ppt "M.Kersten 2008 1 The MonetDB Architecture Martin Kersten CWI Amsterdam."

Similar presentations


Ads by Google