Parallel Universe Fast Parallel MySQL Server. Target Markets Database Servers Data Warehouse Servers Data Analytics Servers.


1 Parallel Universe Fast Parallel MySQL Server

2 Target Markets: Database Servers, Data Warehouse Servers, Data Analytics Servers

3 Parallel Universe
Parallel Universe is the industry's only SQL server with a fast parallel query engine. It was created by extending the MySQL server architecture. Speed is achieved by processing tables in parallel, using the multiple cores/CPUs of the server hardware. Because it makes fast query processing available for data analysis, it is an ideal data warehouse server. With Parallel Universe, you can also deploy less costly server hardware for the same query load or task. Parallel Universe is released under the GPL license and is fully compatible with MySQL and Percona servers. It is also available as part of Linux OS images at Amazon Web Services and www.GoGrid.com.

4 New Technology
Today's microprocessors, which provide the computational resources for an RDBMS (Relational Database Management System), contain multiple CPU cores, each capable of executing its own code independently. If the RDBMS server can break its task down into a number of smaller subtasks, those subtasks can be performed concurrently by the multiple cores, resulting in faster execution. This new technology speeds up the execution phase of a query.

5 Parallel Universe is an Extension to MySQL Server
[Diagram: MySQL Server + Fast Parallel Query Engine = Parallel Universe]

6 MySQL Query Processing
1. Background Information
A MySQL query determines the combinations of records from the given tables that satisfy a given condition:
    SELECT field_list FROM table_list WHERE condition
Here field_list names the fields from the tables to output, and condition specifies the relationships between fields. Each table consists of a number of records, and each record comprises a number of fields (attributes).
The server parses the query and translates it into the optimum query execution plan, which specifies the order in which the tables are processed, how records are read from the tables, the conditions these records must satisfy, and the method used to process and output records.
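To make the meaning of such a query concrete, here is a minimal Python sketch of what "combinations of records that satisfy a condition" means. It is not how the server executes a query: it simply tests every combination. The function name `select`, its parameters, and the toy tables are assumptions made for illustration only.

```python
from itertools import product

def select(field_list, table_list, condition):
    """Return the requested fields for every record combination that satisfies the condition.

    field_list: list of (table_name, field_name) pairs to output
    table_list: dict mapping table name -> list of records (dicts)
    condition:  function taking {table_name: record} and returning True or False
    """
    names = list(table_list)
    results = []
    # Try every combination of one record per table (no plan, no indexes).
    for combo in product(*(table_list[name] for name in names)):
        row = dict(zip(names, combo))
        if condition(row):
            results.append(tuple(row[t][f] for t, f in field_list))
    return results

# Hypothetical toy data, not taken from the presentation.
t1 = [{"idn": 0, "grp": 0}, {"idn": 1, "grp": 3}]
t2 = [{"idn": 1, "grp": 3}, {"idn": 2, "grp": 6}]

rows = select(
    field_list=[("t1", "idn"), ("t2", "grp")],
    table_list={"t1": t1, "t2": t2},
    condition=lambda r: r["t1"]["idn"] == r["t2"]["idn"],
)
print(rows)  # → [(1, 3)]
```

A real execution plan avoids testing every combination by choosing a table order and access methods, which is what the following slides describe.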

7 Current Technology
All operations are carried out by a single thread*.
Optimization Phase: Parse & Optimize Query, producing the Query Execution Plan (Table Order, Table Access Methods, Record Match Conditions, Process Record Method).
Execution Phase: Execute and Process Records, turning Record Combinations into the Result.
*A thread is a flow of code execution scheduled by the operating system to run on a particular core of the CPU.

8 New Technology
The plan is executed by multiple (n) threads.
Query Execution Plan: Table Order, Table Access Methods, Record Match Conditions, Process Record Method.
Execution Phase: Execute (Threads 1 to n) and Process Records (Thread n), turning Record Combinations into the Result.

9 2. Example of Current Technology with 3 Tables: recursive execution by a single thread
1st table (t1): Read a record (possibly using an index) and store it (used fields only). If it satisfies the condition associated with this table, move down to the next table; otherwise continue reading records. When there are no more records to read, the procedure is finished.
2nd table (t2): Read a record (possibly using an index derived from the record of t1). If it satisfies the condition associated with this table, which depends on records from this table and t1, move down to the next table; otherwise continue reading records. When there are no more records to read, move back up to the previous table.
Last table (t3): Read a record (possibly using an index derived from the records of t2 and t1). If it satisfies the condition associated with this table, which depends on records from this table, t2 and t1, process and output this particular combination of records according to the process record method of the query execution plan. In either case, continue reading records. When there are no more records to read, move back up to the previous table.
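The single-threaded procedure above can be sketched as a recursive nested-loop join. This is a simplified illustration, not the MySQL implementation: it scans each table fully instead of using indexes, and the names `nested_loop_join`, `conditions` and `process` are assumptions introduced here.

```python
def nested_loop_join(tables, conditions, process, depth=0, prefix=()):
    """Join tables in plan order, entirely on the calling thread.

    tables:     list of record lists, in the order chosen by the plan
    conditions: conditions[i](prefix, record) checks a record of table i against
                the records already chosen from tables 0..i-1
    process:    called with each complete record combination
    """
    for record in tables[depth]:                 # read the next record of this table
        if not conditions[depth](prefix, record):
            continue                             # condition failed: keep reading
        combo = prefix + (record,)
        if depth + 1 == len(tables):
            process(combo)                       # last table: output the combination
        else:
            # move down to the next table
            nested_loop_join(tables, conditions, process, depth + 1, combo)
    # no more records: returning moves back up to the previous table

# Hypothetical toy data in the spirit of the benchmark tables (4 rows instead of 1 million).
N = 4
rows = [{"idn": i, "rev_idn": N - 1 - i} for i in range(N)]
t1, t2, t3 = rows, rows, rows

conditions = [
    lambda prev, r: True,                                # t1: no restriction
    lambda prev, r: prev[0]["idn"] == r["idn"],          # t1.idn = t2.idn
    lambda prev, r: prev[1]["rev_idn"] == r["rev_idn"],  # t2.rev_idn = t3.rev_idn
]

matches = []
nested_loop_join([t1, t2, t3], conditions, matches.append)
print(len(matches))  # → 4
```

Because every level runs on the same call stack, only one table is being read at any moment, which is the limitation the parallel engine addresses.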

10 Current Technology
[Diagram: Table 1, Table 2 and Table 3 are each processed by Thread 1, with records passing from table to table.]
Tables take turns being processed by Thread 1.

11 3. Same Example, New Technology: executed by 3 threads
1st table (t1, executed by thread 1): Read a record and, if it satisfies the condition, insert it into the inter table buffer between this table and the next (waiting if the buffer is full). Continue reading records. When there are no more records to read, processing of this table is finished.
2nd table (t2, executed by thread 2): Wait for a record from t1 to become available in the inter table buffer between this table and the previous one. Read a record of this table and, if it satisfies the condition, insert this record together with the record of t1 into the next inter table buffer. Continue reading records. When there are no more records to read, remove the record of t1 from the buffer and wait for the next record.
Last table (t3, executed by thread 3): Wait for a record set of t2 and t1 to become available in the inter table buffer. Read a record of this table and, if it satisfies the condition, process and output this particular combination of records. Continue reading records. When there are no more records to read, remove the record set of t2 and t1 from the buffer and wait for the next record set.
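The three-thread pipeline described above can be sketched with Python's standard `threading` and `queue` modules, using a bounded `queue.Queue` as a stand-in for each inter table buffer. This is an illustration of the structure only, under several assumptions: the function names and the `DONE` sentinel are invented here, each stage removes a record set from the buffer before scanning its table rather than after, and CPython threads will not show the speedup a native multi-core engine can.

```python
import queue
import threading

DONE = object()  # sentinel: no more record sets will arrive from upstream

def first_table(records, condition, out_buf):
    for rec in records:
        if condition((), rec):
            out_buf.put((rec,))            # blocks while the buffer is full
    out_buf.put(DONE)

def middle_table(records, condition, in_buf, out_buf):
    while True:
        prefix = in_buf.get()              # waits for a record set from upstream
        if prefix is DONE:
            out_buf.put(DONE)
            return
        for rec in records:
            if condition(prefix, rec):
                out_buf.put(prefix + (rec,))

def last_table(records, condition, in_buf, results):
    while True:
        prefix = in_buf.get()
        if prefix is DONE:
            return
        for rec in records:
            if condition(prefix, rec):
                results.append(prefix + (rec,))   # "process and output"

def parallel_join(t1, t2, t3, c1, c2, c3, buffer_size=8):
    buf12 = queue.Queue(maxsize=buffer_size)      # buffer between t1 and t2
    buf23 = queue.Queue(maxsize=buffer_size)      # buffer between t2 and t3
    results = []
    threads = [
        threading.Thread(target=first_table, args=(t1, c1, buf12)),
        threading.Thread(target=middle_table, args=(t2, c2, buf12, buf23)),
        threading.Thread(target=last_table, args=(t3, c3, buf23, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# Hypothetical toy data: 4 rows per table instead of 1 million.
N = 4
rows = [{"idn": i, "rev_idn": N - 1 - i} for i in range(N)]
joined = parallel_join(
    rows, rows, rows,
    lambda prev, r: True,
    lambda prev, r: prev[0]["idn"] == r["idn"],
    lambda prev, r: prev[1]["rev_idn"] == r["rev_idn"],
)
print(len(joined))  # → 4
```

The bounded queue gives the two behaviours the slide mentions: a producer waits when the buffer is full, and a consumer waits when it is empty.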

12 New Technology - Fast Parallel Query Engine
Tables in a query are processed in parallel, using multiple cores/CPUs.
[Diagram: Table 1 is processed by Thread 1, Table 2 by Thread 2 and Table 3 by Thread 3. Records are read from each table, and record sets* pass from each table to the next.]
*A record set is a set of records from the tables processed thus far.

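13 Table Processing by Thread i
[Diagram: Thread i takes record sets from the incoming inter table buffer* and reads records from its table. If a record matches, it is appended to the record set, and the record set is output to the next inter table buffer.]
*The inter table buffer is a queue for record sets between tables that allows 2 threads to operate independently.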

14 MySQL Compatible
Use Existing Databases
Use Existing Queries
Specify Tables to be Processed in Parallel:
    set @parallel_table_list="mydb.t1,mydb.t2,mydb.t3"
    (default=null, parallel processing disabled)

15 Benchmarks
Hardware: Intel dual Xeon processors (2x6 cores) with 24GB memory, running CentOS 6.2 64-bit.
Warm cache is used: timings are for the 2nd and subsequent runs, when the tables have already been loaded into memory.
For InnoDB benchmarks: --innodb_buffer_pool_size=16G
t1 (
    `region` char(1) DEFAULT NULL,     (A thru F and repeats)
    `idn` int(11) DEFAULT NULL,        (0 thru 999,999)
    `rev_idn` int(11) DEFAULT NULL,    (999,999 thru 0)
    `grp` int(11) DEFAULT NULL,        (0 thru 99 in steps of 3, modulo 100)
) ENGINE=InnoDB DEFAULT CHARSET=latin1   (1 million records)
    region  idn      rev_idn  grp
    A       0        999,999  0
    B       1        999,998  3
    C       2        999,997  6
    ...
    C       999,998  1        94
    D       999,999  0        97
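The column descriptions and sample rows for t1 are consistent with each row being a simple function of its position. The Python sketch below reproduces the five sample rows; the formulas (in particular grp = 3 * idn mod 100) are inferred from the slide rather than stated by it, and the name `t1_row` is introduced here for illustration.

```python
N = 1_000_000          # number of records in t1, per the slide
REGIONS = "ABCDEF"     # "A thru F and repeats"

def t1_row(i, n=N):
    """Return (region, idn, rev_idn, grp) for the i-th row, under the inferred formulas."""
    return (REGIONS[i % len(REGIONS)], i, n - 1 - i, (3 * i) % 100)

# The first three and last two rows shown on the slide.
for i in (0, 1, 2, N - 2, N - 1):
    print(t1_row(i))
# → ('A', 0, 999999, 0)
# → ('B', 1, 999998, 3)
# → ('C', 2, 999997, 6)
# → ('C', 999998, 1, 94)
# → ('D', 999999, 0, 97)
```

Under these formulas idn and rev_idn are each unique within a table, which matches the estimate of 1 row per key lookup in the execution plans.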

16 t2: same as t1 with a key (idn)
t3: same as t1 with a key (rev_idn)
t4: same as t1 with a key (idn)
t5: same as t1 with a key (rev_idn)
t6: same as t1 with a key (idn)
t7: same as t1 with a key (rev_idn)
t8: same as t1 with a key (idn)
4 InnoDB Tables Query
    SELECT count(*) FROM t1,t2,t3,t4
    WHERE t1.idn=t2.idn and t2.rev_idn=t3.rev_idn and t3.idn=t4.idn and t4.grp>=0;
Execution Plan
    id  select  table  type  p_keys   key  keylen  ref         rows     Extra
    1   SIMPLE  t1     ALL   NULL                              1005444
    1   SIMPLE  t2     ref   idn           5       t1.idn      1        where
    1   SIMPLE  t3     ref   rev_idn       5       t2.rev_idn  1        where
    1   SIMPLE  t4     ref   idn           5       t3.idn      1        where
Results
    Non-parallel  Parallel
    7.3 sec       3.8 sec

17 8 InnoDB Tables Query
    SELECT count(*) FROM t1,t2,t3,t4,t5,t6,t7,t8
    WHERE t1.idn=t2.idn and t2.rev_idn=t3.rev_idn and t3.idn=t4.idn
      and t4.rev_idn=t5.rev_idn and t5.idn=t6.idn and t6.rev_idn=t7.rev_idn
      and t7.idn=t8.idn and t8.grp>=0;
Execution Plan
    id  select  table  type  p_keys   key  keylen  ref         rows     Extra
    1   SIMPLE  t1     ALL   NULL                              1005444
    1   SIMPLE  t2     ref   idn           5       t1.idn      1        where
    1   SIMPLE  t3     ref   rev_idn       5       t2.rev_idn  1        where
    1   SIMPLE  t4     ref   idn           5       t3.idn      1        where
    1   SIMPLE  t5     ref   rev_idn       5       t4.rev_idn  1        where
    1   SIMPLE  t6     ref   idn           5       t5.idn      1        where
    1   SIMPLE  t7     ref   rev_idn       5       t6.rev_idn  1        where
    1   SIMPLE  t8     ref   idn           5       t7.idn      1        where
Results
    Non-parallel  Parallel
    18.0 sec      6.9 sec

18 For MyISAM benchmarks
t9: same as t1 with ENGINE=MyISAM (1 million records, same as t1)
t10: same as t9 with a key (idn)
t11: same as t9 with a key (rev_idn)
t12: same as t9 with a key (idn)
...
t23: same as t9 with a key (rev_idn)
t24: same as t9 with a key (idn)
4 MyISAM Tables Query
    SELECT count(*) FROM t9,t10,t11,t12
    WHERE t9.idn=t10.idn and t10.rev_idn=t11.rev_idn and t11.idn=t12.idn and t12.grp>=0;
Execution Plan: same as 4 InnoDB Tables Query
Results
    Non-parallel  Parallel
    12.3 sec      4.5 sec

19 8 MyISAM Tables Query
    SELECT count(*) FROM t9,t10,t11,t12,t13,t14,t15,t16
    WHERE t9.idn=t10.idn and t10.rev_idn=t11.rev_idn and t11.idn=t12.idn
      and t12.rev_idn=t13.rev_idn and t13.idn=t14.idn and t14.rev_idn=t15.rev_idn
      and t15.idn=t16.idn and t16.grp>=0;
Execution Plan: same as 8 InnoDB Tables Query
Results
    Non-parallel  Parallel
    31.0 sec      6.2 sec

20 16 MyISAM Tables Query
Server: Amazon Web Services Cluster Compute Eight Extra Large Instance (cc2.8xlarge) with 2 Intel Xeon E5-2670 processors (2x8 cores), 60.5 GB of memory and 3370 GB of instance storage, running Cluster Compute Amazon Linux 64-bit OS.
    SELECT straight_join count(*)
    FROM t9,t10,t11,t12,t13,t14,t15,t16,t17,t18,t19,t20,t21,t22,t23,t24
    WHERE t9.idn=t10.idn and t10.rev_idn=t11.rev_idn and t11.idn=t12.idn
      and t12.rev_idn=t13.rev_idn and t13.idn=t14.idn and t14.rev_idn=t15.rev_idn
      and t15.idn=t16.idn and t16.rev_idn=t17.rev_idn and t17.idn=t18.idn
      and t18.rev_idn=t19.rev_idn and t19.idn=t20.idn and t20.rev_idn=t21.rev_idn
      and t21.idn=t22.idn and t22.rev_idn=t23.rev_idn and t23.idn=t24.idn
      and t24.grp>=0;
straight_join is used to reduce query optimization time.
Execution Plan: similar to 8 InnoDB Tables Query
Results
    Non-parallel  Parallel
    38.6 sec      6.8 sec
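To compare the runs, the speedup of each benchmark can be computed as the non-parallel time divided by the parallel time. The short Python sketch below does this arithmetic on the timings reported in slides 16 through 20; the labels and variable names are introduced here, and the 16-table run used different hardware (the AWS instance), so it is not directly comparable with the others.

```python
# (label, non-parallel seconds, parallel seconds) as reported on the slides
benchmarks = [
    ("4 InnoDB tables", 7.3, 3.8),
    ("8 InnoDB tables", 18.0, 6.9),
    ("4 MyISAM tables", 12.3, 4.5),
    ("8 MyISAM tables", 31.0, 6.2),
    ("16 MyISAM tables (AWS)", 38.6, 6.8),
]

# Speedup = non-parallel time / parallel time, rounded to one decimal place.
speedups = {name: round(serial / parallel, 1) for name, serial, parallel in benchmarks}

for name, factor in speedups.items():
    print(f"{name}: {factor}x")
# → 4 InnoDB tables: 1.9x
# → 8 InnoDB tables: 2.6x
# → 4 MyISAM tables: 2.7x
# → 8 MyISAM tables: 5.0x
# → 16 MyISAM tables (AWS): 5.7x
```

In these figures the speedup grows with the number of joined tables, which fits a design that assigns one thread per table.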

