Query Task Model (QTM): Modeling Query Execution with Tasks. Steffen Zeuch and Johann-Christoph Freytag.

1 Query Task Model (QTM): Modeling Query Execution with Tasks. Steffen Zeuch and Johann-Christoph Freytag

2 Motivation
✤ Different DBMSs execute the same QEP using different schedules
✤ Run-time execution, not query optimization
✤ No uniform scheduling format
✤ Query execution in different DBMSs is not comparable
✤ Major differences between DBMSs:
  ✤ Chunk size: size of an operator’s input
  ✤ Scheduling strategy: execution model vs. run-time scheduler
How can different schedules be made comparable, so that we can explain why one schedule performs better than another?

3 Outline
1. Parallel Query Execution
2. QTM: Query Task Model
3. Evaluation
4. Outlook

4 Chunk Size
[Figure: a selection operator consumes tuples t1 to t6 tuple-at-a-time (t1), buffer-at-a-time (t1, t2, t3, then t4, t5, t6), or column-at-a-time (t1 to t6)]
Chunk Size               DBMS
1 tuple                  System R, MySQL, (PostgreSQL)
“Fit into cache”         MonetDB X100, DB2 with BLU
Fixed number of tuples   HyPer
Fixed block size         C-Store
Column                   MonetDB MIL
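The three chunk sizes on this slide can be illustrated with a minimal Python sketch. The helper `chunks` and the six-tuple table are hypothetical names introduced only for illustration; the sketch treats column-at-a-time as a single chunk that spans the whole input, which is a simplifying assumption.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunks(tuples: Sequence[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most chunk_size tuples (hypothetical helper)."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(tuples), chunk_size):
        yield list(tuples[start:start + chunk_size])


# Six tuples, named as on the slide.
table = ["t1", "t2", "t3", "t4", "t5", "t6"]

tuple_at_a_time = list(chunks(table, 1))            # one tuple per chunk
buffer_at_a_time = list(chunks(table, 3))           # fixed-size buffers
column_at_a_time = list(chunks(table, len(table)))  # whole input as one chunk (assumption)
```

An operator would then be called once per chunk, so the chunk size determines how many operator invocations a schedule needs.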

5 Scheduling Strategy
[Figure: QEP over relations R, S, and T with two hash builds, a selection, hash probe (S), and hash probe (R)]

6 Volcano Execution Model (Open-Next-Close Iterator)
[Figure: QEP over relations R, S, and T with two hash builds, a selection, hash probe (S), and hash probe (R); operators request data through "next tuple" calls]
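The open-next-close interface can be sketched in a few lines of Python. This is a minimal illustration, not the code of any of the systems named in the slides: the classes `Scan` and `Selection`, the driver `run`, and the use of `None` as the end-of-stream marker are all assumptions, and the hash build and probe operators from the figure are left out to keep the sketch short.

```python
from typing import Callable, Iterable, Iterator, List, Optional


class Scan:
    """Leaf operator: returns one row per next() call (hypothetical)."""

    def __init__(self, rows: Iterable[int]) -> None:
        self._rows = rows
        self._it: Optional[Iterator[int]] = None

    def open(self) -> None:
        self._it = iter(self._rows)

    def next(self) -> Optional[int]:
        assert self._it is not None, "open() must be called first"
        return next(self._it, None)  # None signals end of stream

    def close(self) -> None:
        self._it = None


class Selection:
    """Pulls rows from its child until one satisfies the predicate."""

    def __init__(self, child: Scan, predicate: Callable[[int], bool]) -> None:
        self._child = child
        self._predicate = predicate

    def open(self) -> None:
        self._child.open()

    def next(self) -> Optional[int]:
        while True:
            row = self._child.next()
            if row is None or self._predicate(row):
                return row

    def close(self) -> None:
        self._child.close()


def run(root: Selection) -> List[int]:
    """Drive the plan from the root, one tuple at a time."""
    root.open()
    out: List[int] = []
    row = root.next()
    while row is not None:
        out.append(row)
        row = root.next()
    root.close()
    return out


result = run(Selection(Scan(range(10)), lambda r: r < 3))
```

Each `next()` call on the root pulls exactly one tuple through the plan, so the execution order is fixed by the operator tree rather than by a separate scheduler.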

7 (Run-time) Scheduler
[Figure: pipeline over T with selection, hash probe (S), and hash probe (R); the tasks Sel(t1), Sel(t2), Prob_S(t1), Prob_S(t2), Prob_R(t1), and Prob_R(t2) are ordered along a time axis, once for spatial locality and once for temporal locality]
Further optimization criteria: I/O, NUMA, or memory usage

8 Dynamic Load Balancing
[Figure: a QEP over R, S, and T with two joins (⋈) and two selections (σ), distributed across CPU1 and CPU2]

9 DBMS Landscape
[Figure: DBMSs placed in a grid of chunk size vs. scheduling strategy]
Chunk size: tuple-at-a-time, buffer-at-a-time, column-at-a-time
Scheduling strategy: Volcano execution model, (run-time) scheduler, dynamic load balancing
Systems shown: System R, MySQL, PostgreSQL (in two cells), DB2, MonetDB X100, DB2 BLU, StagedDB, HyPer, MonetDB MIL, SAP HANA

10 Outline
1. Parallel Query Execution
2. QTM: Query Task Model
3. Evaluation
4. Outlook

11 QTM: Query Task Model
✤ Idea: A model that describes parallel query execution with tasks
  ✤ QEP: Queue of tasks
  ✤ Task: Encapsulates a piece of work on some data
✤ Goals:
  ✤ Open a design space for DBMS schedules
  ✤ Make the main aspects of query scheduling comparable:
    ✤ Execution order, degree of parallelism and thread coordination, and partitioning

12 Query Task Model
[Figure: a task combines work, data, and processing strategies; tasks T1, T2, and T3 sit in a task queue; tuples t1, t2, and t3 from a table sit in a data queue]
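The idea that a QEP becomes a queue of tasks, each encapsulating a piece of work on some data, can be sketched as follows. This is an illustrative sketch only: the `Task` class, the `execute` loop, and the selection predicate are hypothetical and not taken from the authors' implementation, and the sketch runs on a single thread.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class Task:
    """A piece of work bound to a piece of data (hypothetical structure)."""
    name: str
    work: Callable[[List[int]], List[int]]
    data: List[int]

    def run(self) -> List[int]:
        return self.work(self.data)


def execute(queue: Deque[Task]) -> List[int]:
    """Process tasks in queue order; the queue order is the schedule."""
    results: List[int] = []
    while queue:
        task = queue.popleft()
        results.extend(task.run())
    return results


def select_lt_5(chunk: List[int]) -> List[int]:
    """Example work item: a selection with an assumed predicate."""
    return [v for v in chunk if v < 5]


table = [1, 7, 3, 9, 4, 8]
# Three tasks T1..T3, each covering two tuples of the table.
queue: Deque[Task] = deque(
    Task(f"T{i + 1}", select_lt_5, table[start:start + 2])
    for i, start in enumerate(range(0, len(table), 2))
)
output = execute(queue)
```

Reordering the queue or changing how much data each task covers yields a different schedule for the same query, which is what makes schedules comparable in this model.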

13 QTM Transformation: Input
[Figure: inputs to the transformation: QEP, hardware architecture, table format]

14 QTM Transformation
[Figure: QEP, then choosing hash join, then max. pipelines + dependency graph]

15 QTM: Task Configuration
[Figure: max. pipelines + dependency graph, then task configurations (task blueprints)]

16 QTM: Tasks
[Figure: task configurations (task blueprints), then instantiation, then set of tasks (TC instantiation)]
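The step from task configurations (blueprints) to a set of tasks can be pictured as instantiating one blueprint once per data partition. The sketch below is an assumption about how such an instantiation could look; `TaskConfiguration`, `Task`, `instantiate`, and their fields are hypothetical names, not the authors' API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TaskConfiguration:
    """Blueprint: which operator to run and how many tuples each task covers."""
    operator: str
    buffer_size: int


@dataclass(frozen=True)
class Task:
    """Concrete task: a blueprint bound to a tuple range [first, last)."""
    operator: str
    first: int
    last: int


def instantiate(config: TaskConfiguration, num_tuples: int) -> List[Task]:
    """Create one task per buffer-sized partition of the input."""
    return [
        Task(config.operator, start, min(start + config.buffer_size, num_tuples))
        for start in range(0, num_tuples, config.buffer_size)
    ]


tasks = instantiate(TaskConfiguration("selection", 4), 10)
```

With 10 tuples and a buffer size of 4, this produces three tasks; the last one covers the remaining two tuples.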

17 QTM: Implementation
[Figure: compile-time and run-time parts of the implementation]

18 Outline
1. Parallel Query Execution
2. QTM: Query Task Model
3. Evaluation
4. Outlook

19 Evaluation: Scenario
Schedule workload:
Tuples per relation   30M
Selection             < 25M
S1 values             0, 1, 2, …
S2 values             0, 2, 4, …
S3 values             0, 4, 8, …

20 Evaluation: Configuration
Schedule        Buffer Size   Tasks per Op   Total Tasks
1) Tup – Pipe   1             30M            90M
2) Tup – Mat    1             30M            150M
3) Tup – Seq    1             30M            150M
4) Buf – CL     4             7.5M           22.5M
5) Buf – L1     2,048         14,649         43,947
6) Buf – L2     16,384        1,832          5,496
7) Buf – L3     491,520       62             186
8) Op – Mat     7.5M          4              20
9) Op – Seq     7.5M          4              20
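The task counts in this configuration follow from simple arithmetic on the 30M tuples per relation stated in the scenario. The sketch below assumes that the buffer size is measured in tuples and that the total is the per-operator count multiplied by the number of operators (3 for the pipelined and buffered schedules, 5 for the Mat and Seq schedules). Both assumptions are inferred from the listed numbers rather than stated in the slides, and the function names are hypothetical.

```python
TUPLES_PER_RELATION = 30_000_000  # from the evaluation scenario


def tasks_per_operator(buffer_size: int) -> int:
    """Number of buffer-sized tasks needed to cover one relation (ceiling division)."""
    return -(-TUPLES_PER_RELATION // buffer_size)


def total_tasks(buffer_size: int, num_operators: int) -> int:
    """Total tasks, assuming every operator gets the same number of tasks."""
    return tasks_per_operator(buffer_size) * num_operators


buf_l1 = (tasks_per_operator(2_048), total_tasks(2_048, 3))
buf_l2 = (tasks_per_operator(16_384), total_tasks(16_384, 3))
buf_l3 = (tasks_per_operator(491_520), total_tasks(491_520, 3))
op_mat = (tasks_per_operator(7_500_000), total_tasks(7_500_000, 5))
```

Under these assumptions the computed pairs match the Buf – L1, Buf – L2, Buf – L3, and Op – Mat rows of the configuration.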

21 Evaluation: Runtimes

22 Evaluation: Sampling
[Figure panels: data-related misses; instruction-related misses]

23 Evaluation: Miss Distribution

24 Evaluation: Scalability

25 Evaluation: Insights
✤ Tradeoff between data and instruction cache performance
✤ Sweet spot: largest private cache size vs. slightly larger buffer
✤ Medium-sized tasks are data-efficient:
  ✤ Pros: Buffer fits entirely into cache, high data locality
  ✤ Cons: High number of tasks and instructions
✤ Large tasks are instruction-efficient:
  ✤ Pros: Fewer instructions and tasks, high instruction locality
  ✤ Cons: More data cache misses if the cache size is exceeded
✤ QTM: Cache performance can be adjusted by buffer size

26 Outline
1. Parallel Query Execution
2. QTM: Query Task Model
3. Evaluation
4. Outlook

27 Outlook
✤ Contributions:
  ✤ QTM: A model for parallel query execution using tasks
  ✤ Open a design space for DBMS schedules
  ✤ Make different schedules present in different DBMSs comparable
✤ Future Work:
  ✤ Cost model
  ✤ Transformation process for an arbitrary QEP
Thanks!

