1 Elasca: Workload-Aware Elastic Scalability for Partition Based Database Systems Taha Rafiq MMath Thesis Presentation 24/04/2013

2 Outline
1. Introduction & Motivation
2. VoltDB & Elastic Scale-Out Mechanism
3. Partition Placement Problem
4. Workload-Aware Optimizer
5. Experiments & Results
6. Supporting Multi-Partition Transactions
7. Conclusion

3 INTRODUCTION & MOTIVATION

4 DBMS Scalability: Replication and Partitioning

5 Traditional (DBMS) Scalability
Higher Load → Add Resources → Better Performance
The ability of a system to be enlarged to handle a growing amount of work.
Drawbacks: expensive, requires downtime.

6 Elastic (DBMS) Scalability
Higher Load → Dynamically Add Resources → Better Performance
The use of computing resources that vary dynamically to meet a variable workload.
No downtime.

7 Elastically Scaling a Partition Based DBMS: Re-Partitioning
[Diagram: Partition 1 on Node 1 is re-partitioned into Partition 1 on Node 1 and Partition 2 on Node 2; scale out splits partitions, scale in merges them.]

8 Elastically Scaling a Partition Based DBMS: Partition Migration
[Diagram: Node 1 holds P1–P4; on scale out, P3 and P4 migrate to Node 2, leaving P1 and P2 on Node 1; scale in moves them back.]

9 Partition Migration for Elastic Scalability
Mechanism: how to add/remove nodes and move partitions.
Policy/Strategy: which partitions to move, when, and where, during scale out/scale in.

10 Elasca = Elastic Scale-Out Mechanism + Partition Placement & Migration Optimizer

11 VOLTDB & ELASTIC SCALE-OUT MECHANISM

12 What is VoltDB?
In-memory, partition-based DBMS: no disk access = very fast.
Shared-nothing architecture, serial execution: no locks.
Stored procedures: no arbitrary transactions.
Replication: fault tolerance & durability.

13 VoltDB Architecture
[Diagram: three nodes, each running a Client Interface, an Initiator, and two execution sites (ES1, ES2); partitions are replicated across nodes (P1/P2, P3/P1, P2/P3), and client threads connect to the client interfaces.]

14 Single-Partition Transactions
[Diagram: same three-node cluster; a client routes a single-partition transaction to the one execution site that owns the target partition.]

15 Multi-Partition Transactions
[Diagram: same three-node cluster; a single execution site (ES1) coordinates a multi-partition transaction spanning partitions on multiple nodes.]

16 Elastic Scale-Out Mechanism
[Diagram: a scale-out step; a new node with its own Initiator, Client Interface, and execution sites (ES3, ES4) is added, and partitions (P1 and P4 in the example) migrate to it from the existing node's execution sites.]

17 Overcommitting Cores
VoltDB suggests keeping partitions per node < cores per node, which wastes resources when load is low or data access is skewed.
Idea: aggregate extra partitions on each node and scale out when load increases.
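For example, with 4 cores per node this guideline allows at most 3 partitions per node, so 8 partitions always occupy at least 3 nodes; with overcommitting, all 8 partitions can be consolidated onto one node at low load and spread across more nodes only as load rises.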

18 PARTITION PLACEMENT PROBLEM

19 Given…
Cluster and system specifications: number of CPU cores, memory, maximum number of nodes.

20 Given…

21 Given…

22 Given…
Current partition-to-node assignment: the table on the slide records which of Nodes 1–3 currently hosts each of partitions P1–P8.

23 Find…
Optimal partition-to-node assignment for the next time interval:

Partition   Node 1   Node 2   Node 3
P1          ?        ?        ?
P2          ?        ?        ?
P3          ?        ?        ?
P4          ?        ?        ?
P5          ?        ?        ?
P6          ?        ?        ?
P7          ?        ?        ?
P8          ?        ?        ?
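One natural way to formalize the assignment is with binary placement variables (a sketch in our own notation, not necessarily the thesis' exact formulation):

$$x_{p,n} \in \{0,1\}, \qquad \sum_{n} x_{p,n} = k \quad \text{for every partition } p,$$

where $x_{p,n} = 1$ means one of partition $p$'s $k$ replicas is placed on node $n$; since the variables are binary, the $k$ replicas necessarily land on distinct nodes.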

24 Optimization Objectives
Maximize throughput: match the performance of a static, fully provisioned system.
Minimize resources used: use the minimum number of nodes required to meet performance demands.

25 Optimization Objectives
Minimize data movement: data movement adversely affects system performance and incurs network costs.
Balance load effectively: minimizes the risk of overloading a node during the next time interval.

26 WORKLOAD-AWARE OPTIMIZER

27 System Overview

28 Statistics Collected
α: the maximum number of transactions that can be executed on a partition per second (the maximum capacity of an execution site).
β: the CPU overhead of host-level tasks (how much CPU capacity the Initiator uses).

29 Effect of β

30 Estimating CPU Load
Average CPU load per node = the CPU load generated by each partition on the node (summed) + the average CPU load of host-level tasks per node.
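Using the statistics from slide 28, a plausible form of this estimate is the following sketch (our notation; $r_p$ is the observed transaction rate on partition $p$, and the thesis' exact formula may differ):

$$\hat{L}_n = \sum_{p \in P(n)} \frac{r_p}{\alpha} + \beta,$$

where $r_p/\alpha$ is the fraction of an execution site's capacity consumed by partition $p$, $P(n)$ is the set of partitions on node $n$, and $\beta$ accounts for the Initiator's host-level work.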

31 Optimizer Details
Mathematical optimization vs. heuristics: we take the mathematical-optimization route, formulating the problem as a Mixed-Integer Linear Program (MILP).
The MILP can be solved using any general-purpose solver (we use IBM ILOG CPLEX).
It is applicable to a wide variety of scenarios.

32 Objective Function
Minimizes data movement as the primary objective and balances load as the secondary objective.
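In the spirit of this slide, the objective can be sketched as a weighted sum in which a small weight $\varepsilon$ makes load balance a tie-breaker (our notation, with $s_p$ the size of partition $p$ and $x^{\text{old}}$ the current placement; the thesis' exact form may differ):

$$\min \;\; \sum_{p,\,n \,:\, x^{\text{old}}_{p,n} = 0} s_p\, x_{p,n} \;+\; \varepsilon \cdot \max_n \left| \hat{L}_n - \bar{L} \right|$$

The first term is the amount of data moved; the second penalizes deviation of each node's estimated load $\hat{L}_n$ from the mean $\bar{L}$.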

33 Effect of ε

34 Minimizing Resources Used
Calculate the minimum number of nodes that can handle the load of all the partitions (using a non-integer assignment).
Explicitly tell the optimizer how many nodes to use.
If the optimizer cannot find a solution with the minimum N nodes, it tries again with N + 1 nodes.
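A minimal sketch of this search loop, assuming a hypothetical solve_milp helper that returns None when its formulation is infeasible (the normalized capacity model and all names here are ours):

```python
import math

def minimize_nodes(partition_loads, beta, max_nodes, solve_milp):
    # Non-integer lower bound: total partition load divided by the usable
    # capacity of one node (1.0 in normalized units, minus host-level overhead).
    n = max(1, math.ceil(sum(partition_loads) / (1.0 - beta)))
    while n <= max_nodes:
        plan = solve_milp(num_nodes=n)   # hypothetical MILP call
        if plan is not None:
            return n, plan               # feasible placement found
        n += 1                           # infeasible: retry with N + 1 nodes
    raise RuntimeError("no feasible placement within max_nodes")
```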

35 Constraints
Replication: replicas of a given partition must be assigned to different nodes.
CPU capacity: the summed load of a node's partitions must be less than the node's capacity.
Memory capacity: all the partitions assigned to a node must fit in its memory.
Host-level tasks: the overhead of host-level tasks must not exceed the capacity of a single core.
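Putting the objective and constraints together, here is a toy version of the placement MILP in PuLP (the thesis solves its formulation with IBM ILOG CPLEX; the data below is invented, and the replication and host-level-task constraints are omitted for brevity):

```python
import pulp

P = ["P1", "P2", "P3", "P4"]          # partitions (toy example)
N = ["N1", "N2"]                      # nodes available this interval
load = {"P1": 0.4, "P2": 0.3, "P3": 0.2, "P4": 0.1}  # est. CPU load per partition
size = {"P1": 1.0, "P2": 1.0, "P3": 0.5, "P4": 0.5}  # partition sizes (GB)
cur = {("P1", "N1"), ("P2", "N1"), ("P3", "N1"), ("P4", "N1")}  # current placement
beta = 0.1                            # host-level (Initiator) overhead per node
mem, eps = 4.0, 0.01                  # per-node memory (GB), load-balance weight

prob = pulp.LpProblem("placement", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", [(p, n) for p in P for n in N], cat="Binary")
imb = pulp.LpVariable("imbalance", lowBound=0)

# Objective: data moved (GB) + eps * load imbalance.
moved = pulp.lpSum(size[p] * x[(p, n)] for p in P for n in N if (p, n) not in cur)
prob += moved + eps * imb

avg = (sum(load.values()) + len(N) * beta) / len(N)  # mean per-node load
for n in N:
    node_load = pulp.lpSum(load[p] * x[(p, n)] for p in P) + beta
    prob += node_load <= 1.0          # CPU capacity (normalized)
    prob += node_load - avg <= imb    # linearized |load - avg| <= imb
    prob += avg - node_load <= imb
    prob += pulp.lpSum(size[p] * x[(p, n)] for p in P) <= mem  # memory capacity
for p in P:
    prob += pulp.lpSum(x[(p, n)] for n in N) == 1  # each partition placed once

prob.solve()
print([(p, n) for p in P for n in N if x[(p, n)].value() == 1])
```

Because ε is small, the solver prefers the placement that moves the least data and uses load balance only to break ties.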

36 Staggering Scale-In
A fluctuating workload can result in excessive data movement; staggering scale-in mitigates this problem.
Delay scaling in by s time steps.
Slightly more resources are used in exchange for stability.
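A minimal sketch of the staggering rule, assuming the optimizer proposes a node count each interval and the caller feeds the returned streak back in (the counter-based trigger is our illustration):

```python
def staggered_target(current_nodes, proposed_nodes, streak, s):
    """Return (node count to use, updated scale-in streak)."""
    if proposed_nodes >= current_nodes:
        return proposed_nodes, 0        # scale out (or hold) immediately
    streak += 1                         # one more interval recommending scale-in
    if streak >= s:
        return proposed_nodes, 0        # demand stayed low for s steps: scale in
    return current_nodes, streak        # hold the extra nodes for stability
```

A larger s trades slightly higher resource usage for fewer oscillating migrations.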

37 EXPERIMENTAL EVALUATION

38 Optimizers Evaluated
ELASCA: our workload-aware optimizer.
ELASCA-S: ELASCA with staggered scale-in.
OFFLINE: an offline optimizer that minimizes resources used and data movement.
GREEDY: a greedy first-fit optimizer.
SCO: a static, fully provisioned system (no optimization).

39 Benchmarks Used
TPC-C: modified to make it cleanly partitioned and fit in memory (3.6 GB).
TATP: Telecommunication Application Transaction Processing benchmark (250 MB).
YCSB: Yahoo! Cloud Serving Benchmark with a 50/50 read/write ratio (1 GB).

40 Dynamic Workloads
Varying the aggregate request rate: periodic waveforms (sine, triangle, sawtooth).
Skewing the data access: temporal skew; statistical distributions (uniform, normal, categorical, Zipfian).
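A minimal sketch of how such workloads can be generated with NumPy (the waveform shapes and Zipfian skew follow the slide; all parameter values are invented):

```python
import numpy as np

def request_rate(t, base=1000.0, amp=800.0, f=1.0, period=3600.0, shape="sine"):
    """Aggregate request rate at time t (seconds) for one of the waveforms."""
    phase = (f * t / period) % 1.0
    if shape == "sine":
        w = np.sin(2 * np.pi * phase)
    elif shape == "triangle":
        w = 4 * abs(phase - 0.5) - 1      # triangle wave in [-1, 1]
    else:                                  # sawtooth
        w = 2 * phase - 1
    return base + amp * w

# Zipfian access skew over 8 partitions (categorical draw per request).
ranks = np.arange(1, 9)
probs = (1.0 / ranks) / (1.0 / ranks).sum()   # Zipf with exponent 1
partition_of_request = np.random.choice(8, size=100, p=probs)
```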

41 Temporal Skew

42 Experimental Setup
Each experiment runs for 1 hour, divided into 15 time intervals; the optimizer runs every four minutes.
A combination of simulation and actual runs: exact numbers for data movement, resources used, and load balance are obtained through simulation.
The cluster has 4 nodes, plus 2 separate client machines.

43 Data Movement (TPC-C): Triangle Wave (f = 1)

44 Data Movement (TPC-C): Triangle Wave (f = 1), Zipfian Skew

45 Data Movement (TPC-C): Triangle Wave (f = 4)

46 Computing Resources Saved (TPC-C): Triangle Wave (f = 1)

47 Load Balance (TPC-C): Triangle Wave (f = 1)

48 Database Throughput (TPC-C): Sine Wave (f = 2)

49 Database Throughput (TPC-C): Sine Wave (f = 2), Normal Skew

50 Database Throughput (TATP): Sine Wave (f = 2)

51 Database Throughput (YCSB): Sine Wave (f = 2)

52 Database Throughput (TPC-C): Triangle Wave (f = 4)

53 Optimizer Scalability

54 SUPPORTING MULTI-PARTITION TRANSACTIONS

55 Factors Affecting Performance
Maximum MPT throughput (η): the maximum number of transactions an execution site can coordinate per second.
Probability of MPTs (p_mpt): the percentage of transactions that are MPTs.
Partitions involved in MPTs: the number of partitions involved in MPTs.

56 Changes to Model
The CPU load generated by each partition is equal to the sum of:
1. Load due to transaction work (same as for SPTs)
2. Load due to coordinating MPTs
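A plausible formalization of this extended model (our notation; the thesis' exact formula may differ): with $r_p$ the rate of transaction work on partition $p$ and $c_p$ the rate of MPTs its execution site coordinates,

$$\text{load}(p) = \frac{r_p}{\alpha} + \frac{c_p}{\eta},$$

where $\alpha$ and $\eta$ are the maximum transaction and MPT-coordination throughputs from slides 28 and 55.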

57 Maximum MPT Throughput

58 Probability of MPTs

59 Effect on Resources Saved

60 Effect on Data Movement

61 CONCLUSION

62 Related Work
Data replication and partitioning
Database consolidation
Live database migration
Key-value stores
Data placement

63 Elasca = Elastic Scale-Out Mechanism + Partition Placement & Migration Optimizer

64 Conclusion
Elasca = Mechanism + Optimizer.
The workload-aware optimizer meets performance demands, minimizes computing resources used, minimizes data movement, and effectively balances load.
It is scalable to large problem sizes in an online setting.

65 Future Work
Migrating to VoltDB 3.0: intelligent client routing, master/slave partitions.
Supporting multi-partition transactions.
Automated parameter tuning.
Transaction mixes.
Workload prediction.

66 Thank You. Questions?

