
1 Experiment-driven System Management Shivnath Babu Duke University Joint work with Songyun Duan, Herodotos Herodotou, and Vamsidhar Thummala

2 Managing DBs in Small-to-Medium Business Enterprises (SMBs) Peter is a system admin in an SMB – He manages the database (DB); the SMB cannot afford a DBA. Suppose Peter has to tune a poorly-performing DB – A design advisor may not help – Maybe the problem is with the DB configuration parameters.

3 Tuning DB Configuration Parameters Parameters that control – Memory distribution – I/O optimization – Parallelism – The optimizer's cost model. Number of parameters ~ 100; which ones are critical depends on OLAP vs. OLTP. Few holistic parameter-tuning tools are available – Peter may have to resort to tuning manuals or rules of thumb from experts – Can be a frustrating experience.

4 Response Surfaces [Figure: 2-dim projection of an 11-dim response surface; TPC-H Query 18, 4 GB DB size, 1 GB memory]

5 DBA's Approach to Parameter Tuning DBAs run experiments – Here, an experiment is a run of the DB workload with a specific parameter configuration – Common strategy: vary one DB parameter at a time

6 Experiment-driven Management [Workflow: Mgmt. task → Conduct experiments on workbench → Process output to extract information → Are more experiments needed? If yes, plan the next set of experiments and repeat; if no, report the result.] Goal: Automate this process.
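To make the loop concrete, here is a minimal Python sketch of that workflow; all five callables are hypothetical placeholders, since the slides do not prescribe an implementation and the logic is specific to each management task.

```python
# Minimal sketch of the experiment-driven management loop (slide 6).
# Every callable is supplied by the caller: the planning, conducting, and
# stopping logic are specific to the management task at hand.
def experiment_driven_management(plan_initial, conduct, extract_info,
                                 more_needed, plan_next):
    """Return the information gathered once no more experiments are needed."""
    knowledge = []
    experiments = plan_initial()                  # bootstrap set of experiments
    while True:
        outputs = [conduct(e) for e in experiments]         # run on the workbench
        knowledge.extend(extract_info(o) for o in outputs)  # process the output
        if not more_needed(knowledge):                      # stopping condition
            return knowledge                                # -> the result
        experiments = plan_next(knowledge)                  # plan the next set
```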

7 Roadmap Use cases of experiment-driven mgmt. – Query tuning, benchmarking, Hadoop, testing, … iTuned: Tool for DB conf parameter tuning – End-to-end application of experiment-driven mgmt. .eX: Language and run-time system that brings experiment-driven mgmt. to users & tuning tools

8 What is an Experiment? Depends on the management task – Pay some extra cost, get new information in return – Even for a specific management task, there can be a spectrum of possible experiments

9 Uses of Experiment-driven Mgmt. DB conf parameter tuning

10 Uses of Experiment-driven Mgmt. DB conf parameter tuning MapReduce job tuning in Hadoop

11 Uses of Experiment-driven Mgmt. DB conf parameter tuning MapReduce job tuning in Hadoop Server benchmarking – Capacity planning – Cost/perf modeling

12 Uses of Experiment-driven Mgmt. Tuning problem queries [Figure: cardinality estimates]

13 Uses of Experiment-driven Mgmt. Tuning problem queries

14 Uses of Experiment-driven Mgmt. Tuning problem queries Troubleshooting Testing Canary in the server farm (James Hamilton, Amazon) … DB conf parameter tuning MapReduce job tuning in Hadoop Server benchmarking – Capacity planning – Cost/perf modeling

15 Roadmap Use cases of experiment-driven mgmt. – Query tuning, benchmarking, Hadoop, testing, … iTuned: Tool for DB conf parameter tuning – End-to-end application of experiment-driven mgmt. .eX: Language and run-time system that brings experiment-driven mgmt. to users & tuning tools

16 Problem Abstraction Unknown response surface: y = F(X) – X = parameter vector ⟨x1, x2, …, xm⟩. Each experiment gives a sample – Set the DB to configuration Xi – Run the workload that needs tuning – Measure performance yi at Xi. Goal: Find a high-performance setting with low total cost of running experiments.
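As an illustration, one experiment under this abstraction might look like the sketch below; set_db_config and run_workload are hypothetical stand-ins for DBMS-specific operations.

```python
import time

def run_experiment(Xi, set_db_config, run_workload):
    """One experiment: set the DB to configuration Xi, run the workload that
    needs tuning, and measure its performance yi (here, running time)."""
    set_db_config(Xi)                  # hypothetical DBMS-specific call
    start = time.perf_counter()
    run_workload()                     # hypothetical driver for the workload
    yi = time.perf_counter() - start
    return Xi, yi                      # one sample of the surface y = F(X)
```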

17 Example Where should the next experiment be run? Goal: Compute the potential utility Utility(X) of each candidate experiment. [Figure: Utility(X) over the candidate settings]

18 iTuned's Adaptive Sampling Algorithm for Experiment Planning // Phase I: Bootstrapping – Conduct some initial experiments. // Phase II: Sequential Sampling – Loop until the stopping condition is reached: 1. Identify candidate experiments to do next. 2. Based on the current samples, estimate the utility of each candidate experiment. 3. Conduct the next experiment at the candidate with the highest utility.
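A minimal Python sketch of the two phases, assuming a finite candidate set and a fixed experiment budget as the stopping condition (the slides leave both open); run_experiment and expected_utility are supplied by the caller, with expected_utility developed on the next slides.

```python
import random

def adaptive_sampling(candidates, run_experiment, expected_utility,
                      n_bootstrap=10, budget=50, seed=0):
    """Phase I: bootstrap with a few initial experiments.
    Phase II: repeatedly experiment where the expected utility is highest."""
    rng = random.Random(seed)
    samples = [(X, run_experiment(X))
               for X in rng.sample(candidates, n_bootstrap)]      # Phase I
    while len(samples) < budget:                                  # Phase II
        # Estimate each candidate's utility from the current samples, then
        # conduct the next experiment at the candidate with the highest one.
        X_next = max(candidates, key=lambda X: expected_utility(X, samples))
        samples.append((X_next, run_experiment(X_next)))
    return min(samples, key=lambda s: s[1])   # best setting: y* = min yi
```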

19 Utility of an Experiment Let ⟨Xi, yi⟩, i = 1,…,n, be the samples from the n experiments done so far. Let X* be the best setting so far, i.e., y* = min_i yi – wlog assuming minimization. U(X), the utility of an experiment at X (where y = F(X)), is: – U(X) = y* − y if y* > y; 0 otherwise. However, U(X) poses a chicken-and-egg problem – y will be known only after the experiment is run at X. Goal: Compute the expected utility EU(X).

20 Expected Utility of an Experiment Suppose we have the probability density function of y, the performance at X: Prob(y = v | ⟨Xi, yi⟩ for i = 1,…,n). Then: EU(X) = ∫_{v=−∞}^{+∞} U(X) Prob(y = v) dv = ∫_{v=−∞}^{y*} (y* − v) Prob(y = v) dv. Goal: Compute Prob(y = v | ⟨Xi, yi⟩ for i = 1,…,n).

21 Model: Gaussian Process Representation (GRS) of a Response Surface GRS models the response surface as: y(X) = g(X) + Z(X) (+ ε(X) for measurement error) – E.g., a simple regression form such as g(X) = x1 − 2x2 + x1², learned using common techniques – Z: a Gaussian process that captures the regression residual.

22 Primer on Gaussian Processes Univariate Gaussian distribution – G = N(μ, σ²) – Described by a mean and a variance. Multivariate Gaussian distribution – [G1, G2, …, Gn] – Described by a mean vector and a covariance matrix. Gaussian Process – Generalizes the multivariate Gaussian to an arbitrary number of dimensions – Described by mean and covariance functions.
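A small NumPy illustration of the progression from univariate to multivariate Gaussians; the covariance matrix below is an assumed example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Univariate Gaussian: described by a mean and a variance.
g = rng.normal(loc=0.0, scale=1.0)

# Multivariate Gaussian: described by a mean vector and a covariance matrix
# (assumed example; note the stronger correlation between neighboring
# components).
mean = np.zeros(3)
cov = np.array([[1.0, 0.8, 0.3],
                [0.8, 1.0, 0.8],
                [0.3, 0.8, 1.0]])
G = rng.multivariate_normal(mean, cov)

# A Gaussian process generalizes this: any finite set of points X1..Xn yields
# a multivariate Gaussian whose mean vector and covariance matrix come from a
# mean *function* m(X) and a covariance *function* k(X, X').
print(g, G)
```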

23 Model: Gaussian Process Representation (GRS) of a Response Surface GRS captures the response surface as: y(X) = g(X) + Z(X) (+ ε(X) for measurement error). If Z is a Gaussian process, then: – [Z(X1),…,Z(Xn),Z(X)] is multivariate Gaussian – Z(X) | Z(X1),…,Z(Xn) is a univariate Gaussian – So y(X) is a univariate Gaussian.

24 Parameters of the GRS Model [Z(X1),…,Z(Xn)] is multivariate Gaussian – Z(Xi) has zero mean – Covariance(Z(Xi), Z(Xj)) ∝ exp(−Σk θk |xik − xjk|^γk) – Residuals at nearby points have higher correlation. The θk and γk are learned from the samples ⟨Xi, yi⟩.
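A sketch of this covariance in Python; in iTuned the θk and γk are learned from the samples, so the values below are assumptions for illustration.

```python
import numpy as np

def grs_covariance(Xi, Xj, theta, gamma, sigma2=1.0):
    """Cov(Z(Xi), Z(Xj)) proportional to exp(-sum_k theta_k*|x_ik - x_jk|^gamma_k):
    residuals at nearby settings are more strongly correlated."""
    Xi, Xj = np.asarray(Xi, float), np.asarray(Xj, float)
    return sigma2 * np.exp(-np.sum(theta * np.abs(Xi - Xj) ** gamma))

# Two parameters; theta_k and gamma_k would be learned from the samples
# <Xi, yi> -- the values here are assumed for illustration.
theta = np.array([0.5, 1.0])
gamma = np.array([2.0, 1.5])
print(grs_covariance([0.2, 0.4], [0.3, 0.4], theta, gamma))  # nearby: high cov
print(grs_covariance([0.2, 0.4], [0.9, 0.1], theta, gamma))  # distant: low cov
```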

25 Use of the GRS Model Recall our goals: to compute – EU(X) = ∫_{v=−∞}^{y*} (y* − v) Prob(y = v) dv – Prob(y = v | ⟨Xi, yi⟩ for i = 1,…,n). Lemma: Using the GRS, we can compute the mean μ(X) and variance σ²(X) of the Gaussian y(X). Theorem: EU(X) has a closed form that is a product of: – a term that depends on (y* − μ(X)) – a term that depends on σ(X). It follows that settings X with high EU are either: – close to known good settings (for exploitation), or – in highly uncertain regions (for exploration).
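The theorem's closed form matches the standard expected-improvement expression for a Gaussian prediction y(X) ~ N(μ(X), σ²(X)); here is a sketch, assuming minimization.

```python
from math import sqrt, pi, exp, erf

def expected_utility(y_star, mu, sigma):
    """EU(X) = integral_{-inf}^{y*} (y* - v) Prob(y = v) dv for y ~ N(mu, sigma^2).
    Closed form: sigma * (u * Phi(u) + phi(u)) with u = (y* - mu) / sigma, the
    product of an exploitation term (via y* - mu) and an exploration term (sigma)."""
    if sigma <= 0.0:
        return max(y_star - mu, 0.0)              # no uncertainty: plain utility
    u = (y_star - mu) / sigma
    Phi = 0.5 * (1.0 + erf(u / sqrt(2.0)))        # standard normal CDF
    phi = exp(-0.5 * u * u) / sqrt(2.0 * pi)      # standard normal PDF
    return sigma * (u * Phi + phi)

# Example: current best y* = 10s. A candidate predicted at mu = 11s but with
# high uncertainty (sigma = 3) still has positive expected utility (exploration).
print(expected_utility(10.0, 11.0, 3.0))
```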

26 Example Settings X with high EU are either: – close to known good settings (high y* − μ(X)) – in highly uncertain regions (high σ(X)). [Figure: EU(X) plotted against the unknown actual surface, showing y*, the predicted mean μ(X), and its uncertainty band]

27 Where to Conduct Experiments? [Figure: clients reach the production platform (DBMS + data) through a middle tier; write-ahead log (WAL) shipping feeds a standby platform (DBMS + data); a separate test platform (DBMS + test data) is also available]

28 iTuned's Solution Exploit underutilized resources with minimal impact on the production workload. The DBA/user designates resources where experiments can be run – e.g., production/standby/test. The DBA/user specifies policies that dictate when experiments can be run – Separate regular use ("home") from experiments ("garage") – Example: if CPU, memory, & disk utilization have been < 10% for the past 15 minutes, then the resource can be used for experiments.
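A sketch of how such a policy check might be enforced; the 10% threshold and 15-minute window come from the example above, while the sampling mechanism is left to the caller and the class name is hypothetical.

```python
from collections import deque
import time

class GaragePolicy:
    """The slide's example policy: a resource may host experiments only if CPU,
    memory, and disk utilization all stayed below 10% for the past 15 minutes.
    Utilization sampling itself is left to the caller (e.g., psutil or sar)."""

    def __init__(self, threshold=10.0, window_secs=15 * 60):
        self.threshold = threshold
        self.window_secs = window_secs
        self.history = deque()            # (timestamp, cpu%, mem%, disk%)

    def record(self, cpu, mem, disk, now=None):
        now = time.time() if now is None else now
        self.history.append((now, cpu, mem, disk))
        while self.history and now - self.history[0][0] > self.window_secs:
            self.history.popleft()        # drop samples older than the window

    def can_run_experiments(self):
        if not self.history:
            return False                  # no evidence the machine is idle
        return all(max(cpu, mem, disk) < self.threshold
                   for _, cpu, mem, disk in self.history)
```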

29 One Implementation of Home/Garage [Figure: clients reach the production platform through a middle tier; WAL shipping feeds a standby machine split into a "home" (a DBMS continuously applying the WAL) and a "garage" (a DBMS over copy-on-write snapshots that serves as the workbench for experiments), coordinated by iTuned's experiment planner & scheduler via an interface engine]

30 Overheads are Low

Operation in API             | Time (s) | Description
-----------------------------+----------+----------------------------------------------
Create Container             | 610      | Create a new garage (one-time process)
Clone Container              | 17       | Clone a garage from an already existing one
Boot Container               | 19       | Boot a garage from the halt state
Halt Container               | 2        | Stop a garage and release its resources
Reboot Container             | 2        | Reboot the garage
Snapshot-R DB (5 GB, 20 GB)  | 7, 11    | Create a read-only snapshot of the database
Snapshot-RW DB (5 GB, 20 GB) | 29, 62   | Create a read-write snapshot of the database

31 Empirical Evaluation (1) Cluster of machines with 2 GHz processors and 3 GB memory. Two database systems: PostgreSQL & MySQL. Various workloads – OLAP: mixes of heavy-weight TPC-H queries, varying #queries, #query_types, and MPL (multiprogramming level); scale factors 1 and 10 – OLTP: TPC-W and RUBiS. Tuning of up to 30 configuration parameters.

32 Empirical Evaluation (2) Techniques compared – Default parameter settings as shipped (D) – Manual rule-based tuning (M) – Smart Hill Climbing (S): a state-of-the-art technique – Brute-force search (B): run many experiments to find an approximation to the optimal setting – iTuned (I). Evaluation metrics – Quality: workload running time after tuning – Efficiency: time needed for tuning.

33 Comparison of Tuning Quality

34 iTuned's Scalability Features (1) Identify important parameters quickly. Run experiments in parallel. Stop low-utility experiments early. Compress the workload. Work in progress: – Apply database-specific knowledge – Incremental tuning – Interactive tuning.

35 iTuned's Scalability Features (2) Identify important parameters quickly – using sensitivity analysis with a few experiments. [Figure: sensitivity analysis with #parameters = 9, #experiments = 10]

36 iTuned's Scalability Features (3)

37 Roadmap Use cases of experiment-driven mgmt. – Query tuning, benchmarking, Hadoop, testing, … iTuned: Tool for DB conf parameter tuning – End-to-end application of experiment-driven mgmt. .eX: Language and run-time system that brings experiment-driven mgmt. to users & tuning tools

38 Back-of-the-Envelope Calculation DBAs cost $300/day; consultants cost $100/hr. One day of experiments gives a wealth of information – TPC-H, TPC-W, RUBiS workloads; configuration parameters. Cost of running these experiments for one day on Amazon Web Services – Server: $10/day – Storage: $0.40/day – I/O: $5/day – TOTAL: ~$15/day.
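As a quick check, the line items sum to the quoted (rounded) total:

$\$10 + \$0.40 + \$5 = \$15.40 \approx \$15 \text{ per day} \approx 5\% \text{ of one } \$300 \text{ DBA-day}$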

39 .eX: Power of Experiments to the People Users & tools express their needs as scripts in eXL (eXperiment Language). The .eX engine plans and conducts experiments on designated resources. Intuitive visualization of results. [Figure: an eXL script enters the .eX language processor; the run-time engine executes the resulting experiments on the designated resources]

40 Current Focus of .eX Parts of an eXL script: 1. Query: (approximate) response-surface mapping, search. 2. Experiment setup & monitoring. 3. Constraints & optimization: resources, cost, time. Goal: automatically generate the experiment-driven workflow. [Figure: the conduct/process/plan experiment loop from slide 6, generated from the script]

41 Summary Automated experiment-driven management: the time has come – the need, infrastructure, & promise are all there. We have built many tools around this paradigm – It poses interesting questions and challenges – Make it easy for users/admins to run experiments – Make experiments first-class citizens in systems.

