
1 DBMS Tuning
DBMSs are complex systems with many tunable options that control nearly all aspects of their runtime operation. In the last class, we saw how, for a given optimizer, we could intercept its requests and derive the optimal physical design tuning, or at least a superset of it. Here we look at configuration tuning, where there is an enormous number of tunable parameters (knobs) to consider. These knobs control runtime performance.
Presenter: Anupam Sanghi

2 PostgreSQL Configuration “Knobs”
- File Locations (data dir, auth file, …)
- Connections and Authentication
- Resource Usage
- Write Ahead Log
- Query Tuning
- Runtime Statistics
- Replication, Lock Management, Error Reporting and Logging, Customized Options, …
Some knobs are useless for performance tuning.
Knob values are Boolean, integer, floating point, string, or enum. Boolean values can be written as on, off, true, false, yes, no, 1, 0.

3 PostgreSQL Configuration “Knobs”
- File Locations
- Connections and Authentication (max_connections, auth timeout, …)
- Resource Usage
- Write Ahead Log
- Query Tuning
- Runtime Statistics
- Replication, Lock Management, Error Reporting and Logging, Customized Options, …
Increasing max_connections costs ~400 bytes of shared memory per connection slot, plus lock space.

4 PostgreSQL Configuration “Knobs”
- File Locations
- Connections and Authentication
- Resource Usage
  - Memory (shared_buffers, temp_buffers, work_mem, autovacuum_work_mem, …)
  - Disk (temp_file_limit)
  - Kernel Resource Usage (max_files_per_process, …)
  - Background Writer (bgwriter_delay, bgwriter_lru_maxpages, …)
  - Asynchronous Behavior (effective_io_concurrency, max_worker_processes)
- Write Ahead Log
- Query Tuning
- Runtime Statistics
- Replication, Lock Management, Error Reporting and Logging, Customized Options, …
temp_file_limit = -1 means no limit; otherwise the limit is specified in kilobytes.
The asynchronous-behavior knobs are used to alter prefetching.

5 PostgreSQL Configuration “Knobs”
- File Locations
- Connections and Authentication
- Resource Usage
- Write Ahead Log
  - Settings (buffers, level, commit delay, …)
  - Checkpoints (segments, timeout, warning, …)
- Query Tuning
- Runtime Statistics
- Replication, Lock Management, Error Reporting and Logging, Customized Options, …

6 PostgreSQL Configuration “Knobs”
- File Locations
- Connections and Authentication
- Resource Usage
- Write Ahead Log
- Query Tuning
  - Planner Method Configuration
  - Planner Cost Constants (also includes effective_cache_size)
  - Genetic Query Optimizer, …
- Runtime Statistics
- Replication, Lock Management, Error Reporting and Logging, Customized Options, …

7 PostgreSQL Configuration “Knobs”
- File Locations
- Connections and Authentication
- Resource Usage
- Write Ahead Log
- Query Tuning
- Runtime Statistics
- Replication, Lock Management, Error Reporting and Logging, Customized Options, …
CODD added a new parameter (skip_pagetest) under this set to skip the physical page test.
Excerpt from postgresql.conf:
# - Query/Index Statistics Collector -
#track_activities = on
#track_counts = on
#track_io_timing = off
#track_functions = none
#track_activity_query_size = 1024
#update_process_title = on
#stats_temp_directory = 'pg_stat_tmp'
# - Statistics Monitoring -
#log_parser_stats = off
#log_planner_stats = off
#log_executor_stats = off
#log_statement_stats = off

8 Finding optimal configuration is NP-Hard!
Tuning the knobs for the application’s workload and hardware is critical for performance! Tuning even one DBMS deployment is hard. Finding the optimal configuration is NP-hard!

9 What is currently done in practice?
- Hire expensive experts to configure the knobs for the expected workload.
- Personnel costs are estimated to be ~50% of the total cost of ownership of a large-scale DBMS!
- Many DBAs spend nearly 25% of their time on tuning!
- With the growing complexity of databases and applications, optimizing a DBMS has surpassed the abilities of humans.
- 40% of engagement requests are for tuning and knob configuration issues.

10 Automatic Tuning Through Machine Learning
SIGMOD 2017
Applies to databases with reasonable logical and physical design.
Based on presentation material sourced from the following: blogs: (b) presentation:

11 Authors
- Dana Van Aken (CMU), advised by Andrew Pavlo (CMU), co-advised by Geoffrey J. Gordon
- Bohan Zhang: RA under Peking University, Beijing
Distributed Graph Processing – Best Paper, SIGMOD 2017
Keynote at SIGMOD 2017

12 #1: Dependencies
[Figure: 99th %-tile latency (sec), lower is better. MySQL (v5.6), YCSB* Workload A, VM: 2 GB RAM, 2 vCPUs]
DBMS tuning guides strongly suggest that a DBA change only one knob at a time. This is slow and not very helpful, because knobs depend on one another. Finding the optimal knob configuration is NP-hard.
Example: if the buffer pool is large and the log file size is small, then the DBMS maintains a smaller number of dirty pages and thus has to perform more flushes to disk.
* Yahoo! Cloud Serving Benchmark (YCSB): consists of 6 different workloads. Workload A (update heavy) has a 50/50 mix of reads and writes.

13 #2: Continuous Settings
Performance degrades because the DBMS runs out of memory.
[Figure: 99th %-tile latency (sec), lower is better. MySQL (v5.6), YCSB Workload A, VM: 2 GB RAM, 2 vCPUs]

14 #3: Non-Reusability
[Figure: 99th %-tile latency (sec), lower is better. MySQL (v5.6), three different YCSB workloads]
The optimal configuration is different for every workload. (Otherwise, the default settings would by now have been close to optimal.)

15 #4: Tuning Complexity
[Figure: number of configuration knobs in MySQL and Postgres releases over 16 years]

16 OtterTune
Reuse historical performance data from tuning “past” DBMS deployments to tune “new” DBMS deployments.
- Maintain a repository of data from previous tuning sessions
- Find past workloads that are similar to the new workload
- Identify the most impactful knobs
- Recommend knob settings
- Collect performance results (e.g. latency, throughput)
Notes:
1. Maintain a repository of data from previous tuning sessions.
2. Build models of how the DBMS responds to different knob configurations.
3. For a new application, use these models to guide experimentation and recommend optimal settings.
4. Each recommendation provides more information in a feedback loop that allows OtterTune to refine its models and improve their accuracy.

17 System Overview
Step 1: At the start of a tuning session, the user specifies the target objective, i.e. which metric to optimize.

18 System Overview
Step 2: The controller connects to the target DBMS and collects its hardware profile and current knob configuration. It then starts the observation period.

19 Observation Period
Aim: Collect the current knob configuration and runtime statistics for both DBMS-independent external metrics and DBMS-specific internal metrics.
Main steps performed by the controller:
1. Reset the statistics of the target DBMS.
2. Execute one of the following, as specified by the DBA:
   - a set of queries for a fixed time (fixed observation period, suitable for OLTP).
   - a specified workload trace (variable observation period, suitable for OLAP).
3. Observe the DBMS and measure the specified metrics.
4. Collect additional DBMS-specific internal metrics, e.g. counters of pages read from or written to disk.
5. Store metrics with the same name as a single scalar value (their sum).
Discussion: Why is the reset done? Why the OLTP/OLAP distinction? Why a scalar value?
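Step 5 above can be illustrated with a minimal sketch. The metric names and readings are hypothetical (for instance, the same counter reported once per table), and `aggregate_metrics` is an illustrative helper, not part of OtterTune's code.

```python
from collections import defaultdict

def aggregate_metrics(samples):
    """Sum all readings that share a metric name into one scalar per name."""
    totals = defaultdict(float)
    for name, value in samples:
        totals[name] += value
    return dict(totals)

# Hypothetical per-table counters collected at the end of an observation period.
samples = [
    ("blocks_read", 120.0),     # table A
    ("blocks_read", 30.0),      # table B
    ("blocks_written", 45.0),   # table A
]
summary = aggregate_metrics(samples)
print(summary)
```

Collapsing per-object counters this way gives every observation the same fixed set of metric names, which is what the later analysis steps operate on.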

20 System Overview
Step 3: The tuning manager receives the result from the controller and stores it in a repository. The repository organizes its data per major DBMS version. This prevents tuning knobs from older versions that may be deprecated in newer versions, or knobs that exist only in newer versions.

21 System Overview
Step 4: The tuning manager computes the next configuration to recommend, using background processes that continuously analyze new data and refine the internal ML models. The ML models allow it to:
- understand the target workload and map it to a workload for the same DBMS and hardware profile that it has already seen (and tuned).
- recommend a knob configuration designed to improve the objective for the current workload, DBMS and hardware.

22 System Overview
Step 5: The controller installs the next configuration and collects measurements. Installation often requires:
- admin privileges. A second copy of the DBMS is used if the controller does not have them (iTuned also maintains a second copy).
- a DBMS restart, as required by some knobs. If restarts are prohibited, a blacklist of such knobs is prepared. The DBA can add further knobs to the blacklist that should not be altered.
- extra (time-consuming) processing, such as resizing log files.
The cost of a restart is not considered.

23 System Overview: Termination
With every recommendation, OtterTune provides an estimate of how much better the recommended configuration is than the best configuration it has seen so far. The DBA decides whether to continue or terminate. Tuning continues until the user is satisfied with the improvement over the initial configuration.

24 Tuning System
- Discover a model that represents the distinguishing aspects of the target workload.
- Identify which knobs have the strongest impact on the target objective function.
- Based on the data collected, recommend the next configuration to try.

25 Internal DBMS Metrics
Directly affected by the knobs’ settings.
Problem: Redundancy
- The same measurement recorded in different units
- Highly correlated metrics
Solution: Prune them!
Example: an indicator that the buffer pool is too small is the ratio (#buffer pool misses) / (total #buffer pool requests).
Example InnoDB metrics:
1. The total number of bytes in the InnoDB buffer pool containing data. The number includes both dirty and clean pages.
2. The number of logical reads that InnoDB could not satisfy from the buffer pool, and had to read directly from disk.
3. The number of logical read requests.
4. Normally, writes to the InnoDB buffer pool happen in the background. When InnoDB needs to read or create a page and no clean pages are available, InnoDB flushes some dirty pages first and waits for that operation to finish. This counter counts instances of these waits.
5. The total number of data writes.
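As a small illustration of both points on this slide, the sketch below computes the buffer pool miss ratio and shows why a metric reported in two units is redundant. All counter values are made up, and the 16 KiB page size is an assumption used only to convert pages to bytes.

```python
import math

def miss_ratio(misses, read_requests):
    """Fraction of logical read requests that had to go to disk."""
    return misses / read_requests if read_requests else 0.0

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical counter values, not measurements from the slides.
ratio = miss_ratio(misses=2_500, read_requests=100_000)

# The same quantity reported in pages and in bytes (assuming 16 KiB pages)
# carries no extra information: the two series are perfectly correlated.
pages_data = [100, 250, 400, 800]
bytes_data = [p * 16 * 1024 for p in pages_data]
corr = pearson(pages_data, bytes_data)
print(ratio, corr)
```

A correlation this close to 1 is the kind of redundancy the pruning step on the following slides is meant to remove.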

26 Prune Redundant Metrics
Construct the smallest set of metrics that captures the variability in performance and the distinguishing characteristics of different workloads.
Factor Analysis
- Pre-processing step.
- Dimensionality reduction.
- Reduces the noise in the data.
K-means Clustering
- Find groups of metrics that are similar to each other.
- Select one metric from each group.
Factor analysis helps to improve the robustness and quality of the cluster analysis.

27 Factor Analysis
Given: a set of real-valued variables that contain arbitrary correlations.
FA aims to find a smaller set of latent factors that explain (underlie) the observed variables. These factors capture the correlation patterns of the original variables.

28 Factor Analysis (Contd.)
[Figure: the FA input is a matrix with metrics (M1, M2, ..) as rows and configurations (C1, ..) as columns; the FA output is a matrix with metrics (M1, M2, ..) as rows and factors (F1, ..) as columns]
- Factors are ordered by the amount of variability in the original data that they explain.
- Only the initial factors are significant for DBMS metric data, which means that most of the variability is captured by the first few factors.
- From the output, closely correlated metrics can be identified and pruned. Two metrics are close to each other if they have similar rows in the output matrix.

29 K-means Clustering
[Figure: pipeline from metrics to factors (scatter plot), then k-means (choose means) to clusters of metrics, and finally to the non-redundant metrics]
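The pruning pipeline can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions: the factor coefficients are invented, the initial centers are hand-picked, and keeping the metric closest to each cluster center is one reasonable way to "select one metric from each group".

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign_points(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            for p in points]

def kmeans(points, initial_centers, iterations=20):
    """Plain Lloyd's algorithm; returns (centers, assignment)."""
    centers = [list(c) for c in initial_centers]
    for _ in range(iterations):
        assign = assign_points(points, centers)
        for c in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assign_points(points, centers)

def prune_metrics(metric_factors, initial_centers):
    """Keep, per cluster, the metric whose factor row is closest to the center."""
    names = list(metric_factors)
    points = [metric_factors[n] for n in names]
    centers, assign = kmeans(points, initial_centers)
    kept = []
    for c in range(len(centers)):
        members = [i for i, a in enumerate(assign) if a == c]
        if members:
            kept.append(names[min(members, key=lambda i: dist(points[i], centers[c]))])
    return kept

# Hypothetical rows of the FA output matrix: metric -> coefficients on 2 factors.
metric_factors = {
    "pages_read":    [0.90, 0.10],
    "bytes_read":    [0.92, 0.12],
    "read_requests": [0.85, 0.05],
    "pages_written": [0.10, 0.95],
    "bytes_written": [0.12, 0.90],
    "dirty_pages":   [0.15, 0.88],
}
kept = prune_metrics(metric_factors, initial_centers=[[1.0, 0.0], [0.0, 1.0]])
print(kept)
```

In this toy data the read-related metrics collapse into one representative and the write-related metrics into another, mirroring the idea that each cluster corresponds to one aspect of performance.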

30 2-D Projection of the Scatter Plot
[Figure: 2-D projection of the scatter plot of metrics, two panels]
- 131 metrics reduced to 9 clusters!
- 57 metrics reduced to 8 clusters!
Each cluster corresponds to a distinct aspect of performance.

31 Configuration Knobs
Knobs have varying degrees of impact on performance:
- Some have a high impact.
- Some have no impact.
- For many, it depends on the workload.
Problem: Which knobs matter?
Solution: Feature Selection

32 Least Absolute Shrinkage and Selection Operator (LASSO) Regression
A variant of linear regression that adds an L1 penalty to the loss function.
Notation:
- Y = vector of metrics
- X = vector of knobs
- θ = weights for the different knobs
- λ = regularization parameter (penalty)
Presenter note: describe regression and LASSO.
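For reference, the standard LASSO objective written with the symbols defined on this slide is given below. The number of observations $n$ and the $1/(2n)$ scaling are assumptions (a common convention), not something stated in the transcript.

```latex
\hat{\theta} \;=\; \arg\min_{\theta}\;
  \frac{1}{2n}\,\lVert Y - X\theta \rVert_2^2 \;+\; \lambda\,\lVert \theta \rVert_1
```

The first term is the ordinary least-squares loss; the L1 term pushes the weights of unimportant knobs to exactly zero as $\lambda$ grows.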

33 Feature Selection with LASSO
Aim: find the relationship between knobs (or polynomial functions of knobs) and metrics.
Preprocessing: transform categorical variables into dummy variables that take on the values 0 or 1.
Feature Selection:
- Start with a high penalty, which removes all knobs (all weights shrink to zero).
- Decrease the penalty in small increments, recompute the regression, and track which features are added.
- Order the knobs by their order of appearance.
How many knobs to choose?
- Incremental approach: dynamically increase the number of knobs used in a tuning session over time.
- Too many knobs increase the optimization time.
- Too few knobs prevent finding the best configuration.
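The feature-selection loop above can be sketched with a small coordinate-descent LASSO solver. This is a toy illustration, not OtterTune's implementation: the knob names, the data, and the coarse penalty grid are all made up, and the features are assumed to be already centered and on the same scale.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso(X, y, lam, sweeps=50):
    """Minimize (1/2n)*||y - X*theta||^2 + lam*||theta||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    theta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0
            z = 0.0
            for i in range(n):
                # Residual with feature j's own contribution left out.
                partial = y[i] - sum(X[i][k] * theta[k] for k in range(p) if k != j)
                rho += X[i][j] * partial
                z += X[i][j] ** 2
            theta[j] = soft_threshold(rho / n, lam) / (z / n)
    return theta

def knob_ranking(X, y, names, lambdas):
    """Lower the penalty step by step and record when each knob first gets a nonzero weight."""
    order = []
    for lam in sorted(lambdas, reverse=True):
        for j, w in enumerate(lasso(X, y, lam)):
            if w != 0.0 and names[j] not in order:
                order.append(names[j])
    return order

# Hypothetical, already-scaled observations: one row per tried configuration.
names = ["knob_a", "knob_b", "knob_c"]
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [4, 2, -2, -4]          # equals 3*knob_a + 1*knob_b; knob_c has no effect
order = knob_ranking(X, y, names, lambdas=[4.0, 2.0, 0.5])
print(order)
```

With the highest penalty no knob is selected; `knob_a` enters first, `knob_b` later, and `knob_c` never appears, which is the ordering the slide uses to rank knobs by impact.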

34 Identifying Important Knobs
[Figure: Lasso paths for 99th %-tile latency, showing the eight most impactful features]
Second-degree polynomial features come in 2 types:
1. Product of two knobs: useful for detecting pairs of knobs that are non-independent.
2. Product of a single knob with itself: reveals a quadratic relationship between a knob and the target.
When we say that a relationship is “quadratic”, we do not mean that it is an exact quadratic, but rather that it exhibits some nonlinearity. If the linear and quadratic terms for a knob both appear in the regression around the same time, then its relationship with the target metric is likely quadratic. But if the linear term for a knob appears in the regression much earlier than the quadratic term, then the relationship is nearly linear.

35 Automatic Tuning
At this point we have:
- a set of non-redundant metrics
- a set of the most impactful knobs
- data from previous tuning sessions stored in the repository
Two steps:
1. Workload Mapping: find the workload in the data repository that is most similar to the target workload.
2. Configuration Recommendation: use Gaussian Process (GP) regression to find a knob configuration that would improve the target metric.

36 Workload Mapping
Aim: Match the target DBMS’s workload to the most similar workload in the repository.
Preprocessing:
- Compute the deciles for each metric, then bin the values according to the decile they fall into.
- The matrices are constructed by background processes.
Mapping:
- For each metric, compute the Euclidean distance between the target workload and every other workload.
- Compute a score for each workload by averaging its distances over all metrics.
- Select the workload with the lowest score.
Dynamic mapping is used: with each iteration, the quality of the match increases with the amount of data gathered.
Discussion:
- What happens if some workload has not been run at all with a particular configuration?
- Metrics and knobs could have been weighted when computing the average.
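The mapping steps above can be sketched as follows. The workload names and metric values are invented, and the sketch assumes every workload has been observed under the same list of configurations, which sidesteps the missing-data question raised in the discussion.

```python
import bisect
import math
import statistics

def decile_bins(values, pool):
    """Replace each value by the index (0-9) of the decile of `pool` it falls into."""
    cuts = statistics.quantiles(pool, n=10)   # 9 cut points
    return [bisect.bisect_left(cuts, v) for v in values]

def map_workload(target, repository):
    """Return (best_match, scores); a lower score means a more similar workload."""
    scores = {}
    for name, past in repository.items():
        distances = []
        for metric in target:
            # Deciles are computed over all observed values of this metric.
            pool = target[metric] + [v for w in repository.values() for v in w[metric]]
            t = decile_bins(target[metric], pool)
            p = decile_bins(past[metric], pool)
            distances.append(math.dist(t, p))   # Euclidean distance
        scores[name] = sum(distances) / len(distances)
    return min(scores, key=scores.get), scores

# Hypothetical data: metric -> one value per configuration (same three configurations).
repository = {
    "past_oltp": {"latency": [10, 12, 30], "pages_read": [100, 120, 400]},
    "past_olap": {"latency": [200, 180, 90], "pages_read": [5000, 4800, 2500]},
}
target = {"latency": [11, 13, 28], "pages_read": [110, 130, 380]}

best, scores = map_workload(target, repository)
print(best, scores)
```

Binning into deciles first keeps metrics with very different magnitudes (milliseconds versus page counts) from dominating the averaged distance.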

37 Recommendation – Gaussian Process Regression
Gaussian process regression provides:
- Confidence intervals
- A tradeoff between exploration and exploitation
Exploration: gathering information to improve the model.
- Search unknown areas of the GP.
- Useful for getting more data.
- Helps identify configurations with knob values beyond the limits tried in the past.
Exploitation: greedily trying to do well on the target objective.
- Select a configuration similar to the best configuration found in the GP.
- Make slight modifications to previously known good configurations.

38 Configuration Recommendation
Aim: Recommend configurations so as to optimize the target metric over the entire tuning session. Reuse data from the similar workload to train a statistical model, and use it to choose the next configuration to try.
Preprocessing:
- Ensure features are continuous and on the same scale.
- Encode categorical features with dummy variables.
Steps:
1. Use data from the mapped workload to train a GP model.
2. Update the model by adding the observed metrics from the target workload.
3. Predict the latency and variance for a sample set of possible configurations.
4. Choose the next configuration to try, trading off exploration against exploitation: pick the configuration with the greatest expected improvement.
Gradient descent is used for the search. Its initialization set mixes top-performing and random configurations in a 1:10 ratio.
Over time, the expected improvement in the model’s predictions drops as the number of unknown regions decreases.
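To make "greatest expected improvement" concrete, here is a minimal sketch of the standard expected-improvement formula for a metric that should be minimized (such as latency). The candidate names and the predicted means and standard deviations are invented stand-ins for GP predictions; no GP is trained here.

```python
import math

def expected_improvement(mean, std, best):
    """Expected improvement over `best` when lower values are better."""
    if std <= 0.0:
        return max(best - mean, 0.0)
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal CDF
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal PDF
    return (best - mean) * cdf + std * pdf

best_seen = 10.0   # hypothetical best latency observed so far

# Hypothetical GP predictions: candidate -> (predicted latency, predicted std dev).
candidates = {
    "near_best":  (9.5, 0.2),    # exploitation: small gain, low uncertainty
    "unexplored": (12.0, 6.0),   # exploration: worse mean, high uncertainty
    "known_bad":  (20.0, 0.5),   # confidently worse
}
ei = {name: expected_improvement(m, s, best_seen) for name, (m, s) in candidates.items()}
choice = max(ei, key=ei.get)
print(choice, ei)
```

With these numbers the uncertain candidate wins despite its worse predicted mean, which is exploration. As more of the space is observed and predicted uncertainty shrinks, candidates close to the known best start to win instead.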

39 Experimental Evaluation - Setup
DBMSs: MySQL (v5.6), Postgres (v9.3), Vector (v4.2)
Platform: Amazon EC2 Spot Instances (two instances)
- OtterTune’s controller: m4.large (4 vCPUs and 16 GB RAM)
- Target DBMS deployment: m3.xlarge (4 vCPUs and 15 GB RAM)
OtterTune’s tuning manager and data repository run on a local server with 20 cores and 128 GB of RAM.

40 Experimental Evaluation
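Workloads (database sizes: ~18 GB, ~18 GB, ~20 GB, ~10 GB):
- Fixed observation period of 5 min, with 99th %-tile latency as the target metric
- Variable observation period, with total execution time as the target metric
Training data collection:
- 15 YCSB and 4 TPC-H workload mixtures
- ~30k trials per DBMS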

41 Experimental Evaluation
Influence of the number of knobs used on performance: the incremental approach works best for all DBMSs.
Before starting the evaluation, training data was obtained to bootstrap OtterTune’s repository.

42 Experimental Evaluation
Tuning Evaluation
- The tools use different training data: iTuned uses 10 DBMS configurations obtained by sampling.
- The results demonstrate that continuously integrating new training data helps performance.
- OtterTune works much better than iTuned on OLTP workloads, but performs similarly on OLAP workloads.
- Vector is less permissive about the values that the tuning tools are allowed to set for its knobs.

43 Experimental Evaluation
- Vacuum command executed
- Data unloading and reloading in memory
- Fixed observation period of 5 min

44 Experimental Evaluation
Efficacy Evaluation. Configurations compared:
- Default: the configuration provided by the DBMS
- Tuning script: the configuration generated by an open-source tuning advisor tool
- DBA: the configuration chosen by a human DBA
- RDS: the configuration customized for the DBMS that is managed by Amazon RDS and deployed on the same EC2 instance type
Checkpointing caveat.
[Figures: results for MySQL and Postgres]

45 Conclusion
OtterTune leverages machine learning techniques and data from past configurations to:
- find good configurations for a larger number of knobs.
- identify dependencies between knobs.
- recommend configurations.
But it still needs a DBA.
Thank You

