CoPhy: A Scalable, Portable, and Interactive Index Advisor for Large Workloads
Debabrata Dash, Anastasia Ailamaki, Neoklis Polyzotis

Presentation transcript:

High Cost of DB Tuning
Enterprises spend a lot on DBMS
Need to reduce administration and tuning cost
Sources: MS Azure, Forrester Research; 2010 oracle.com/us/products/database/ pdf

A New Approach to Index Tuning
Index Tuning: Select indexes that maximize performance
                    Existing Approaches     CoPhy
Portability         No                      Yes
Scalability         Sampling/pruning        Yes
Generality          No                      Yes
Quality Feedback    Not with constraints    Yes
Interactivity       No                      Yes
CoPhy: Convert to a compact Binary Integer Program (BIP). Solve using mature solvers.

Outline
Introduction
BIP formulation
–Existing formulation
–Discovering structure
–Exploiting the structure
–Benefits
Experimental Results
Conclusion

Index Tuning Problem
[Figure: a workload (T1 join T2), candidate indexes I1 to I4 on tables T1 and T2, and constraints are the inputs to index tuning; the output is the optimal indexes]

Existing Approaches
[Figure: Index Advisor, What-If DBMS Optimizer, Fast What-If Optimizer [INUM07, C-PQO08]]
Greedy approaches
–Bottom-up [CN97, VZ+00]
–Top-down [BC05]
BIP-based approach [CS96, PA07]

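Existing BIP
[Figure: tables T1 and T2 with candidate indexes I1 to I4; variables x1 to x4 in {0,1}; costs t1, t2]
[Equations not recoverable; the surviving labels are listed]
–Min. cost (objective over atomic configurations and their corresponding costs)
–Select one atomic conf.
–Index presence
–All variables in {0,1}
Program size = O(# T1 Indexes x # T2 Indexes)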

CoPhy vs. Existing Approaches
[Figure: architecture comparison; components shown are Index Advisor, What-If DBMS Optimizer, Fast What-If Optimizer [INUM07, C-PQO08], and CoPhy]

Fast What-If: INUM
A template plan can be reused for many index combinations
[Figure: the What-If Optimizer produces a plan for T1 Join T2; replacing its index accesses with place holders gives a template plan, which is instantiated for index combinations such as (I1, I3) and (I1, I4)]
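The reuse idea on this slide can be sketched as follows. This is a minimal illustration, not INUM's actual implementation: the `TemplatePlan` class, the `instantiate` function, and all cost numbers are hypothetical. It assumes a template plan has a fixed internal cost plus one place holder per table, and that filling a place holder with an index adds that index's access cost.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TemplatePlan:
    """Hypothetical template: a fixed internal cost plus one place holder per table."""
    internal_cost: int   # cost of joins etc., independent of the chosen indexes
    slots: tuple         # tables whose access method is left as a place holder

# Hypothetical access costs for filling a table's place holder with an index.
ACCESS_COST = {
    ("T1", "I1"): 30, ("T1", "I2"): 50,
    ("T2", "I3"): 20, ("T2", "I4"): 10,
}

def instantiate(template, config):
    """Cost of the template once every place holder is filled from config."""
    return template.internal_cost + sum(
        ACCESS_COST[(table, config[table])] for table in template.slots
    )

# One optimizer call yields the template; every further combination is arithmetic.
template = TemplatePlan(internal_cost=100, slots=("T1", "T2"))
cost_i1_i3 = instantiate(template, {"T1": "I1", "T2": "I3"})
cost_i1_i4 = instantiate(template, {"T1": "I1", "T2": "I4"})
```

Under these assumptions, costing a new index combination needs only dictionary lookups and additions rather than another optimizer invocation.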

Cost Structure
Linear Composability of Query Costs
[Equation not recoverable; the surviving labels are: atomic configuration A, cost of template plan under A, cost of optimal plan under A]
Linear Composability is exhibited by both INUM, C-PQO
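One plausible way to write the property is given below. The symbols are assumptions, since the slide's equation did not survive: $K$ is the number of template plans for query $q$, $\beta_k$ is the internal cost of template plan $k$, and $\gamma_{k,a}$ is the access cost of index $a$ when it fills a place holder in plan $k$.

```latex
\[
  \underbrace{\mathrm{cost}(q, A)}_{\text{cost of optimal plan under } A}
  \;=\;
  \min_{1 \le k \le K}\;
  \underbrace{\Bigl( \beta_k + \sum_{a \in A} \gamma_{k,a} \Bigr)}_{\text{cost of template plan } k \text{ under } A}
\]
```

Read this way, the cost of each template plan is linear in the indexes of the atomic configuration $A$, and the optimal plan is simply the cheapest template.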

Exploiting Linear Composability
Exposing the cost model leads to linearly growing BIPs
[Figure: tables T1 and T2 with candidate indexes I1 to I4; variables x1 to x4; costs t1, t2]
Program size = O(# T1 Indexes + # T2 Indexes)
BIP Solver explores the index combinations with the knowledge of the objective
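The size difference can be illustrated with a small sketch. All names and numbers here are hypothetical, and it assumes a single template plan whose cost is an internal cost `beta` plus one access cost `gamma[table][index]` per table. With no constraints, the additive structure lets the cheapest index be chosen per table, which gives the same answer as costing every pair. Once constraints are added, a BIP solver is still needed to search the combinations.

```python
from itertools import product

# Hypothetical costs: beta is the internal cost of one template plan,
# gamma[table][index] is the access cost when that index fills the table's slot.
beta = 100
gamma = {"T1": {"I1": 30, "I2": 50}, "T2": {"I3": 20, "I4": 10}}

def program_sizes(n_t1, n_t2):
    """Return (terms when enumerating index pairs, variables with one per index)."""
    return n_t1 * n_t2, n_t1 + n_t2

def best_by_enumeration(beta, gamma):
    """Cost every pair of indexes explicitly, as an enumerated formulation would."""
    tables = list(gamma)
    best_cost, best_choice = None, None
    for combo in product(*(gamma[t] for t in tables)):
        cost = beta + sum(gamma[t][i] for t, i in zip(tables, combo))
        if best_cost is None or cost < best_cost:
            best_cost, best_choice = cost, dict(zip(tables, combo))
    return best_cost, best_choice

def best_by_composition(beta, gamma):
    """Use the additive cost structure: pick the cheapest index per table."""
    choice = {t: min(costs, key=costs.get) for t, costs in gamma.items()}
    return beta + sum(gamma[t][i] for t, i in choice.items()), choice
```

For 50 candidate indexes on each table, `program_sizes(50, 50)` returns 2500 enumerated pairs against 100 per-index variables.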

More Complex BIPs
We extend the BIP to handle:
Complex queries
Update costs
Complex Constraints [Bruno08]:
–Storage constraint
–Index constraints
–Column constraints
–Generators
–Soft constraints
BIP formulation does not restrict the expressive power of the DBA

CoPhy’s Architecture
[Figure: inputs are the Workload and Constraints; CoPhy contains the Candidate Generator, INUM (backed by the What-If DBMS Optimizer), the BIP Generator, and the BIP Solver; outputs are the Selected Indexes and Bounds]
Theorem: CoPhy computes an optimal index configuration

Unique Features Enabled by the BIP
Portability: No change to the optimizer
–Requires only the what-if APIs
Scalability: By solving large BIPs in seconds
–No need to select workload, candidate indexes
Generality: The formulation can be reused
Quality feedback: All modern BIP solvers provide this
–Can stop at near-optimal values
Interactive tuning: By solving BIPs incrementally
–Interactively add/drop candidate indexes
–Enables efficient multi-objective optimization
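The "stop at near-optimal values" point can be illustrated with a relative optimality gap. This is a hedged sketch: the function names and the 5% tolerance are assumptions, and individual solvers define and report their gap in slightly different ways. It assumes a minimization problem in which the solver reports the cost of the best configuration found so far and a proven lower bound on the optimal cost.

```python
def relative_gap(incumbent_cost, lower_bound):
    """Relative distance between the best solution found and the proven lower bound."""
    if incumbent_cost == 0:
        return 0.0 if lower_bound == 0 else float("inf")
    return (incumbent_cost - lower_bound) / abs(incumbent_cost)

def can_stop(incumbent_cost, lower_bound, tolerance=0.05):
    """True when the incumbent is provably within `tolerance` of the optimum."""
    return relative_gap(incumbent_cost, lower_bound) <= tolerance
```

A DBA who accepts a configuration within 5% of optimal could stop the solver as soon as `can_stop` returns true, rather than waiting for a proof of optimality.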

Experimental Setup
Two commercial DBMS: System A, System B
1 GB TPC-H database
1 GB index size constraint
Algorithms:
–Tool A, Tool B – the commercial designers
–ILP – The state of the art BIP [PA07]
–CoPhy A, CoPhy B – Our approach on the systems
Queries generated using 15 TPC-H templates
Metric: [formula not recoverable]

Speedup Comparison
[Figure: speedup vs. # of queries, one panel each for System A and System B]
# Candidates: CoPhy ~2000, Tool A ~200, Tool B ~50
Replacing heuristic algorithms improves savings
Using a larger set of candidates also helps

Tool Execution Time Comparison
[Figure: tool execution time vs. # of queries, one panel each for System A and System B]
Scalable index tuning eliminates the workload selection problem

Conclusion
Index tuning using a novel compact BIP
–Generic, scalable, efficient, and high quality
–Quality feedback
–Incremental index selection
–Multi-objective optimization
Future Work:
–Incorporating other workload types
–Applying the approach to other tuning problems

Backup Slides

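BIP for Multiple Plans
[Figure: tables T1 and T2 with candidate indexes I1 to I4; variables x 23, x 24, x 25, x 26]
[Equations not recoverable; the surviving labels are listed]
–Minimize cost
–One plan per query
–Matching logic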

More Complex BIPs: Storage constraint
[Equations not recoverable; the surviving labels are listed]
–Build indexes when used
–Size under a fixed constant
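The two labelled constraints can be sketched on a toy instance. Everything here is hypothetical: the index sizes, costs, budget, and function names. Exhaustive enumeration of index subsets stands in for a BIP solver, which is only feasible at this tiny scale. A plan may use an index only if it is built, and the total size of the built indexes must stay under the budget.

```python
from itertools import combinations

# Hypothetical instance: index sizes, one template plan with internal cost BETA,
# and per-table access costs (None means no index, i.e. a full scan).
SIZES = {"I1": 4, "I2": 2, "I3": 3, "I4": 5}
BETA = 100
GAMMA = {
    "T1": {None: 80, "I1": 30, "I2": 50},
    "T2": {None: 60, "I3": 20, "I4": 10},
}

def query_cost(built):
    """Cheapest plan when only the indexes in `built` may be used."""
    return BETA + sum(
        min(cost for idx, cost in costs.items() if idx is None or idx in built)
        for costs in GAMMA.values()
    )

def best_configuration(budget):
    """Enumerate every index subset whose total size fits within the budget."""
    names = sorted(SIZES)
    best_cost, best_set = None, None
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            if sum(SIZES[i] for i in subset) > budget:
                continue  # violates "size under a fixed constant"
            cost = query_cost(set(subset))
            if best_cost is None or cost < best_cost:
                best_cost, best_set = cost, frozenset(subset)
    return best_cost, best_set
```

With a budget of 6, the cheapest pair (I1, I4) does not fit, so the search settles on a smaller pair instead. In a BIP the same logic would be expressed with binary build variables and linear constraints.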

CoPhy vs. FLP
Offloading the search process to the solver improves both the problem construction and solving times