1 Rethinking Database System Architecture: Towards a Self-Tuning RISC-style Database System
Surajit Chaudhuri, Microsoft Research, Redmond, USA
Gerhard Weikum, University of the Saarland, Saarbruecken, Germany
2 Conclusion
Problem: DBMS technology is packaged monolithically
– too many features, too much complexity
Solution: RISC-style simplification and componentization
– break up DBMS into layered packages with narrow APIs and self-tuning capabilities
– compose appropriate packages into a broader range of IT applications
Think globally, fix locally
3 Outline Analysis Role Models for New Departure Proposal
4 Passing of a Dream
Old World: DBMS at center of the universe (payroll, inventory, order entry)
New World: multi-tier architecture with many custom "data managers" (DBMS, ERP, dot com, Web server, Mining)
5 Why Did This Happen?
Universality of DBMS was a leap of faith
SQL is unnatural and complex
– Yet another failed example of the transparency trap
Featurism has turned into a curse
– Excessive bundling
– Performance is unpredictable
– (Auto-)tuning is a nightmare
Unacceptable GPR (gain/pain ratio) for app system architects
6 Example of Poor GPR: DBMS Query Processor
Yet another indexing smart added
Yet another join method added
Yet another transformation rule added
Optimizer designers will admit
– It is unpredictable
– Hard to abstract principles
ERP/Mining/etc. attempt to outsmart QP
Turning into black magic
– Cannot educate next generation of engineers
7 Role Models for New Departure
Ex. 1: Aircraft with many subsystems (engine, fuselage, electrical control, etc.)
Ex. 2: RISC hardware
No single engineer understands entire system
Local theories for individual subsystems and reasonable understanding of interactions
– Few points of interaction with stable and narrow interfaces
– Built-in system support for debugging subcomponents (incl. performance)
8 RISC Philosophy for DBMS
DBMS technology must be packaged as components with simplified functionality
Enforce
– Layered approach
– Strong limits on interaction (narrow APIs)
– Multiple consumers for a component
Components must have manageable complexity to be desirable for their potential consumers
Encapsulation must include predictable performance and self-tuning
9 Why Predictability is Crucial
From best-effort to guaranteed performance
"Our ability to analyze and predict the performance of the enormously complex software systems... are painfully inadequate" (PITAC Report)
Downtime is very expensive ($100K/min)
Very slow servers are like unavailable servers
Tuning for peak load requires predictability of workload config performance function
Self-tuning requires mathematical models
– Feasible at component scale
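As an illustration of what a mathematical model "at component scale" could look like, here is a minimal sketch that treats one component as an M/M/1 queue. The queueing model, the function names, and the example numbers are assumptions chosen for illustration; the slides do not name a specific model.

```python
def mm1_mean_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean response time of an M/M/1 queue: R = 1 / (mu - lambda).

    Both rates are in requests per second. The M/M/1 assumption
    (Poisson arrivals, exponential service, one server) is illustrative.
    """
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("rates must be non-negative (service rate positive)")
    if arrival_rate >= service_rate:
        raise ValueError("component is overloaded: no steady state exists")
    return 1.0 / (service_rate - arrival_rate)


def max_arrival_rate(service_rate: float, target_response_time: float) -> float:
    """Largest arrival rate that still meets a mean response-time target.

    Solves 1 / (mu - lambda) <= target for lambda, clamped at zero.
    """
    if target_response_time <= 0:
        raise ValueError("target response time must be positive")
    return max(0.0, service_rate - 1.0 / target_response_time)


if __name__ == "__main__":
    # Hypothetical component that serves 100 requests/s.
    print(mm1_mean_response_time(arrival_rate=80.0, service_rate=100.0))
    print(max_arrival_rate(service_rate=100.0, target_response_time=0.05))
```

A component that exposes such a function lets its consumer check a peak-load target before deployment instead of discovering the limit in production.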
10 Internal Server Error. Our system administrator has been notified. Please try later again. Check Availability (Look-Up Will Take 8-25 Seconds)
11 RISC-style Engine (Components)
Level 1 (base layer): SPJ only
– only B-trees, with automatic index selection built-in
– API includes prioritization & exec. time prediction
Level 2: Support for aggregation
– Uses level 1 with narrow API
– Self-tuning for aggregation considerations
Level 3: Full-fledged SQL
Layering sacrifices performance for manageability
Design principles for components:
– include only functionality that is self-tuning
– apply Occam's razor for internal alternatives
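The layering on this slide can be sketched in a few lines. The following is a toy illustration only: every class, method, and parameter name is hypothetical, tables are in-memory lists rather than B-trees, joins are omitted for brevity, and the cost model is a made-up constant per row.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class SPJQuery:
    """Hypothetical request to the level 1 engine (join part omitted)."""
    table: str
    columns: List[str]
    predicate: Callable[[Row], bool]
    priority: int = 0  # part of the narrow API; this sketch does not schedule by it


class Level1Engine:
    """Base layer: selection and projection over in-memory tables."""

    def __init__(self, tables: Dict[str, List[Row]], seconds_per_row: float = 1e-6):
        self._tables = tables
        self._seconds_per_row = seconds_per_row  # assumed cost constant

    def predict_seconds(self, query: SPJQuery) -> float:
        # Toy prediction: a full scan costs time proportional to table size.
        return len(self._tables[query.table]) * self._seconds_per_row

    def execute(self, query: SPJQuery) -> List[Row]:
        return [
            {col: row[col] for col in query.columns}
            for row in self._tables[query.table]
            if query.predicate(row)
        ]


class Level2Engine:
    """Aggregation layer that uses level 1 only through its public methods."""

    def __init__(self, base: Level1Engine):
        self._base = base

    def predict_seconds(self, query: SPJQuery) -> float:
        # Aggregation cost is ignored here; only the base scan is predicted.
        return self._base.predict_seconds(query)

    def sum_by(self, query: SPJQuery, group_col: str, value_col: str) -> Dict[object, float]:
        totals: Dict[object, float] = {}
        for row in self._base.execute(query):
            key = row[group_col]
            totals[key] = totals.get(key, 0.0) + float(row[value_col])
        return totals
```

The point of the sketch is the shape of the interface: level 2 never touches level 1's tables directly, and both levels expose an execution-time prediction alongside the query call.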
12 RISC in the Large
Use level 1 engine (SPJ, or merely record and index managers) for MP3 repository, simple E-service, etc.
Use level 2 engine (SPJ + aggregation) for OLAP or ERP
Use level 3 engine (SQL) for full-fledged DW, legacy apps
Composition principles for IT solutions in the large:
– Choose least-complexity components
– IT solution can rely on predictable/guaranteed performance of components
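The "choose least-complexity components" principle can be read as a simple selection rule: pick the lowest engine level whose feature set covers what the application needs. The sketch below assumes hypothetical feature names for each level; the slides define the levels but not a feature vocabulary.

```python
from enum import IntEnum
from typing import Dict, FrozenSet, Iterable


class EngineLevel(IntEnum):
    SPJ = 1          # level 1: select-project-join
    AGGREGATION = 2  # level 2: SPJ + aggregation
    FULL_SQL = 3     # level 3: full-fledged SQL


# Hypothetical feature vocabulary, chosen for illustration only.
_SPJ = frozenset({"select", "project", "join"})
_AGG = _SPJ | {"aggregate"}
_SQL = _AGG | {"subquery", "view", "trigger"}

FEATURES: Dict[EngineLevel, FrozenSet[str]] = {
    EngineLevel.SPJ: _SPJ,
    EngineLevel.AGGREGATION: _AGG,
    EngineLevel.FULL_SQL: _SQL,
}


def least_complex_level(required: Iterable[str]) -> EngineLevel:
    """Return the lowest engine level whose features cover `required`."""
    needed = frozenset(required)
    for level in sorted(FEATURES):
        if needed <= FEATURES[level]:
            return level
    unknown = needed - FEATURES[EngineLevel.FULL_SQL]
    raise ValueError(f"no engine level supports: {sorted(unknown)}")


if __name__ == "__main__":
    print(least_complex_level({"select", "project"}).name)
    print(least_complex_level({"select", "aggregate"}).name)
```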
13 Implications of RISC Approach
Need for Universal Glue for components
– COM/Universal Runtime and EJB
Simplicity is key
– Eliminate all second-order optimizations
Restrict alternatives
– Not yet another join method or transformation rule
– Don’t abuse extensibility!
14 Road Map
Demonstrate “plug and play” light-weight data servers for various scenarios (API and guaranteed performance):
– MP3 repositories
– OLAP server
– Metadata manager
Open source “bazaar”?
15 Potential Caveats and Rebuttals
We’ve been down this road before!
– But we now have a better understanding of the appropriate components and APIs.
We will lose performance!
– But we win in terms of predictability and overall GPR.
There is no business incentive!
– As industries mature, predictability and manageability do matter for long-term benefit.