Performance Issues in Adaptive Query Processing
Fred Reiss, U.C. Berkeley Database Group

What Is an Eddy?
[Diagram: Static Plan. Labels: Query, Optimizer, Operator, Output]

What Is an Eddy?
[Diagram: Adaptive Plan. Labels: Query, Eddy, Operator, Tuple Pool, Routing Policy, Output]

Eddy Performance Agenda
Goal: Replace the query optimizer
Mechanism
– Make adaptivity cheap
Policy
– Simple and effective adaptive routing policy

Mechanism
Goal: make adaptivity cheap
Minimize overhead
– Use batching to amortize decisions
Offset overhead
– Use the tuple pool to cluster similar tuples
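
The batching idea above can be illustrated with a minimal sketch. It is not taken from the slides: the function name `route_in_batches`, the `choose_order` callback, and the modelling of operators as simple filter predicates are all assumptions made for illustration. The point is only that one routing decision is made per batch instead of per tuple, so the decision cost is spread over `batch_size` tuples.

```python
from typing import Callable, List, Sequence, Tuple

Predicate = Callable[[dict], bool]  # hypothetical operator model: a filter


def route_in_batches(
    tuples: Sequence[dict],
    operators: Sequence[Predicate],
    choose_order: Callable[[Sequence[Predicate]], List[Predicate]],
    batch_size: int,
) -> Tuple[List[dict], int]:
    """Filter tuples through all operators, deciding the order once per batch.

    Returns the surviving tuples and the number of routing decisions made.
    """
    output: List[dict] = []
    decisions = 0
    for start in range(0, len(tuples), batch_size):
        batch = list(tuples[start:start + batch_size])
        order = choose_order(operators)  # one decision for the whole batch
        decisions += 1
        for op in order:
            batch = [t for t in batch if op(t)]
        output.extend(batch)
    return output, decisions


# Usage: ten tuples, two filters, a trivial policy that keeps the given order.
data = [{"a": i, "b": i % 3} for i in range(10)]
ops = [lambda t: t["a"] % 2 == 0, lambda t: t["b"] == 0]
keep_order = lambda operators: list(operators)

per_tuple_out, per_tuple_decisions = route_in_batches(data, ops, keep_order, 1)
batched_out, batched_decisions = route_in_batches(data, ops, keep_order, 4)
```

With `batch_size=1` the sketch makes one decision per tuple. With `batch_size=4` it makes three decisions for the same ten tuples and returns the same result. The trade-off is that a larger batch reacts more slowly to changes in the data.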

Policy
Goal: simplicity
– Want a routing policy that handles a wide variety of situations
Randomized routing
– Lottery scheduling
– Machine learning / statistical models
Hybrid/adaptive routing
– Use a static optimizer as a subroutine
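
As a rough illustration of lottery-scheduling routing, here is a minimal sketch. The slides do not spell out a ticket scheme, so the one used here is an assumption borrowed from the general eddies literature: an operator gains a ticket when it receives a tuple and gives one back when it returns the tuple, so operators that drop many tuples accumulate tickets and tend to be chosen first. The class name `LotteryRouter` and the modelling of operators as filter predicates are hypothetical.

```python
import random
from typing import Callable, Dict, List, Optional


class LotteryRouter:
    """Route each tuple through all operators in a ticket-weighted random order."""

    def __init__(
        self,
        operators: Dict[str, Callable[[dict], bool]],
        seed: Optional[int] = None,
    ) -> None:
        self.operators = operators
        # Every operator starts with one ticket so it can always win a draw.
        self.tickets = {name: 1 for name in operators}
        self.rng = random.Random(seed)

    def pick(self, candidates: List[str]) -> str:
        """Draw one operator, weighted by its current ticket count."""
        weights = [self.tickets[name] for name in candidates]
        return self.rng.choices(candidates, weights=weights, k=1)[0]

    def process(self, tup: dict) -> bool:
        """Return True if the tuple survives every operator."""
        remaining = list(self.operators)
        while remaining:
            name = self.pick(remaining)
            remaining.remove(name)
            self.tickets[name] += 1      # credit: operator received a tuple
            if not self.operators[name](tup):
                return False             # tuple dropped, ticket is kept
            self.tickets[name] -= 1      # debit: operator returned the tuple
        return True


# Usage: one selective filter and one filter that passes everything.
router = LotteryRouter(
    {
        "selective": lambda t: t["x"] % 10 == 0,
        "permissive": lambda t: True,
    },
    seed=0,
)
passed = sum(router.process({"x": i}) for i in range(1000))
```

In this run the selective operator keeps a ticket for every tuple it drops, while the permissive one never rises above its initial ticket. Later tuples are therefore very likely to be routed to the selective operator first, which is the ordering a static optimizer would also prefer for these two filters.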