Adaptive Query Processing for Wide-Area Distributed Data Michael Franklin University of Maryland Joint work with Tolga Urhan, Laurent Amsaleg, and Anthony.

Presentation transcript:

Adaptive Query Processing for Wide-Area Distributed Data Michael Franklin University of Maryland Joint work with Tolga Urhan, Laurent Amsaleg, and Anthony Tomasic

M. Franklin, 3/17/99
Motivation
- The Internet enables access to globally distributed data sources...
- But current search and data access technology is primitive:
  - Discovering relevant sources and data is difficult.
    - Simple text-based searches.
    - Navigation through link clicking.
  - Collecting, aggregating, and manipulating data from multiple sources is not supported.

Databases to the Rescue?
- DB query languages used to be navigational.
- Relational languages are more useful for many tasks.
  - Powerful, and (more or less) declarative.
  - Queries are written without regard to the physical structure, location, etc. of data (data independence).
  - Easily extended to distributed systems.
- DB query languages and optimization techniques have been developed over decades.
- This technology is unavailable to the Internet user.

Distributed Query Processing (QP)
    SELECT eid, ename, title, salary
    FROM Emp, Proj, Assign
    WHERE Emp.eid = Assign.eid
      AND Proj.pid = Assign.pid
      AND Emp.loc <> Proj.loc
- System handles query plan generation & optimization; ensures correct execution.
- Originally conceived for corporate networks.
©1998 Ozsu and Valduriez

QP on the Internet? - Issues
- Semantic Interoperability
  - Wrapper/Mediator Architecture.
  - XML, XMI, CWMI, OLE-DB, ...
- Source Discovery
  - Metadata Repositories and Directories.
- Performance
  - Distributed database technology, caching, etc.
- Responsiveness and Availability
  - Unpredictability: how to build responsive systems?
  - This is the focus of this talk.

Response-time Problems
- Data sources may be unreachable or slow to respond.
- Data delivery may be slower than expected.
- Data delivery may be bursty.
- Data delivery may be interrupted.
Traditional, static query processing approaches cannot cope with such problems at run-time.
Wide-area + wrapped sources => unpredictability

(Our) Proposed Solutions
- Query Scrambling
  - “Reactive Query Execution”
- XJoin
  - A lightweight, fully pipelineable query operator.
- Risk-Aware Query Planning
  - Producing robust plans.
- Exploiting Alternative Sources
  - Mirrors or “not exactly”.
- Relaxing Query Semantics

Query Scrambling - Introduction
- Goal:
  - Overcome limitations of static QP for unexpected delays.
- A Reactive Approach:
  - Start with an optimized plan.
  - Modify the plan on-the-fly if problems are detected.
  - Technique: hide delays by performing other useful work.
- Assumptions:
  - Focus on initial delay.
  - Query processing at client; iterator model (Graefe 93).
  - No replication.

Query Scrambling - Overview
- An iterative algorithm.
- Monitor input and scramble when problems are detected.
- Phase 1: Reschedule “runnable” operators.
- Phase 2: Operator synthesis: create new operators.
[State diagram: Normal Execution and Scrambling (Phase 1, Phase 2); transitions labeled "Source(s) delayed", "Still delayed", and "Source(s) responded".]

Query Scrambling - Example
[Figure: query plans over relations A, B, C, D, and E, showing the initial plan, rescheduling steps, and the creation of new operators.]

Building a Scrambling Engine
- A thread per operator.
- Monitoring and scheduling.
- A “smart” materialization operator.
- Multi-threaded query operators?
[State diagram: operator states Not Started, Active, Stalled, Suspended, Closed; transitions open, done, timeout, data_arrival, de-schedule, resume.]
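
The operator life cycle named on this slide can be sketched as a small state machine. This is only an illustration: the slide text lists the states and the transition labels, but which state each transition leaves and enters is an assumption here (the diagram itself did not survive extraction), and the names `State`, `TRANSITIONS`, `step`, and `run` are hypothetical.

```python
from enum import Enum


class State(Enum):
    NOT_STARTED = "Not Started"
    ACTIVE = "Active"
    STALLED = "Stalled"
    SUSPENDED = "Suspended"
    CLOSED = "Closed"


# ASSUMPTION: the source and target of each labeled transition are guessed;
# the slide text only gives the state names and the transition labels.
TRANSITIONS = {
    (State.NOT_STARTED, "open"): State.ACTIVE,
    (State.ACTIVE, "timeout"): State.STALLED,        # input stopped arriving
    (State.STALLED, "data_arrival"): State.ACTIVE,   # input resumed
    (State.ACTIVE, "de-schedule"): State.SUSPENDED,  # scheduler parks the operator
    (State.SUSPENDED, "resume"): State.ACTIVE,
    (State.ACTIVE, "done"): State.CLOSED,
}


def step(state, event):
    """Return the next state, or raise ValueError for a disallowed event."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} not allowed in state {state.value!r}"
        ) from None


def run(events, state=State.NOT_STARTED):
    """Apply a sequence of events, starting from Not Started by default."""
    for event in events:
        state = step(state, event)
    return state
```

For example, `run(["open", "timeout", "data_arrival", "done"])` walks an operator through a stall and back to completion under the assumed transition table.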

Directing Scrambling [SIGMOD 98]
- Original formulation [PDIS 96] was based on heuristics.
- Demonstrated the ability of QS to hide delays, but was susceptible to making bad choices.
- Query optimizers are able to choose good plans, but how to use an optimizer to do scrambling?
  - Phase I
    - Issue: where to place the materialization operator?
    - Answer: choose the subtree with the best overhead/useful work ratio.
  - Phase II is trickier.

Phase II - Operator Synthesis
- If there are no runnable subtrees, create new ones.
- Needed: an optimizer that 1) is lightweight & incremental, and 2) understands delays.
- Most QP systems optimize for total work.
- But delay is inherently a response-time issue.
- Response-time optimization can “magically” move delayed operators to the “best” point in the plan, but only if it knows the duration of the delay!

Include Delayed (ID) Algorithm
- Invokes the optimizer with a very large delay value.
- Optimizer pushes the delayed relation as far back as is useful.
- Large delay estimation => aggressive.

Estimated Delay (ED) Algorithm
Compromise between aggressive and conservative.
- Initially calls the RT optimizer with a small delay value.
  - Small value = 25% of the RT of the original query.
- Successively increases the delay estimation.
  - 50% and then 100% of the original RT.
- Increasing estimates => adaptive algorithm.
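
The escalation schedule on this slide (25%, then 50%, then 100% of the original response time) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the callbacks `reoptimize` and `source_still_delayed` are hypothetical stand-ins for the response-time optimizer and the delay monitor, and what happens after the 100% estimate is not stated on the slide, so the sketch simply stops there.

```python
from typing import Callable, Iterator, List

# Fractions of the original query's response time (RT), taken from the slide.
ED_FRACTIONS = (0.25, 0.50, 1.00)


def delay_estimates(original_rt: float) -> Iterator[float]:
    """Yield the successive delay estimates handed to the RT optimizer."""
    for fraction in ED_FRACTIONS:
        yield fraction * original_rt


def estimated_delay_scramble(
    original_rt: float,
    reoptimize: Callable[[float], None],       # hypothetical: re-plan with this delay estimate
    source_still_delayed: Callable[[], bool],  # hypothetical: delay monitor
) -> List[float]:
    """Re-plan with growing delay estimates while the source stays delayed.

    Returns the estimates that were actually used. Behaviour beyond the
    100% estimate is not described on the slide; this sketch just stops.
    """
    used = []
    for estimate in delay_estimates(original_rt):
        if not source_still_delayed():
            break
        reoptimize(estimate)
        used.append(estimate)
    return used
```

With an original response time of 200 seconds, the estimates passed to the optimizer would be 50, 100, and 200 seconds, stopping early if the delayed source responds.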

Experimental Environment
- Workload: queries derived from the TPC-D benchmark
  - TPC-D(5), TPC-D(8), TPC-D(9) (1 GB base data)
- Optimizer (built from scratch):
  - Two-phase randomized optimizer
    - Iterative improvement and simulated annealing (Ioannidis 90).
  - Runs as total-work or response-time based (GHK 92).
  - Search space = bushy plans
- Studied algorithms in a simulated environment
  - Network, remote sites, query engine, etc.
  - Subsequently validated with a Predator-based implementation.

National Market Share Query (TPC-D 8)
- Experiments with several memory sizes.
- Delayed relation (Part) is an important relation.
- Used hash joins only.
- Lineitem is the largest relation; Part is a “reducer”.
- Optimizer initially chooses to go left-to-right.
[Figure: join graph over Part, LineItem, Supplier, Nation, Customer, Region, and Order, with numeric edge labels.]

National Market Share Query (large memory)
[Plot: large memory (> 4 MB); labels "Delay" and "No Scramb."]

National Market Share Query (small memory)
- Scrambling becomes more expensive.
- Pair: local decisions, lack of global view.
- IN: poor performance for short delays.
- ED: good for a wide range of delay values.
[Plot: labels "Delay" and "No Scramb."]

Query Scrambling
Summary:
- Traditional static query processing does not scale to the wide-area environment.
- A reactive approach is needed.
- This requires a multi-threaded engine and a scrambling-enabled optimizer.
Experimental Results:
- Dramatic improvements over heuristic algorithms for several of the TPC-D queries.
- Response time-based optimization does better.
- Fundamental tradeoffs arise in the absence of good delay predictions.

XJoin - Improving Responsiveness
- QS can speed up the delivery of the entire answer.
- But its ability to hide delays is limited by the amount of useful work that can be done in the query.
- XJoin is a new query operator that:
  - Produces results incrementally as they become available.
  - Allows progress to be made in highly erratic situations.
  - Has a small memory footprint.
  - Tolerates bursty and slow behavior.

Symmetric Hash Join
- Traditional hash joins block when one input stalls.
- Symmetric Hash Join (SHJ) blocks only if both stall.
- Processes tuples as they arrive from sources.
- Produces all tuples in the join and no duplicates.
[Figure: a hash join with build and probe inputs, next to a symmetric hash join in which Source A and Source B feed Hash Table A and Hash Table B.]
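
The symmetric hash join described on this slide can be sketched in a few lines. This is a minimal in-memory illustration, assuming arrivals are presented as a single interleaved stream of `(source, tuple)` pairs; the function name and the `key_a`/`key_b` parameters are hypothetical, not taken from the talk.

```python
from collections import defaultdict


def symmetric_hash_join(arrivals, key_a, key_b):
    """Join two inputs whose tuples arrive interleaved, in any order.

    `arrivals` yields ("A", tuple) or ("B", tuple). Each arriving tuple is
    first inserted into its own side's hash table and then probed against
    the other side's table, so every matching pair is emitted exactly once:
    at the moment the later of its two tuples arrives.
    """
    table_a = defaultdict(list)
    table_b = defaultdict(list)
    for source, tup in arrivals:
        if source == "A":
            k = key_a(tup)
            table_a[k].append(tup)
            for match in table_b.get(k, []):
                yield (tup, match)
        elif source == "B":
            k = key_b(tup)
            table_b[k].append(tup)
            for match in table_a.get(k, []):
                yield (match, tup)
        else:
            raise ValueError(f"unknown source {source!r}")
```

Because neither input plays a fixed "build" role, a stall on one source does not prevent tuples from the other source from being inserted and matched against what has already arrived.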

Memory Utilization
- As originally specified, SHJ requires both inputs to be memory resident.
- For a complex query, this means all intermediate results must be in memory.
- This is wasteful and can result in thrashing.
- XJoin extends SHJ to allow it to work with limited memory.
- XJoin does for SHJ what “Hybrid Hash” does for HJ.

Partitioning
- XJoin is a partitioned hash join method.
- When allocated memory is exhausted, a partition is flushed to disk.
- Join processing continues on memory-resident data.
- Disk-resident tuples are handled in the background.
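
The flush-on-overflow behaviour can be sketched for one join input as below. This is an illustration under stated assumptions: memory is measured in tuples, "disk" is simulated with in-process lists, and the victim is the largest memory-resident partition (the slide does not say which partition is flushed). The class and method names are hypothetical.

```python
class PartitionedInput:
    """One join input, hash-partitioned, with a bounded in-memory portion."""

    def __init__(self, num_partitions, memory_budget):
        self.memory = [[] for _ in range(num_partitions)]
        self.disk = [[] for _ in range(num_partitions)]  # stand-in for disk files
        self.budget = memory_budget  # ASSUMPTION: budget counted in tuples

    def in_memory_count(self):
        return sum(len(partition) for partition in self.memory)

    def insert(self, key, tup):
        # Make room first if the allocated memory is exhausted.
        if self.in_memory_count() >= self.budget:
            self.flush_largest()
        self.memory[hash(key) % len(self.memory)].append((key, tup))

    def flush_largest(self):
        # ASSUMPTION: the victim is the largest memory-resident partition.
        victim = max(range(len(self.memory)), key=lambda i: len(self.memory[i]))
        self.disk[victim].extend(self.memory[victim])
        self.memory[victim] = []
        return victim
```

After a flush, probes against this input see only the memory-resident portion of each partition; the disk-resident portion is what the later stages of XJoin have to account for.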

The 3 Stages of XJoin
- Stage 1: Symmetric hash join (memory-to-memory)
- Stage 2: Disk-to-memory
  - Separate thread; runs when stage 1 blocks.
  - Stages 1 and 2 trade off until all input has been received.
- Stage 3: Clean-up stage
  - Stage 1 misses pairs that were not in memory concurrently.
  - Stage 2 misses pairs when both are on disk, and may not get to run to completion.

XJoin - Details
- The asynchronous/multi-threaded nature of XJoin combined with its small footprint allows it to be fully pipelined, but…
- Duplicate result tuples can be introduced during stages 2 and 3.
- These are detected using timestamps.
  - Separate mechanisms are used for detecting pairs matched in the first and second stages.
- Second stage can be further optimized, at the expense of a bit of memory and some additional duplicate detection.
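
One way timestamps can detect pairs already produced by the first stage is sketched below. The slide does not give the mechanism, so this is an assumption: each tuple records when it arrived in memory and when it was flushed to disk, and two tuples are taken to have met in stage 1 exactly when those residency intervals overlap. Timestamps are assumed distinct, and the names `StampedTuple` and `joined_in_stage1` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class StampedTuple:
    key: int
    arrival: float                # time the tuple entered memory
    departure: float = math.inf   # time it was flushed to disk (inf = still in memory)


def joined_in_stage1(a: StampedTuple, b: StampedTuple) -> bool:
    """True if a and b were memory-resident at the same time.

    ASSUMPTION: stage 1 matches a pair exactly when their memory residency
    intervals overlap, so a later stage should skip such a pair to avoid
    emitting a duplicate result.
    """
    return a.arrival < b.departure and b.arrival < a.departure
```

A later stage would emit a matching pair only when `joined_in_stage1` returns `False`. Pairs produced by the second stage need a separate check, as the slide notes, and that check is not sketched here.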

XJoin - Performance
- We implemented XJoin in our multi-threaded version of the PREDATOR ORDBMS (from Cornell).
- We modeled network delays using traces obtained from accessing sites across the Internet.
  - Replaying these traces provides repeatable results.
- Focus on a “slow” (24.1 KB/sec) and a “fast” (132.8 KB/sec) trace; both exhibit bursty behavior.
- Workload is simple join queries on Wisconsin Benchmark relations.

Results - 2-Way Joins (Time in seconds to nth tuple)
[Four plots: Fast Build, Slow Probe; Slow Build, Slow Probe; Fast Build, Fast Probe; Slow Build, Fast Probe. Series in each plot: XJoin, XJ-2, H.]

Results - “Stress Test”
[Plot: delivery times (in seconds); labels SLOW and FAST.]

XJoin - Summary
- XJoin is a non-blocking, small-footprint join operator.
- It is multi-threaded, consisting of three stages.
  - These stages allow XJoin to make progress when input blocks, but they can introduce duplicates.
- XJoin is optimized for streaming results to users as fast as they are created.
- Similar to QS, XJoin hides delays with useful work, but at the operator level rather than at the plan level.
- Experiments showed an order-of-magnitude or greater improvement in time to get the initial results.

Future Work
- Investigating the properties of query plans that make them robust in the presence of network problems.
  - Will use these properties in the objective function for query optimization.
- Next step is to use alternative, but not necessarily equivalent, sources.
- Further progress will involve relaxing the guarantees on semantics that the query system provides.
  - The WWW has shown us that users will accept this!

Conclusions
- Current Internet querying and data manipulation capabilities are too limited.
  - Unexpressive, too coarse grained, etc.
  - Do not support manipulating data from multiple sites.
- Distributed querying technology addresses these concerns but is not applicable on the Internet.
- A key concern is unpredictability.
  - Query Scrambling is a reactive execution approach.
  - XJoin is a pipelined operator that streams answers.
  - Lots more interesting work to be done in this area.