SIGMOD'06: Run-Time Operator State Spilling for Memory Intensive Long-Running Queries. Bin Liu, Yali Zhu and Elke A. Rundensteiner, Database Systems Research.

Presentation transcript:

SIGMOD'06
Run-Time Operator State Spilling for Memory Intensive Long-Running Queries
Bin Liu, Yali Zhu and Elke A. Rundensteiner
Database Systems Research Laboratory, Worcester Polytechnic Institute

Motivating Example
[Figure: a Real-Time Data Integration Server receives stock prices, volumes, ..., reviews, external reports, news, ... and feeds a Decision Support System and other decision-making applications]
- Produce as many results as possible at run-time (i.e., 9:00am-4:00pm)
- Require complete query results (i.e., for offline analysis after 4:00pm)
Analyze the relationship among stock prices, reports, and news?
Complex queries such as multi-joins are common, for example an equi-join of stock prices, reports, and news on stock symbols.

Challenges
As many run-time results as possible
- Demands main-memory-based query processing
Push-based processing with complex queries
- Demands main memory space to store operator states
- Operator states may monotonically increase over time
[Figure: join operator over inputs A and B, holding State A and State B and emitting joined tuples AB]
Run-time main memory overflow?
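To make the state-growth concern concrete, here is a minimal Python sketch of a push-based symmetric hash join (the operator type named later in the experimental setup). The class and method names are hypothetical and are not taken from the CAPE code base. Every pushed tuple is stored, so without windows or spilling the state only grows.

```python
from collections import defaultdict


class SymmetricHashJoin:
    """Two-input push-based equi-join that keeps every tuple it has seen."""

    def __init__(self):
        # One hash table (operator state) per input, keyed by join value.
        self.state = {"A": defaultdict(list), "B": defaultdict(list)}

    def push(self, side, key, payload):
        """Store the tuple in its own state, probe the opposite state,
        and return the newly produced join results."""
        other = "B" if side == "A" else "A"
        self.state[side][key].append(payload)
        matches = self.state[other].get(key, [])
        if side == "A":
            return [(payload, m) for m in matches]
        return [(m, payload) for m in matches]

    def state_size(self):
        """Total number of tuples held in both states."""
        return sum(len(v) for table in self.state.values() for v in table.values())


join = SymmetricHashJoin()
out = []
out += join.push("A", 1, "a1")
out += join.push("B", 1, "b1")
out += join.push("A", 1, "a2")
print(out)                # [('a1', 'b1'), ('a2', 'b1')]
print(join.state_size())  # 3
```

Each call to `push` adds one tuple to the state and never removes any, which is the monotonic increase the slide warns about.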

Problem: Memory Overflow
High demand on main memory:
- High input rates and large windows result in huge states
- Bursty streams cause temporary accumulation of tuples
- Long-running queries exhibit monotonic state increases
Potential solutions:
- Query optimization
- Distributed processing
- Load shedding
- Memory management

State Spill
Push operator states temporarily to disk
- Spilled operator states are temporarily inactive
- New incoming tuples are processed only against the partial states that remain in memory
[Figure: states of inputs A, B, and C, with part of each state moved to secondary storage]

State of the Art: State Flushing
- Three-staged processing (hash): XJoin [UF00]
- Two algorithms (hash + merge): Hash-Merge Join [MLA04]
- Single-input, distributed environment: Flux [SHCF03]
Observation: single operator focus!

Problem: What about Multi-Operator Plans?
Observation:
- Interdependency among pipelined operators
- Spilling of bottom operators affects their downstream operators!
[Figure: plan in which Join 1 over inputs A, B, and C feeds Join 2 with input D]
Maximize run-time throughput of Join 1!
This increases the memory consumption of Join 2:
- May quickly fill main memory
- May require state spill again
- Causes more work downstream
But states in Join 2 may not contribute to the final output:
- Low selectivity?

Outline
- Basics on state spill
- Plan-level spill strategies
- Experimental evaluation

Granularity: State Partitioning
Divide input streams into a large number of partitions
- At run-time, only need to choose partitions to spill [DNS92, SH03]
- Avoids expensive run-time repartitioning
- Does not affect partitions that are not spilled
Example: 300 partitions; M1 has the odd IDs, M2 has the even IDs
[Figure: Split operators route inputs A, B, and C to Join instances on machines m1 and m2]
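The partitioning example on this slide can be sketched in Python as follows. The modulo hash and the function names are assumptions made for illustration; the slide only states that there are 300 partitions, with odd IDs on M1 and even IDs on M2.

```python
NUM_PARTITIONS = 300  # partition count from the slide's example


def partition_id(join_key: int) -> int:
    """Map an integer join value to a partition ID in 1..NUM_PARTITIONS.
    Modulo hashing is an assumption, not taken from the slides."""
    return join_key % NUM_PARTITIONS + 1


def machine_for(pid: int) -> str:
    """Odd partition IDs live on M1, even IDs on M2, as in the example."""
    return "M1" if pid % 2 == 1 else "M2"


def split(stream_name: str, join_key: int):
    """Route one tuple: returns (machine, partition ID, stream name)."""
    pid = partition_id(join_key)
    return machine_for(pid), pid, stream_name


# Tuples from A, B, and C with the same join value meet in the same partition.
routes = [split(s, 42) for s in ("A", "B", "C")]
print(routes)  # [('M1', 43, 'A'), ('M1', 43, 'B'), ('M1', 43, 'C')]
```

Because the partition ID depends only on the join value, choosing partitions to spill at run-time never requires repartitioning the remaining state.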

Partition Granularity: Choose State?
Multiple states exist from different inputs. Which ones go to disk?
- Select states from one input only
- Select states with the same ID: partition group granularity!
  - Avoids across-machine processing
  - Simplifies spill management
  - Streamlines the cleanup process

Clean Up Stage
Partition groups could be pushed multiple times:
$(P_{A_1}^{0}, P_{B_1}^{0}, P_{C_1}^{0}), (P_{A_1}^{1}, P_{B_1}^{1}, P_{C_1}^{1}), (P_{A_1}^{2}, P_{B_1}^{2}, P_{C_1}^{2}), \ldots, (P_{A_1}^{k}, P_{B_1}^{k}, P_{C_1}^{k})$
The results that have been generated:
$V_0 = P_{A_1}^{0} \bowtie P_{B_1}^{0} \bowtie P_{C_1}^{0}$
$V_1 = P_{A_1}^{1} \bowtie P_{B_1}^{1} \bowtie P_{C_1}^{1}$
$\ldots$
$V_k = P_{A_1}^{k} \bowtie P_{B_1}^{k} \bowtie P_{C_1}^{k}$
Incremental view maintenance algorithm [ZMH+95]
- Treat the multiple join as a materialized view
- Treat partition groups as source updates

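Merge Disk Resident States
To merge two partition groups with the same ID
- i.e., $(P_{A_1}^{0}, P_{B_1}^{0}, P_{C_1}^{0})$ and $(P_{A_1}^{1}, P_{B_1}^{1}, P_{C_1}^{1})$
- $V_0 = P_{A_1}^{0} \bowtie P_{B_1}^{0} \bowtie P_{C_1}^{0}$, $V_1 = P_{A_1}^{1} \bowtie P_{B_1}^{1} \bowtie P_{C_1}^{1}$
After merge
- Combined states: $P_{A_1}^{0} \cup P_{A_1}^{1}$, $P_{B_1}^{0} \cup P_{B_1}^{1}$, $P_{C_1}^{0} \cup P_{C_1}^{1}$
- Final result: $V = (P_{A_1}^{0} \cup P_{A_1}^{1}) \bowtie (P_{B_1}^{0} \cup P_{B_1}^{1}) \bowtie (P_{C_1}^{0} \cup P_{C_1}^{1})$
- Missing results: $\Delta = V - V_0 - V_1$
$V - V_0 = (P_{A_1}^{1} \bowtie P_{B_1}^{0} \bowtie P_{C_1}^{0}) \cup ((P_{A_1}^{0} \cup P_{A_1}^{1}) \bowtie P_{B_1}^{1} \bowtie P_{C_1}^{0}) \cup ((P_{A_1}^{0} \cup P_{A_1}^{1}) \bowtie (P_{B_1}^{0} \cup P_{B_1}^{1}) \bowtie P_{C_1}^{1})$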
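The merge of two generations of one partition group can be checked with a small Python sketch. It is only an illustration under assumptions: tuples are modeled as hypothetical `(key, tuple_id)` pairs, the join is a three-way equi-join on `key`, and the helper names `join3` and `missing_results` are invented here. The sketch computes the missing results with the three-term expansion of V - V0 and then removes V1.

```python
from itertools import product


def join3(a, b, c):
    """Equi-join three lists of (key, tuple_id) pairs on key."""
    return {(x, y, z) for x, y, z in product(a, b, c) if x[0] == y[0] == z[0]}


def missing_results(a0, b0, c0, a1, b1, c1):
    """Results not yet produced for generations 0 and 1 of one partition group."""
    # V - V0, expanded into three terms so that V0 is never recomputed.
    v_minus_v0 = (
        join3(a1, b0, c0)
        | join3(a0 + a1, b1, c0)
        | join3(a0 + a1, b0 + b1, c1)
    )
    # V1 was already output while generation 1 was memory resident.
    return v_minus_v0 - join3(a1, b1, c1)


# Generation 0 was spilled first, generation 1 accumulated afterwards.
a0, b0, c0 = [(1, "a-old")], [(1, "b-old")], [(1, "c-old")]
a1, b1, c1 = [(1, "a-new")], [(1, "b-new")], [(1, "c-new")]

delta = missing_results(a0, b0, c0, a1, b1, c1)
print(len(delta))  # 6
```

With one matching tuple per input and generation, the full result V has 8 tuples; V0 and V1 contribute one each, leaving 6 for the cleanup stage.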

State Spill Strategies

Which Partitions to Push?
Throughput-oriented state spill
- Productivity of a partition group:
  - $P_{output}$: number of output tuples generated from the partition group
  - $P_{size}$: size of the partition group in number of tuples
  - Productivity: $P_{output}/P_{size}$

Globally Choose Partition Groups
Direct extension: local output method
- Rank partitions based on productivity: $P_{output}/P_{size}$
- Choose the globally least productive partitions to spill
[Figure: query plan with Join 1 (inputs A, B, C), Join 2 (input D), and Join 3 (input E); the selected states are spilled to disk]
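A minimal Python sketch of this ranking step follows. The `PartitionGroup` class, the `choose_spill_victims` function, and the sample statistics are hypothetical. The stopping rule (spill until a given fraction of all stored tuples is covered) is an assumption loosely modeled on the "push 30% of states" setting in the experimental setup.

```python
from dataclasses import dataclass


@dataclass
class PartitionGroup:
    operator: str
    pid: int
    p_output: int  # output tuples generated from this group
    p_size: int    # tuples currently stored in this group

    @property
    def productivity(self) -> float:
        return self.p_output / self.p_size


def choose_spill_victims(groups, fraction):
    """Pick the globally least productive groups until `fraction`
    of all stored tuples has been selected for spilling."""
    candidates = [g for g in groups if g.p_size > 0]  # empty groups free nothing
    target = fraction * sum(g.p_size for g in candidates)
    victims, spilled = [], 0
    for g in sorted(candidates, key=lambda g: g.productivity):
        if spilled >= target:
            break
        victims.append(g)
        spilled += g.p_size
    return victims


groups = [
    PartitionGroup("Join 1", 1, p_output=20, p_size=10),
    PartitionGroup("Join 1", 2, p_output=5, p_size=10),
    PartitionGroup("Join 2", 1, p_output=1, p_size=20),
    PartitionGroup("Join 3", 1, p_output=30, p_size=10),
]
victims = choose_spill_victims(groups, fraction=0.3)
print([(g.operator, g.pid) for g in victims])  # [('Join 2', 1)]
```

In this made-up example the single least productive group already covers the 30% target, so only one group is spilled, and it comes from Join 2 rather than from the bottom operator.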

Bottom Up Pushing Strategy
Spill states from bottom operators first
- Minimize number of state spills
- Minimize global state memory
Mimics load shedding.
[Figure: query plan with Join 1 (inputs A, B, C), Join 2 (input D), and Join 3 (input E)]

Bottom Up Pushing Strategy
Spill states from bottom operators first
- Choose partitions from Join 1 until the spill reaches threshold k%
- If not done, choose partitions from Join 2, and so on
Partition selection: randomly or using local productivity
- Minimize intermediate results in upstream operators (memory)
- Minimize number of state spill processes
Fewer spill processes, so higher overall query throughput?
[Figure: query plan with Join 1 (inputs A, B, C), Join 2 (input D), and Join 3 (input E)]
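The bottom-up selection can be sketched in Python as below. This is an illustration under assumptions: the threshold k% is interpreted as a percentage of all stored tuples in the plan, partitions within an operator are ordered by local productivity (the slide also allows random selection), and the function name and sample statistics are hypothetical.

```python
def bottom_up_victims(plan, k_percent):
    """plan: list of (operator, [(pid, p_output, p_size), ...]) ordered from
    the bottom operator upwards. Returns the (operator, pid) pairs to spill."""
    total = sum(size for _, parts in plan for _, _, size in parts)
    target = total * k_percent / 100.0
    victims, spilled = [], 0
    for op, parts in plan:
        # Within one operator, spill the locally least productive partitions first.
        for pid, p_output, p_size in sorted(parts, key=lambda p: p[1] / p[2]):
            if spilled >= target:
                return victims
            victims.append((op, pid))
            spilled += p_size
    return victims


plan = [
    ("Join 1", [(1, 20, 10), (2, 5, 10)]),
    ("Join 2", [(1, 1, 20)]),
    ("Join 3", [(1, 30, 10)]),
]
print(bottom_up_victims(plan, k_percent=30))  # [('Join 1', 2), ('Join 1', 1)]
print(bottom_up_victims(plan, k_percent=50))  # [('Join 1', 2), ('Join 1', 1), ('Join 2', 1)]
```

Note that with these made-up numbers the strategy empties Join 1 before touching the unproductive partition in Join 2, regardless of how productive the Join 1 partitions are.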

Partition Interdependency
A smaller number of spill processes does not guarantee high throughput!
- A partition pushed in a bottom operator may be the parent of productive partitions in its downstream operators
[Figure: partitions $p_1^1$ and $p_1^2$ of OP 1 feeding partitions $p_2^1$ and $p_2^2$ of OP 2]
It may be worthwhile to push $P_2^1$ instead of $P_1^1$!
Global strategy: account for dependency relationships!

"True" Global Output Strategy
$P_{output}$: contribution to the final query output
- Update $P_{output}$ values of partitions in Join 3
- Apply Split 2 to each tuple, find the corresponding partitions in Join 2, and update their $P_{output}$ values
- And so on ...
- Employ a lineage tracing algorithm to update $P_{output}$ statistics
[Figure: query plan with Join 1, Join 2, and Join 3; Split operators on inputs A, B, C, D, and E, plus Split 1 and Split 2 between the joins]
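The tracing idea on this slide can be approximated with the following Python sketch. It is an assumption-heavy illustration, not the authors' lineage tracing algorithm: final output tuples are modeled as dictionaries, the split functions are hypothetical modulo hashes on the join columns listed in the experimental setup (A1 for Join 1, C2 for Join 2, D2 for Join 3), and `NUM_PARTITIONS` is kept small for readability.

```python
from collections import Counter

NUM_PARTITIONS = 4  # small hypothetical value for the example

# Hypothetical split functions: each maps a final output tuple to the ID of
# the partition that produced it inside one join, using that join's column.
SPLITS = {
    "Join 3": lambda t: t["d2"] % NUM_PARTITIONS,
    "Join 2": lambda t: t["c2"] % NUM_PARTITIONS,
    "Join 1": lambda t: t["a1"] % NUM_PARTITIONS,
}


def trace_output(final_tuples):
    """Credit every final output tuple to the partition it came from in each
    join, working from the top join (Join 3) down to the bottom (Join 1)."""
    p_output = {op: Counter() for op in SPLITS}
    for t in final_tuples:
        for op, split in SPLITS.items():
            p_output[op][split(t)] += 1
    return p_output


final_tuples = [
    {"a1": 1, "c2": 5, "d2": 2},
    {"a1": 1, "c2": 6, "d2": 3},
]
stats = trace_output(final_tuples)
print(dict(stats["Join 1"]))  # {1: 2}
print(dict(stats["Join 2"]))  # {1: 1, 2: 1}
print(dict(stats["Join 3"]))  # {2: 1, 3: 1}
```

Here partition 1 of Join 1 is credited with both final results even though it emits only intermediate tuples, which is the difference between a global and a local output count.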

Global Output with Penalty
Incorporate intermediate result sizes
- $P_1^1$: $P_{size} = 10$, $P_{output} = 20$
- $P_1^2$: $P_{size} = 10$, $P_{output} = 20$
Intermediate result factor $P_{inter}$
- Productivity value: $P_{output}/(P_{size} + P_{inter})$
[Figure: partitions $p_1^1$ and $p_1^2$ of OP 1 feeding partitions $p_2^i$ and $p_2^j$ of OP 2]
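A short Python sketch shows how the penalty separates two partitions that look identical under the plain productivity metric. The $P_{size}$ and $P_{output}$ values come from the slide; the `p_inter` values (2 and 10) are hypothetical, chosen only to illustrate the effect.

```python
def productivity(p_output: int, p_size: int, p_inter: int = 0) -> float:
    """Productivity with penalty: P_output / (P_size + P_inter)."""
    return p_output / (p_size + p_inter)


partitions = {
    "P_1^1": {"p_size": 10, "p_output": 20, "p_inter": 2},   # p_inter is hypothetical
    "P_1^2": {"p_size": 10, "p_output": 20, "p_inter": 10},  # p_inter is hypothetical
}

# Without the penalty both partitions have productivity 20 / 10 = 2.0.
# With it, the partition that causes more intermediate results ranks lower.
ranked = sorted(partitions, key=lambda name: productivity(**partitions[name]))
print(ranked)  # ['P_1^2', 'P_1^1']
```

Under these assumed values $P_1^2$ would be spilled first, since it contributes the same final output while occupying more memory downstream.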

Global Penalty: Tracing $P_{inter}$
Penalty $P_{inter}$: contribution to intermediate result sizes
Apply a similar lineage tracing algorithm for $P_{inter}$
[Figure: partitions of OP 1, OP 2, OP 3, and OP 4 linked by the intermediate results they produce]

CAPE System Overview [LZ+05, TLJ+05]
CAPE: Continuous Query Processing Engine
[Figure: architecture diagram with the components Stream Generator, Streaming Data Network, Application Server, End User, Distribution Manager, Global Adaptation Controller, Runtime Monitor, Query Plan Manager, Connection Manager, Repository, Query Processor, Local Adaptation Controller, Local Statistics Gatherer, Data Receiver, and Data Distributor]

Experimental Setup: Queries and Data
- Inputs: A, B, C, D, and E data streams
- Query: Join 1: $A_1 = B_1 = C_1$, Join 2: $C_2 = D_1$, Join 3: $D_2 = E_1$
- Query operators: use symmetric hash join
- Each input stream is partitioned into 300 partitions
- Query is partitioned and run on two machines
- Memory threshold for spill: 60 MB
- Push 30% of states in each state spill
- Average tuple inter-arrival time: 50 ms from each input

Experimental Setup
High-performance PC cluster
- Dual 2.4 GHz CPUs, 2 GB memory, Gigabit network
- 3 machines for Stream Generator, Application Server, and Distribution Manager
- Each Query Processor on a separate machine
Generated data streams with integer join column values
- Data value V appears R times for every K input tuples
  - Tuple Range: K
  - Range Join Ratio: R
- Average Join Rate: average number of tuples with the same join value per input
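One possible reading of the data generation rule is sketched below in Python. This is an interpretation, not the authors' generator: it assumes K is divisible by R, that each range of K tuples contains K/R distinct integer values, and that the order within a range is shuffled. The function name and parameters are hypothetical.

```python
import random
from collections import Counter


def generate_range(k, r, start=0, seed=None):
    """Return the join values of one range of k tuples in which every
    value appears exactly r times (assumes k is divisible by r)."""
    if k % r != 0:
        raise ValueError("this sketch assumes k is divisible by r")
    values = [start + i for i in range(k // r) for _ in range(r)]
    random.Random(seed).shuffle(values)  # arrival order within the range is random
    return values


values = generate_range(k=30, r=3, seed=7)
counts = Counter(values)
print(len(values))           # 30
print(set(counts.values()))  # {3}
```

The call above uses a small range for readability; the experiments on the slides quote a tuple range of 30K with a join ratio of 3.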

Percentage Spilled per Adaptation
Amount of state pushed in each adaptation
- Percentage: # of tuples pushed / total # of tuples
(Input rate: 30 ms/input, tuple range: 30K, join ratio: 3, adaptation threshold: 200 MB)
[Figures: run-time query throughput; run-time main memory usage]

Experiment: Throughput & Memory
Query with average join rate: Join 1: 3, Join 2: 1, Join 3: 1

Experiment: Throughput Comparison
- Query with average join rate: Join 1: 1, Join 2: 3, Join 3: 3
- Query with average join rate: Join 1: 3, Join 2: 2, Join 3: 3

Experimental Summary
- The productivity metric improves run-time throughput
- Global output with penalty is the overall winner
- Global output (with and without penalty) outperforms the alternatives in run-time throughput
- Global output (with and without penalty) has similar (good) cleanup costs
- The bottom-up strategy has the lowest number of adaptations, yet it performs poorly and has high cleanup costs

Related Work
XJoin [UF00], Hash-Merge [MLA04], Flux [SH03]
- Only spill states for one single operator
Load shedding [TUZC03]
- Drops input tuples to handle resource shortage
Continuous query processing [SLJ+05, XZH05, RD04, AC03, BBDM02, CF02, MSH02, CDT00]
- No plan-level state spill

Conclusions
Identified the problem of plan-level spill
- State spill using "productivity" is viable
Proposed plan-level spill policies
- Dependencies considered for multi-operator plans
Evaluated spill policies
- Global spill solutions improve throughput

Questions? Thank you!

Acknowledgments
DSRG students contributed to the CAPE code base, including Luping Ding, Bin Liu, Tim Sutherland, Brad Pielech, Rimma Nehme, Mariana Jbantova, Brad Momberger, Song Wang, and Natasha Bogdanova.
Thanks to the National Science Foundation for partial support via IDM and equipment grants, to WPI for an RDC grant, and to NEC for student support.