Aurora – system architecture Pawel Jurczyk

Currently used DB systems
Classical DBMS:
–Passive repository storing data (HADP model – human-active, DBMS-passive)
–Only the current state of the data is important
–Data is synchronized; queries have exact answers (no support for approximation)
Monitoring applications are difficult to implement in a traditional DBMS:
–Triggers do not scale past a few triggers per table
–Getting the required data from historical time series is problematic
–Developing dedicated middleware is expensive
Conclusion: these systems are ill-suited for applications that alert a human when an abnormal situation occurs, where the expected model is DAHP (DBMS-active, human-passive)

Aurora – main assumptions
–Data comes from multiple, uniquely identified data sources (data streams)
–Each incoming tuple is timestamped
–Aurora is expected to process the incoming streams
–Tuples are transferred through a loop-free, directed graph of operations
–Outputs from the system are presented to applications
–The system maintains historical storage

Aurora system overview
[Figure: input data streams flow through a network of query boxes, with storage, to output data delivered to applications]
–Any box can filter a stream (select operation)
–A box can compute stream aggregates, applying an aggregate function across a window of values in the stream
–The output of any box can be an input for several other boxes (split operation)
–Each box can gather tuples from many inputs (union operation)
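
To make the four operations concrete, here is a minimal Python sketch (not Aurora's actual API), assuming a stream is an iterable of (timestamp, value) tuples; all names are illustrative:

```python
from heapq import merge
from itertools import tee

def filter_box(stream, predicate):
    """Select: pass through only the tuples satisfying the predicate."""
    return (t for t in stream if predicate(t))

def aggregate_box(stream, window, func):
    """Apply an aggregate function across a sliding window of values."""
    buf = []
    for ts, value in stream:
        buf.append(value)
        if len(buf) > window:
            buf.pop(0)
        if len(buf) == window:
            yield ts, func(buf)

def split_box(stream, n=2):
    """Split: feed one box's output to several downstream boxes."""
    return tee(stream, n)

def union_box(*streams):
    """Union: merge tuples arriving on several inputs, ordered by timestamp."""
    return merge(*streams)

stream = [(i, i * 10) for i in range(6)]             # (timestamp, value) tuples
big = filter_box(stream, lambda t: t[1] > 10)        # select
print(list(aggregate_box(big, window=2, func=sum)))  # [(3, 50), (4, 70), (5, 90)]
```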

Aurora query model
–Each connection point (CP) and view should have a persistence specification (e.g. "keep data for 2 hr")
–Each output is associated with a QoS specification (helps to allocate the processing elements along the path)
[Figure: boxes b1–b7 connected via connection points with "Keep 2 hr" persistence specifications and storage S1–S3; the paths form a continuous query, a view, and an ad-hoc query, ending at an application with a QoS spec]

Queries in Aurora
Continuous queries
–The query continuously processes tuples
–Output tuples are delivered to an application
Ad-hoc queries
–The system processes data and delivers the answer from the earliest time stored in the connection point
–The semantics are the same as a continuous query that started execution at t_now − (persistence specification)
–The query continues until explicit termination
Views
–Similar to materialized or partially materialized views in classical DB systems
–An application may connect to the end of this path whenever there is a need

Connection points
–Support dynamic modification of the network
–Support data caching (persistence specification) – helpful for ad-hoc queries
–A connection point without an upstream can be used as a stored data set (as in a classical DBMS)
–Tuples from a connection point can be pushed through the system (e.g. when the connection point is "materialized" and the stored tuples are passed as a stream to the downstream nodes)
–Alternatively, a downstream node can pull the data (helpful in the execution of filtering or joining operations)

Optimization in Aurora – problems
–Many changes in the network over time
–The need to deal with a large number of boxes
–The system operates in a data-flow mode
–Optimization issues address different needs than in a classical DBMS

Optimization of continuous queries
–Optimization is done at run time
–Aurora starts executing an unoptimized network
–Optimization is performed step by step for portions of the network (subnetworks)
–First, hold all input messages for the selected subnetwork – drain it of messages
–Then, optimize the selected subnetwork:
  –Insert projections (get rid of unneeded attributes of tuples as soon as possible)
  –Combine boxes (e.g. projection with filtering)
  –Reorder boxes (e.g. filtering can be pushed down the query tree through a join)
–Finally, stop holding input messages
–The optimizer cycles periodically through all subnetworks (it is a background task)

Optimization of continuous queries – details
Each box has:
–c(b) – execution cost
–s(b) – selectivity: the expected number of output tuples per input tuple
The amount of processing for two successive boxes b_i, b_j (b_i first) is: c(b_i) + c(b_j)·s(b_i)
The boxes are in the right order if: (1 − s(b_j))/c(b_j) < (1 − s(b_i))/c(b_i)
Let's check this condition for the example below (b_i is a join with c=4, s=5; b_j is a filter with c=1, s=0.5):
–(1 − 0.5)/1 < (1 − 5)/4, i.e. 0.5 < −1: FALSE
–The condition is not satisfied – we should swap the order of the boxes
[Figure: streams S(A) and T(B, C) feed Filter(A>2, A<4) and Filter(B>2, B<4); their outputs feed Join(A=B) (b_i: c=4; s=5) followed by Filter(C>0) (b_j: c=1; s=0.5)]
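
A short sketch, using only the cost/selectivity model above, that evaluates the ordering condition for the example's numbers:

```python
def processing_cost(first, second):
    """Per-input-tuple cost of running `first` then `second`:
    c(b_i) + c(b_j) * s(b_i)."""
    (c1, s1), (c2, _) = first, second
    return c1 + c2 * s1

def in_right_order(b_i, b_j):
    """b_i before b_j is correct iff (1 - s(b_j))/c(b_j) < (1 - s(b_i))/c(b_i)."""
    (ci, si), (cj, sj) = b_i, b_j
    return (1 - sj) / cj < (1 - si) / ci

b_i = (4, 5)    # join:   c=4, s=5
b_j = (1, 0.5)  # filter: c=1, s=0.5
print(in_right_order(b_i, b_j))   # False: 0.5 < -1 does not hold -> swap
print(processing_cost(b_i, b_j))  # 4 + 1*5   = 9
print(processing_cost(b_j, b_i))  # 1 + 4*0.5 = 3.0 -> the swapped order is cheaper
```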

Optimization of ad-hoc queries
–Each ad-hoc query is attached to a connection point – it runs on all the historical data stored in that connection point
–The connection point keeps the historical data in a B-tree
–The 'historical part' of the ad-hoc query (the successor(s) of the connection point) is examined first:
  –filter boxes compatible with the B-tree storage key can use indexed lookup
  –joins can use merge-sort or indexed lookup – the cheaper one is chosen
–The rest of the query is optimized like a continuous query

Run-time architecture
[Figure: inputs enter through the Router; the Storage Manager (Buffer Manager, queues Q1…Qn, Persistent Storage) feeds Box Processors chosen by the Scheduler; the QoS Monitor observes the outputs and drives the Load Shedder]

Run-time components
Router
–Routes tuples in the system
–Forwards them either to outputs or to the storage manager
Storage manager
–Responsible for maintaining the box queues and managing the buffer
Scheduler
–Decides which box will be processed next
Box processor
–Executes the appropriate operation
–Forwards the output to the router
QoS monitor
–Observes outputs and activates the load shedder
Load shedder
–Sheds load until performance reaches an acceptable level

QoS
–Optimization is based on the attempt to maximize the perceived QoS of the outputs
–Basically, QoS is a function of:
  –Response times (production of output tuples)
  –Tuple drops
  –Values produced (importance of the produced values)
–The administrator specifies QoS graphs for each output based on one or more of the functions above
–Other types of QoS functions can be defined too
–The administrator defines the headroom for the system (the percentage of computing resources that can be used by Aurora)

QoS graphs
–Graphs are expected to be normalized
–Graphs should allow a properly sized network to operate with all outputs in a 'good zone'
–Graphs should be convex (the value-based graph is an exception)
[Figure: three normalized QoS graphs (values 0 to 1): QoS vs. delay, with a marked good zone; QoS vs. % tuples delivered; QoS vs. output value]
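
For illustration only (Aurora leaves the shape to the administrator), a normalized delay-based QoS graph could be modelled as a piecewise-linear function; the breakpoints below are assumed values:

```python
def delay_qos(delay, good_until=100.0, zero_at=500.0):
    """Normalized QoS: 1.0 inside the good zone, then falling linearly to 0.
    The breakpoints (100 ms, 500 ms) are invented for this sketch."""
    if delay <= good_until:
        return 1.0
    if delay >= zero_at:
        return 0.0
    return (zero_at - delay) / (zero_at - good_until)

def in_good_zone(delay, threshold=1.0):
    return delay_qos(delay) >= threshold

print(delay_qos(50), delay_qos(300))        # 1.0 0.5
print(in_good_zone(50), in_good_zone(300))  # True False
```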

Aurora Storage Manager (ASM) – queue management
–Windowed operations (e.g. aggregations) require a historical collection of tuples
–Tuples may accumulate in various places when the network is saturated
–There is one queue at the output of each box; this queue is shared by all successor boxes (see the sketch after this slide)
–Queues are stored in memory and on disk
–Queues may change length
–The scheduler and the ASM share the scheduling priority and the percentage of each queue kept in main memory
[Figure: queue organization – boxes b1 and b2 read at their own positions from a shared, time-ordered queue; processed tuples are discarded from the tail]
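
A minimal sketch of such a shared queue, under the assumption that each successor box keeps its own read position and tuples are discarded once every successor has consumed them (the class and its fields are illustrative, not the ASM's actual design):

```python
class SharedQueue:
    """One output queue shared by all successor boxes; each successor reads at
    its own position, and fully consumed tuples are trimmed away."""
    def __init__(self, successors):
        self.buf = []                          # tuples still needed by someone
        self.base = 0                          # stream index of buf[0]
        self.pos = {s: 0 for s in successors}  # next stream index per successor

    def enqueue(self, tup):
        self.buf.append(tup)

    def read(self, successor):
        i = self.pos[successor] - self.base
        if i >= len(self.buf):
            return None                        # this successor has caught up
        tup = self.buf[i]
        self.pos[successor] += 1
        self._trim()
        return tup

    def _trim(self):
        done = min(self.pos.values()) - self.base
        if done > 0:                           # drop the fully consumed prefix
            del self.buf[:done]
            self.base += done

q = SharedQueue(["b1", "b2"])
q.enqueue("t0"); q.enqueue("t1")
print(q.read("b1"), q.read("b2"), len(q.buf))  # t0 t0 1 (t0 trimmed, t1 kept)
```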

Aurora Storage Manager (ASM) – connection point management
–If the amount of historical data needed at the CP is less than the maximal window size of the successor boxes, no extra storage is needed
–Historical data is organized in B-trees based on the storage key (default: timestamp)
–Periodically, all tuples that are older than the history requirement are removed from the B-tree
–B-trees are stored in the space allocated by the ASM

Scheduling in Aurora
–The scheduler (and Aurora as a whole) aims to reduce the overall tuple execution cost
–It exploits two nonlinearities in tuple processing:
–Interbox nonlinearity:
  –Minimize tuple thrashing (if buffer space is not sufficient, tuples have to be shuttled between memory and disk)
  –Avoid copying data from output to buffer (a possibility of bypassing the ASM when one box is scheduled right after another)
–Intrabox nonlinearity:
  –The cost of tuple processing may decrease as the number of available tuples in the queue increases (avoiding context switching, better optimization)

Scheduling in Aurora
Aurora's approach: (1) have as many tuples as possible in the queues, (2) process them at once – train scheduling, and (3) pass them to subsequent boxes without going to disk – superbox scheduling
Two goals: (1) minimize the number of I/O operations and (2) minimize the number of box calls per tuple
How does it work? (see the sketch after this slide)
–An output is selected for execution
–The first downstream box with its queue in memory is found
–Then, upstream boxes are considered – as many upstream boxes with non-empty, in-memory queues as possible are found
–The resulting sequence of boxes can be scheduled one after another
–The storage manager is notified to keep all the queues of the selected boxes in memory during the execution
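
A simplified sketch of this selection, assuming each box knows its single upstream neighbour and whether its input queue is non-empty and memory-resident (a real Aurora network is a DAG, so this captures only the core idea):

```python
class Box:
    def __init__(self, name, queue_in_memory=True, queue_empty=False, upstream=None):
        self.name = name
        self.queue_in_memory = queue_in_memory
        self.queue_empty = queue_empty
        self.upstream = upstream   # single upstream box, for simplicity

def superbox(downstream_box):
    """Collect the longest run of boxes ending at downstream_box whose queues
    are non-empty and in memory; the run is executed upstream-first."""
    run = []
    box = downstream_box
    while box is not None and box.queue_in_memory and not box.queue_empty:
        run.append(box)
        box = box.upstream
    run.reverse()   # tuples then flow box to box without touching disk
    return run

b1 = Box("b1")
b2 = Box("b2", upstream=b1)
b3 = Box("b3", upstream=b2)
print([b.name for b in superbox(b3)])  # ['b1', 'b2', 'b3']
```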

Priority assignment in the Scheduler
–The waiting delay of tuples (a part of the latency of each output) is a function of scheduling
–The goal of the scheduler: assign priorities to boxes/outputs so as to maximize the overall QoS
–The scheduler's approach has two aspects:
  –state-based analysis, which assigns priorities to outputs and picks for scheduling the output with the highest utility
  –feedback-based analysis, which observes the overall system and increases the priorities of outputs that are not doing well

Scheduler – execution overhead
[Chart: execution cost and scheduling overhead (in ms) compared for tuple-at-a-time, train, and superbox scheduling]

Prediction of overload situations
Static analysis
–The goal: determine whether the hardware running the network is sized correctly
–Each box has a processing cost c(b) and a selectivity s(b)
–Each input has a tuple production rate r(d)
–The analysis starts from each data source and continues downstream
–The system is stable when: 1/c(b_i) ≥ r(d_i)
–The output rate from b_i is: min(1/c(b_i), r(d_i)) · s(b_i)
–Iterating the steps above gives the output data rate and the computational requirements for each box
–This makes it possible to predict the required computational resources

Prediction of overload situations
Example network (see the verification sketch below): streams S(A, B, C) and T(B, C) arrive at r_s = r_t = 100 t/s; S feeds Filter(A>2, A<4) (b_1: c=0.05s; s=0.1), T feeds Filter(B>2, B<4) (b_2: c=0.05s; s=0.1); both outputs feed Join(A=B) (b_3: c=0.1s; s=5) followed by Filter(C>0) (b_4: c=0.05s; s=0.5)
–b_1: 1/0.05 = 20 t/s ≥ 100 t/s (not true!); output stream: min(1/0.05s, 100 t/s) · 0.1 = 2 t/s
–b_2: 1/0.05 = 20 t/s ≥ 100 t/s (not true!); output stream: min(1/0.05s, 100 t/s) · 0.1 = 2 t/s
–b_3: 1/0.1 = 10 t/s ≥ (2 + 2) t/s (true); output stream: min(1/0.1s, 4 t/s) · 5 = 20 t/s
–b_4: 1/0.05 = 20 t/s ≥ 20 t/s (true); output stream: min(1/0.05s, 20 t/s) · 0.5 = 10 t/s
Needed computation: 100 t/s + 100 t/s + 2 t/s + 2 t/s + 20 t/s + 10 t/s = 234 t/s
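
The slide's arithmetic can be checked with a few lines of Python; the function below is only a sketch of the static analysis under the cost/selectivity/rate model stated above:

```python
def box_output(c, sel, r):
    """Box with cost c (seconds/tuple), selectivity sel, input rate r (t/s):
    stable iff 1/c >= r; output rate is min(1/c, r) * sel."""
    return min(1.0 / c, r) * sel, (1.0 / c) >= r

r_s = r_t = 100.0                               # source rates in tuples/s
out1, ok1 = box_output(0.05, 0.1, r_s)          # b1: filter on S
out2, ok2 = box_output(0.05, 0.1, r_t)          # b2: filter on T
out3, ok3 = box_output(0.10, 5.0, out1 + out2)  # b3: join
out4, ok4 = box_output(0.05, 0.5, out3)         # b4: filter
print(out1, ok1)  # 2.0 False (1/0.05 = 20 t/s < 100 t/s: overloaded)
print(out3, ok3)  # 20.0 True
print(out4, ok4)  # 10.0 True
print(r_s + r_t + out1 + out2 + out3 + out4)    # 234.0 tuples/s in total
```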

Prediction of overload situations
Run-time analysis
–Helps to deal with input-rate spikes
–Uses delay-based QoS information
–If many tuples are outside the 'good zone', overload is likely

Load shedding
–A reaction to overload
–The load-shedding process relies on QoS information
Load shedding by dropping tuples (see the sketch after this slide)
–Drop is a system-level operator that drops randomly chosen tuples from a stream at a specified rate
–The drop box is located as far upstream as possible
–As a result of static analysis:
  –tuples are dropped on network branches that terminate in more loss-tolerant outputs
  –algorithm: (1) choose the output with the smallest negative slope in its tuple-drops QoS graph, (2) move horizontally along this curve until some other output has a smaller negative slope at that point, (3) this horizontal difference is an indication of the output's tuple drop rate
–As a result of dynamic analysis:
  –a similar algorithm as above; delay-based graphs can be used
  –tuples are dropped on branches that terminate in higher-priority outputs (otherwise it would be ineffective)
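
A toy sketch of step (1) of the static heuristic above: among all outputs, shed first at the one whose loss-tolerance QoS curve currently falls most gently (the output names and slope values are invented for the illustration):

```python
def pick_victim(slopes):
    """slopes: output name -> |dQoS/d(drop %)| at the current operating point.
    Shed first where QoS degrades most slowly per dropped tuple."""
    return min(slopes, key=slopes.get)

slopes = {"alerts": 0.9, "dashboard": 0.2, "archive": 0.05}
print(pick_victim(slopes))  # 'archive' -> place a drop box far upstream of it
```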

Load shedding
Load shedding by filtering tuples (see the sketch after this slide)
–Idea: remove less important tuples rather than randomly chosen ones
–Uses value-based QoS information
–A histogram is prepared containing the frequency with which value ranges have been observed
–The utility of each interval can then be calculated (frequency multiplied by the value of the value-based QoS function)
–Backward interval propagation: Aurora picks the interval with the lowest utility and prepares a predicate for it that is used in a filter box
–Forward interval propagation: estimation of a proper filter predicate, checked by trial and error
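
A sketch of the histogram-based utility computation behind backward interval propagation, assuming the histogram and the value-based QoS are given as per-interval tables (all numbers invented):

```python
frequency = {(0, 10): 0.2, (10, 20): 0.3, (20, 30): 0.5}  # observed share
value_qos = {(0, 10): 1.0, (10, 20): 0.6, (20, 30): 0.1}  # QoS per interval

# Utility of an interval = frequency * value-based QoS for that interval.
utility = {iv: frequency[iv] * value_qos[iv] for iv in frequency}
low, high = min(utility, key=utility.get)   # least useful interval

def keep(value):
    """Predicate for the shedding filter box: drop the chosen interval."""
    return not (low <= value < high)

print((low, high))        # (20, 30): lowest utility, so it is filtered out
print(keep(5), keep(25))  # True False
```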