Staged-DB IC-65 Advances in Data Management Systems 1 Scheduling in Staged- DB Systems Nicolas Bonvin, Rammohan Narendula, and Surender Reddy Yerva.

Presentation transcript:

Scheduling in Staged-DB Systems
Nicolas Bonvin, Rammohan Narendula, and Surender Reddy Yerva

Organization
– What is Staged-DB?
– Scheduling in Staged-DB
– Our Contribution
  – Scheduling in Execution Phase
  – System Modeling
– System Design Details
– Performance Study
– Future Work

Motivation
Response time: time needed to produce the first page of output.
Big advantage for the overlapping case ('1').

Query Lifetime in DBMS
Query → PARSER → (query tree) → OPTIMIZER → (query plan) → EXECUTION → Answer
[Diagram labels: data catalogs and statistics; operators]
EXECUTION (disk I/O): 90% of time

DB Paradigm So Far
– Query → Query Execution Plan (tree of operators)
– Multiple queries
  – Each query is handled by a different thread
– No cross-communication or sharing across threads → the sharing opportunity is missed
[Diagram: DBMS thread pool with no coordination; boxes labeled D and C; one query, multiple operators]

Staged-DB Paradigm
– The DB is remodeled as a set of stages
– Stage
  – “Common execution logic” is grouped into a stage
  – Each operator in the QEP can be seen as a stage
– A query passes through all the stages it needs to produce an output
– Common data needs → detected by the stage
[Diagram: DBMS thread pool with boxes labeled D and C; StagedDB: one operator, multiple queries]

Staged Database Systems
– DB → Stages; Execution Stage → microEngine
– Each stage has a queue; each microEngine also has a request queue
– High concurrency → locality across requests
[Diagram: conventional DBMS with queries, compared with StagedDB with queries passing through Stage 1, Stage 2, and Stage 3]

Scheduling in Staged-DB
– Scheduling at different levels
  – Stages (Parser, Optimizer, Execution)
  – Across microEngines (the Execution Engine has SCAN, JOIN, etc. microEngines)
  – Within a microEngine
– We consider only scheduling “across microEngines”
– Scheduling policies:
  – Round-Robin
  – Heavy Load First
  – Light Load First
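The three policies named on this slide can be sketched as follows. This is only an illustration, not the authors' implementation: it assumes that "load" means the number of packets waiting in a microEngine's input queue, and the function names and the `queue_lengths` dictionary are hypothetical.

```python
from typing import Dict, List, Optional


def round_robin(order: List[str], last: Optional[str],
                queue_lengths: Dict[str, int]) -> Optional[str]:
    """Pick the next non-empty engine after `last` in a fixed cyclic order."""
    start = (order.index(last) + 1) if last in order else 0
    for i in range(len(order)):
        name = order[(start + i) % len(order)]
        if queue_lengths.get(name, 0) > 0:
            return name
    return None  # every engine is idle


def heavy_load_first(queue_lengths: Dict[str, int]) -> Optional[str]:
    """Pick the non-empty engine with the longest input queue (assumed load measure)."""
    busy = {name: n for name, n in queue_lengths.items() if n > 0}
    return max(busy, key=busy.get) if busy else None


def light_load_first(queue_lengths: Dict[str, int]) -> Optional[str]:
    """Pick the non-empty engine with the shortest input queue (assumed load measure)."""
    busy = {name: n for name, n in queue_lengths.items() if n > 0}
    return min(busy, key=busy.get) if busy else None
```

All three functions skip engines with empty queues, so the scheduler never selects an engine that has nothing to do; that choice is also an assumption of this sketch.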

Detailed System Design
– Based on the discrete-event simulation technique
– All computation, data needs, and dependencies are modeled using events
– System components
  – Global System Queue
  – Dispatcher
  – Operator (or μEngine)
  – Global Scheduler
  – Main Memory
  – Overlap Detector

[Diagram labels: Query Arrival, Dispatcher, Scheduler, Disk-Fetch, Engine Insert, Engine Exec-Begin, Engine Exec-End, Memory, Global System Queue]
Event fields: eventId, componentId, functionId, firingTime, packet
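A minimal sketch of how a global queue of such events might drive a discrete-event simulation. The five field names come from the slide; everything else is an assumption made for illustration: ordering events by firingTime in a heap, and the `GlobalSystemQueue`, `schedule`, and `run` names.

```python
import heapq
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass(order=True)
class Event:
    # Events are ordered by firingTime, with eventId as a tie-breaker.
    firingTime: float
    eventId: int
    componentId: str = field(compare=False)
    functionId: str = field(compare=False)
    packet: Any = field(compare=False, default=None)


class GlobalSystemQueue:
    """Hypothetical event queue: pops events in firingTime order."""

    def __init__(self) -> None:
        self._heap: List[Event] = []
        self._next_id = 0
        self.now = 0.0  # simulated clock

    def schedule(self, firing_time: float, component_id: str,
                 function_id: str, packet: Any = None) -> Event:
        event = Event(firing_time, self._next_id, component_id, function_id, packet)
        self._next_id += 1
        heapq.heappush(self._heap, event)
        return event

    def run(self, handlers: Dict[Tuple[str, str], Callable]) -> List[Event]:
        """Fire every event; a handler may schedule further events."""
        fired = []
        while self._heap:
            event = heapq.heappop(self._heap)
            self.now = event.firingTime
            handler = handlers.get((event.componentId, event.functionId))
            if handler is not None:
                handler(self, event)
            fired.append(event)
        return fired
```

Handlers are looked up by the (componentId, functionId) pair, which is one plausible reading of why an event carries both fields.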

μEngine
– Events: Engine Insert, Engine Execution Begin, Engine Execution End
– Input Packet Queue
– Packet format: queryId list, queryPlans, pageId, contextInfo
– Processing steps:
  – Request packet arrives from the parent node or the dispatcher
  – Call the overlap detector
  – Insert the packet
  – Pick a packet from the queue
  – Send the packet to a child, OR execute and produce output
  – Insert an event into the event queue for the scheduler
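The packet format and the insert/pick steps can be sketched as below. The field names follow the slide; the rest is assumed for illustration only. In particular, the sketch treats two packets as overlapping when they request the same pageId and merges them into one, and it ignores contextInfo when merging. The `MicroEngine` class and its method names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Deque, List, Optional


@dataclass
class Packet:
    queryIds: List[int]          # "queryId list" on the slide
    queryPlans: List[Any]
    pageId: int
    contextInfo: dict = field(default_factory=dict)


class MicroEngine:
    """Hypothetical microEngine with a FIFO input packet queue."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.input_queue: Deque[Packet] = deque()

    def insert(self, packet: Packet) -> bool:
        """Queue a packet; return True if it was merged into a queued one."""
        for queued in self.input_queue:
            # Assumed overlap rule: same page requested by another query.
            if queued.pageId == packet.pageId:
                queued.queryIds.extend(packet.queryIds)
                queued.queryPlans.extend(packet.queryPlans)
                return True
        self.input_queue.append(packet)
        return False

    def pick(self) -> Optional[Packet]:
        """Take the first packet in the queue, or None if the queue is empty."""
        return self.input_queue.popleft() if self.input_queue else None
```

With this rule, one execution of a merged packet serves every query in its queryIds list, which is the kind of sharing the staged design aims for.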

μEngines
– Join
– Sort
– Aggregation
– Scan
– Wait and Scan
– Index Scan

Overlap detection
– With memory
– With input queue
– Two types
  – Linear
  – Spike

Memory Manager
– Pinning and unpinning
– Put()
– pageExists()
– consumePage()
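The slide gives only the operation names, so the following sketch guesses at their semantics. It assumes that put() stores a page and pins it once per expected consumer, that consumePage() returns the page and removes one pin, and that a page is dropped when its pin count reaches zero. The `consumers` parameter and the `peak_pages` counter are hypothetical additions.

```python
from typing import Any, Dict


class MemoryManager:
    """Hypothetical main-memory model with pin counts per page."""

    def __init__(self) -> None:
        self._pages: Dict[int, Any] = {}
        self._pins: Dict[int, int] = {}
        self.peak_pages = 0  # largest number of pages held at once

    def put(self, page_id: int, data: Any, consumers: int = 1) -> None:
        """Store a page and pin it once per expected consumer (assumed)."""
        self._pages[page_id] = data
        self._pins[page_id] = self._pins.get(page_id, 0) + consumers
        self.peak_pages = max(self.peak_pages, len(self._pages))

    def pageExists(self, page_id: int) -> bool:
        return page_id in self._pages

    def consumePage(self, page_id: int) -> Any:
        """Return the page and unpin it once; drop it when no pins remain."""
        data = self._pages[page_id]
        self._pins[page_id] -= 1
        if self._pins[page_id] == 0:
            del self._pages[page_id]
            del self._pins[page_id]
        return data
```

Under these assumptions, a page shared by several overlapping queries stays in memory until the last of them has consumed it.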

Performance study
– 5 queries
– 5 runs
– Uniform arrival rate

Effect of Overlapping
Response time: time needed to produce the first page of output.
Big advantage for the overlapping case ('1').

Effect of Overlapping
Memory consumption: maximum number of pages consumed in memory during the lifetime of the query.
Higher memory consumption with overlapping!

Effect of Overlapping
Throughput: number of queries completed per unit of time.
Clear advantage with overlap detection!

Comparing scheduling policies
Mean response time: Round Robin seems to perform a little better.

Comparing scheduling policies
Memory consumption: no differences!

Future Work
– A few more interesting global scheduling policies are possible.
– The system does not yet have a local scheduling policy for choosing which packet in the input packet queue to process next. At the moment it picks the first packet in the queue.
– Regarding implementation, experiments should be run with more μEngines and with benchmark-style input queries.