Parallel Query Scheduling (Fall 2008)


Slide 1: Parallel Query Scheduling

Slide 2: Query Processing
Queries submitted to the system are queued up and processed in two steps:
• Compile time: each query is translated into a query tree, which specifies the optimized order for executing the necessary database operators.
• Run time: the operators are scheduled to execute on the PNs so as to maximize system throughput while ensuring good response times.

Slide 3: Competition-Based Scheduling
• In competition-based (CB) scheduling, a set of coordinator processes is pre-created and allocated to the queries by a dispatcher process on a First-Come-First-Served (FCFS) basis.
• A coordinator is responsible for scheduling the operators in the corresponding query tree.
• A coordinator also competes for the operator servers on behalf of its operators.
• When a coordinator has acquired all the operator servers needed for an operator, it coordinates these servers to execute the operation in parallel.
• Partial execution of unary operations (e.g., select) is also allowed.
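The two mechanisms on this slide, FCFS allocation of pre-created coordinators and all-or-nothing acquisition of operator servers, can be sketched roughly as follows. The function names `dispatch_fcfs` and `try_acquire` and the string and integer identifiers are assumptions made for illustration. The slides give no implementation, and real coordinators would run concurrently rather than in a single thread.

```python
from collections import deque


def dispatch_fcfs(query_queue, idle_coordinators):
    """Pair queued queries with idle coordinators in arrival order (FCFS).

    Both arguments are deques and are consumed in place. Queries that find
    no idle coordinator stay in the queue.
    """
    assignments = []
    while query_queue and idle_coordinators:
        coordinator = idle_coordinators.popleft()
        query = query_queue.popleft()
        assignments.append((coordinator, query))
    return assignments


def try_acquire(needed, free_servers):
    """Acquire every server in `needed`, or none of them.

    `free_servers` is a set that is updated in place on success.
    """
    if needed <= free_servers:          # all required servers are free
        free_servers.difference_update(needed)
        return True
    return False                        # the coordinator must retry later
```

A coordinator that fails `try_acquire` knows nothing about the other queries, so in this sketch it can only retry.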

Slide 4: CB Scheduling [figure]

Slide 5: (Dis)Advantages of CB Scheduling
• Advantages:
  • An obvious advantage is its simplicity.
  • Each query is given the same opportunity to compete for the computing resources.
• Disadvantages:
  • The system resources can be under-utilized because the coordinators are not aware of the presence of other queries.
  • It cannot take advantage of techniques such as Best Fit, which can maximize system utilization.

Slide 6: Planning-Based Scheduling
• In planning-based (PB) scheduling, all active queries share a single scheduler.
• Because the scheduler knows the resource requirements of all the active queries, it can schedule the operators based on how well their requirements match the current state of the computing system.
• The requirement of an operator is defined as the set of operator servers needed for its execution.
• The scheduler considers only the queued queries within a fixed-size window, called the scheduling window.
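As an illustration of the scheduling window, the sketch below collects the leaf operators of the first few queued query trees into a ready list. The `Node` class, the `ready_list` function, and the representation of a requirement as a `frozenset` of server identifiers are hypothetical choices made for this example. They are not taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One operator in a query tree (hypothetical representation)."""
    name: str
    servers: frozenset                  # requirement: operator servers needed
    children: list = field(default_factory=list)


def leaves(node):
    """Return the leaf operators of a query tree, left to right."""
    if not node.children:
        return [node]
    found = []
    for child in node.children:
        found.extend(leaves(child))
    return found


def ready_list(query_queue, window_size):
    """Collect leaf operators of the queries inside the scheduling window.

    Only the first `window_size` queued query trees are considered.
    """
    ready = []
    for tree in query_queue[:window_size]:
        ready.extend(leaves(tree))
    return ready
```

Queries beyond the window are ignored until the window moves, which bounds the work the scheduler does in each round.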

Slide 7: PB Scheduling [figure]

Slide 8: PB with Largest-Fit-First (PB-LF)
Step 1: Determine the requirement of each leaf node in the scheduling window, and insert these operators into a ready list.
Step 2: Sort the operators in the ready list in descending order of the size of their requirements.
Step 3: Examine the operators in the ready list in sorted order until one whose requirement can be met is found.
Step 4: Create a coordinator process to coordinate the parallel execution of that operator.
Step 5: Repeat Steps 3 and 4 until no such operator is found.
Step 6: Execute the operators found.
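One possible reading of Steps 2 to 5 is sketched below. The `Operator` class and the `pb_lf_round` function are hypothetical names, and "requirement can be met" is assumed to mean that every needed operator server is currently free. Because the free set only shrinks during a round, a single pass over the sorted list gives the same result as repeatedly rescanning it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    name: str
    servers: frozenset      # requirement: operator servers needed


def pb_lf_round(ready, free_servers):
    """Pick the operators to start in one PB-LF scheduling round.

    `ready` is the ready list; `free_servers` is the set of idle servers.
    Returns the chosen operators, largest requirement first.
    """
    free = set(free_servers)            # work on a copy
    # Step 2: descending order of requirement size (stable for ties).
    ordered = sorted(ready, key=lambda op: len(op.servers), reverse=True)
    chosen = []
    for op in ordered:                  # Steps 3 and 5
        if op.servers <= free:          # requirement can be met
            chosen.append(op)           # Step 4 would create a coordinator
            free -= op.servers          # these servers are now reserved
    return chosen
```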

Slide 9: PB with First-Fit-First (PB-FF)
• A potential drawback of PB-LF is that operators with smaller requirements risk longer waiting times.
Step 1: Determine the requirement of each leaf node in the scheduling window, and insert these operators into a ready list.
Step 2: Examine the operators in the ready list in arrival order until one whose requirement can be met is found.
Step 3: Create a coordinator process to coordinate the parallel execution of that operator.
Step 4: Repeat Steps 2 and 3 until no such operator is found.
Step 5: Execute the operators found.
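A minimal sketch of one PB-FF round follows, under the same assumption as before that a requirement is met when all needed servers are free. The function name `pb_ff_round` and the `(name, required_servers)` pair representation are hypothetical. The only difference from largest-fit-first is that the ready list is scanned in arrival order, with no sorting.

```python
def pb_ff_round(ready, free_servers):
    """Pick the operators to start in one PB-FF scheduling round.

    `ready` is a list of (name, required_servers) pairs in arrival order,
    where required_servers is a set. Returns the names of chosen operators.
    """
    free = set(free_servers)            # work on a copy
    started = []
    for name, required in ready:        # arrival order, no sorting
        if required <= free:            # requirement can be met
            started.append(name)
            free -= required            # reserve the servers
    return started


# Hypothetical ready list: A arrived first, then B, then C.
ready = [("A", {1, 2}), ("B", {1, 2, 3, 4}), ("C", {5})]
print(pb_ff_round(ready, {1, 2, 3, 4, 5, 6}))
```

With this input, the small operator A is started before the larger B, which then no longer fits. This is the behavior PB-FF trades for shorter waits on small requirements.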

Slide 10: Test Query Tree Structures [figure]

Slide 11: Effect of Query Arrival Rate [figure]

Slide 12: Effect of Number of PNs [figure]

Slide 13: CR-Property
• CR-property: Consecutive Retrieval property.
• The CR-property is mostly used for data allocation in database systems.
• The basic application is to arrange all records relevant to a query in consecutive storage locations on a linear storage medium, to minimize the access time for the query.
• Another approach uses the CR-property as a file allocation scheme that distributes an arbitrary well-constructed file across multiple disks to speed up parallel data access.
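The basic application can be made concrete with a small check: a storage order has the CR-property for a set of queries if, for every query, its relevant records occupy consecutive positions. The function name `has_cr_property` and the query-to-record relevance data below are hypothetical and chosen only for illustration. They are not the values from the lecture's example.

```python
def has_cr_property(order, queries):
    """Return True if every query's relevant records are stored consecutively.

    `order` is the list of record names in storage order; `queries` maps a
    query name to the set of records relevant to it.
    """
    position = {record: i for i, record in enumerate(order)}
    for relevant in queries.values():
        if not relevant:
            continue                    # an empty query is trivially fine
        slots = sorted(position[r] for r in relevant)
        # Consecutive means the span of positions equals the record count.
        if slots[-1] - slots[0] + 1 != len(slots):
            return False
    return True


# Hypothetical relevance sets for three queries over six records.
queries = {
    "Q1": {"R1", "R3"},
    "Q2": {"R2", "R3", "R5"},
    "Q3": {"R4", "R6"},
}
print(has_cr_property(["R1", "R2", "R3", "R4", "R5", "R6"], queries))
print(has_cr_property(["R1", "R3", "R5", "R2", "R4", "R6"], queries))
```

With these made-up relevance sets, the original order fails the check because Q1's records R1 and R3 are separated by R2, while the reordered layout satisfies it.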

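Slide 14: CR-Property Example
[Figure: two record-by-query tables with columns Q1, Q2, Q3; the cell values were not captured in the transcript.]
• Query requirements: records listed in the order R1, R2, R3, R4, R5, R6 across Page1 and Page2.
• Data allocation with CR-property: records listed in the order R1, R3, R5, R2, R4, R6 across Page1 and Page2.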

Slide 15: Parallel Query Scheduling
[Figure: a query queue holding Q1 through Q7, with the PNs needed by each query marked against the PNs of the system.]

Slide 16: CRP Scheduling with Smallest First
[Figure: queries arranged in the order Q4, Q6, Q1, Q7, Q2, Q5, Q3 against the PNs, with levels 1, 3, 5, 6, 7; the remaining cell values were not captured in the transcript.]
Q4, Q1, and Q5 will be scheduled first.

Slide 17: CRP Scheduling with Largest First
[Figure: queries arranged in the order Q6, Q4, Q7, Q1, Q2, Q5, Q3 against the PNs, with levels 1, 3, 5, 6, 7; the remaining cell values were not captured in the transcript.]
Q6, Q2, and Q3 will be scheduled first.

Slide 18: Effect of No. of PNs [figure]