Static Process Scheduling

Presented by Yi Sun

Overview
- Before execution, processes must be scheduled and allocated resources.
- Objective: enhance overall system performance metrics, chiefly process completion time and processor utilization.
- Distributed systems add location and performance transparency as goals.
- Scheduling in a distributed system combines local scheduling on each node with global scheduling (the allocation of processes to nodes).
- Complicating factors: communication overhead, the effect of the underlying architecture, and the dynamic behavior of the system.

Process Interaction Models
- Precedence process model: a directed acyclic graph (DAG) represents the precedence relationships between processes. Goal: minimize the total completion time of the task (computation + communication).
- Communication process model: represents the need for communication between processes. Goal: optimize the total cost of computation and communication.
- Disjoint process model: processes run independently and complete in finite time. Goal: maximize processor utilization and minimize process turnaround time.
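The models differ mainly in what the graph encodes. As a minimal sketch (Python; the task names and weights are invented for illustration), the first two models might be represented like this:

```python
# Precedence process model: a DAG. Each task has an execution time, and
# each directed edge carries the message units passed to the successor.
exec_time = {"A": 2, "B": 3, "C": 1, "D": 4}
dag_edges = {("A", "B"): 2,   # A must finish before B; 2 message units
             ("A", "C"): 1,
             ("B", "D"): 3,
             ("C", "D"): 1}

# Communication process model: an undirected graph. Edge weights give the
# amount of interaction between two processes; no ordering is implied.
comm_edges = {frozenset(("A", "B")): 4,
              frozenset(("B", "C")): 2}

# Disjoint process model: no edges at all; processes are independent.
disjoint_processes = ["A", "B", "C", "D"]
```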

Process Models (figure): partitioning four processes onto two nodes and the resulting communication overhead.

System Performance Model
Attempts to minimize the total completion time (makespan) of a set of interacting processes.

System Performance Model (Cont.)
Related parameters:
- OSPT: optimal sequential processing time; the best time achievable on a single processor using the best sequential algorithm.
- CPT: concurrent processing time; the time actually achieved on an n-processor system with the concurrent algorithm and the specific scheduling method under consideration.
- OCPT_ideal: optimal concurrent processing time on an ideal system; the best time achievable with the concurrent algorithm on an ideal n-processor system (no interprocessor communication overhead) under an optimal scheduling policy.
- S_i: the ideal speedup of a multiprocessor system over the best sequential time.
- S_d: the degradation of the system due to the actual implementation, compared with an ideal system.

System Performance Model (Cont.)
P_i: the computation time of the concurrent algorithm on node i.
(Figure: Gantt charts of tasks P1-P4 on the processors, illustrating RP >= 1 and OCPT_ideal.)

System Performance Model (Cont.)
The measures combine into the overall speedup:
S = S_i × S_d = (RC × n / RP) × (1 / (1 + ρ)) = OSPT / CPT
where S_i = OSPT / OCPT_ideal is the ideal speedup, S_d = OCPT_ideal / CPT is the degradation, RP is the relative processing (the smaller, the better), RC is the relative concurrency (the larger, the better), and ρ is the efficiency loss defined below.

System Performance Model (Cont.)
- RP: relative processing. Measures how much speedup is lost by substituting the best sequential algorithm with an algorithm better adapted for concurrent implementation but with a greater total processing need; i.e., the loss of parallelism due to algorithm conversion (an increase in the total computation requirement).
- S_d: the degradation of parallelism due to the algorithm's implementation.
- RC: relative concurrency. Measures how far from optimal the usage of the n processors is; RC = 1 indicates the best possible use of the processors.
- RP and RC capture the theoretical loss of parallelism; S_d captures the additional loss when the algorithm runs on a real machine (system architecture + scheduling).
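To make the relationships among these quantities concrete, here is a small sketch computing them with the formulas above; the task times, OSPT, OCPT_ideal, and CPT are invented numbers, not values from the text:

```python
# Illustrative numbers only: four tasks of a concurrent algorithm
# scheduled on n = 2 processors.
P = [4, 3, 3, 2]           # computation time of each task
n = 2
OSPT = 10                  # best time of the best sequential algorithm
OCPT_ideal = 6             # best time on an ideal 2-processor system
CPT = 8                    # time actually achieved

RP = sum(P) / OSPT                      # relative processing (smaller is better)
RC = sum(P) / (n * OCPT_ideal)          # relative concurrency (1.0 is ideal)
rho = (CPT - OCPT_ideal) / OCPT_ideal   # efficiency loss
S_i = RC * n / RP                       # ideal speedup = OSPT / OCPT_ideal
S_d = 1 / (1 + rho)                     # degradation  = OCPT_ideal / CPT
S = S_i * S_d                           # overall speedup = OSPT / CPT

print(f"RP={RP:.2f} RC={RC:.2f} rho={rho:.2f} S_i={S_i:.2f} S_d={S_d:.2f} S={S:.2f}")
```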

Efficiency Loss ρ
ρ = (CPT - OCPT_ideal) / OCPT_ideal, so that S_d = 1 / (1 + ρ). Impact factors: scheduling, the system, and communication.

Efficiency Loss ρ (Cont.)
The loss can be attributed to its impact factors, ρ = ρ_sched + ρ_syst + ρ_comm, separating the degradation due to the scheduling policy, the system architecture, and the communication overhead.

Workload Distribution
- Performance can be further improved by workload distribution.
- Load sharing: static workload distribution; dispatch processes to idle processors upon arrival. Corresponds to the processor-pool model.
- Load balancing: dynamic workload distribution; migrate processes from heavily loaded processors to lightly loaded ones. Corresponds to the migration workstation model.
- Modeled with queueing theory as X/Y/c: X is the process arrival-time distribution, Y the service-time distribution, and c the number of servers. λ is the arrival rate and μ the service rate; the migration rate depends on channel bandwidth, the migration protocol, and the context and state information of the process being transferred. A worked comparison follows below.
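The effect of pooling is easy to quantify when migration and communication costs are ignored. The sketch below uses the standard Erlang C formula for an M/M/c queue to compare two independent M/M/1 workstations against a pooled M/M/2 processor pool; the arrival and service rates are invented:

```python
from math import factorial

def mmc_response_time(lam, mu, c):
    """Mean response time of an M/M/c queue, via the Erlang C formula."""
    rho = lam / (c * mu)                  # server utilization
    assert rho < 1, "queue must be stable"
    a = lam / mu                          # offered load in Erlangs
    top = a**c / factorial(c)
    bottom = (1 - rho) * sum(a**k / factorial(k) for k in range(c)) + top
    p_wait = top / bottom                 # P(an arrival has to queue)
    wq = p_wait / (c * mu - lam)          # mean time spent waiting
    return wq + 1 / mu                    # waiting + service

lam, mu = 0.8, 1.0                        # per-workstation rates (invented)
print("separate M/M/1 queues:", mmc_response_time(lam, mu, 1))      # 5.0
print("pooled   M/M/2 queue :", mmc_response_time(2 * lam, mu, 2))  # ~2.78
```

At the same utilization the pooled configuration responds almost twice as fast, which is the argument for the processor-pool model; adding a migration rate and communication overhead, as the next two slides do, shrinks that advantage.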

Processor-Pool and Workstation Queueing Models (figure): queueing diagrams for static load sharing and dynamic load balancing; M denotes a Markovian (exponential) distribution.

Comparison of Performance for Workload Sharing (figure): response times with communication overhead vs. with negligible communication overhead.

Static Process Scheduling
- Static process scheduling uses a deterministic scheduling policy: schedule a set of partially ordered tasks on a non-preemptive multiprocessor system of identical processors to minimize the overall finishing time (makespan).
- Optimizing the makespan is NP-complete, so approximate or heuristic algorithms are needed; they attempt to balance and overlap computation and communication.
- The mapping of processes to processors is determined before execution; once a process starts, it stays on its processor until completion.
- Requires prior knowledge of process behavior: execution times, precedence relationships, and communication patterns.
- The scheduling decision is centralized and non-adaptive.

Precedence Process and Communication System Models (figure). Node labels give execution times; edge labels give the number of messages to communicate. Example reading: the communication overhead between A (on P1) and E (on P3) is the number of messages times the per-message overhead, 4 × 2 = 8.

Precedence Process Model
- Scheduling under the precedence process model is NP-complete. A program is represented by a DAG (Figure 5.5(a)): each node is a task with a known execution time, and each edge weight gives the message units to be transferred. The communication system model is shown in Figure 5.5(b).
- Scheduling strategies:
  - List Scheduling (LS): no processor remains idle if there is some available task it could process; communication overhead is ignored.
  - Extended List Scheduling (ELS): apply LS first, then account for communication overhead.
  - Earliest Task First (ETF): the schedulable task with the earliest possible start time is scheduled first.
- Critical path: the longest execution path through the DAG. It is a lower bound on the makespan, so a good heuristic is to map all tasks on a critical path onto a single processor.
A simple list-scheduling sketch follows below.
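The three strategies share the same greedy core and differ in how they treat communication. Below is a minimal, ETF-flavored sketch in Python that ignores communication costs (so it behaves like plain LS); the DAG, execution times, and processor count are invented for illustration:

```python
def list_schedule(exec_time, deps, n_procs):
    """Greedy list scheduling: repeatedly start the ready task that can
    begin earliest on the earliest-free processor (ETF-flavored).
    Communication overhead is ignored, as in the basic LS strategy."""
    succs = {t: [] for t in exec_time}
    indeg = {t: 0 for t in exec_time}
    for before, after in deps:
        succs[before].append(after)
        indeg[after] += 1
    earliest = {t: 0.0 for t in exec_time}   # latest finish among predecessors
    ready = {t for t in exec_time if indeg[t] == 0}
    free_at = [0.0] * n_procs                 # when each processor frees up
    finish = {}
    while ready:
        p = min(range(n_procs), key=free_at.__getitem__)
        task = min(ready, key=lambda t: max(free_at[p], earliest[t]))
        ready.remove(task)
        start = max(free_at[p], earliest[task])
        finish[task] = start + exec_time[task]
        free_at[p] = finish[task]
        for s in succs[task]:                 # release schedulable successors
            earliest[s] = max(earliest[s], finish[task])
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.add(s)
    return max(finish.values())               # the makespan

# Invented example: a diamond-shaped DAG on two processors.
exec_time = {"A": 2, "B": 3, "C": 2, "D": 1}
deps = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
print("makespan:", list_schedule(exec_time, deps, 2))  # 6 = critical path A-B-D
```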

Makespan Calculation for LS, ELS, and ETF

Communication Process Model
- Goal: maximize resource utilization and minimize inter-process communication.
- The program is an undirected graph G = (V, E): V is the set of processes, and each edge weight gives the amount of interaction between two processes.
- Cost equation: total cost = Σ e(j, p) + Σ C(i, j), where e(j, p) is the cost of executing process j on processor p, and C(i, j) is the communication cost between processes i and j (C(i, j) = 0 when both are assigned to the same processor).
- Again, finding the optimal assignment is NP-complete.
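A minimal sketch of this cost equation (Python; the processes, processors, and cost tables are invented):

```python
# exec_cost[j][p]: cost of running process j on processor p.
exec_cost = {"X": {0: 5, 1: 2}, "Y": {0: 3, 1: 4}, "Z": {0: 4, 1: 4}}
# comm_cost[{j,k}]: interaction weight between processes j and k.
comm_cost = {frozenset("XY"): 6, frozenset("YZ"): 2}

def total_cost(assign):
    """Execution cost of each process on its processor, plus the
    communication cost of every edge whose endpoints are split
    across processors (co-located processes communicate for free)."""
    run = sum(exec_cost[j][assign[j]] for j in assign)
    talk = sum(w for pair, w in comm_cost.items()
               if len({assign[j] for j in pair}) > 1)
    return run + talk

print(total_cost({"X": 0, "Y": 0, "Z": 1}))   # one possible assignment: 14
```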

Stone’s two-processor model to achieve minimum total execution and communication cost Example: Figure 5.7 (Don’t consider execution cost) Partition the graph by drawing a line cutting through some edges Result in two disjoint graphs, one for each process Set of removed edges  cut set Cost of cut set  sum of weights of the edges Total inter-process communication cost between processors Of course, the cost of cut sets is 0 if all processes are assigned to the same node Computation constraints (no more k, distribute evenly…) Example: Figure 5.8 (Consider execution cost) Maximum flow and minimum cut in a commodity-flow network Find the maximum flow from source to destination

Computation Cost and Communication Graphs

Minimum-Cost Cut (figure): only the cuts that separate processors A and B are feasible.

Discussion: Static Process Scheduling
- Once a process is assigned to a processor, it remains there until its execution has completed.
- Prior knowledge of execution times and communication behavior is required, which is often unrealistic in practice.

Reference
Randy Chow and Theodore Johnson, Distributed Operating Systems & Algorithms, 1997.