1 Distributed Scheduling in Sombrero, a Single Address Space Distributed Operating System. Milind Patil


2 Contents
Distributed Scheduling
Features of Sombrero
Goals
Related Work
Platform for Distributed Scheduling
Distributed Scheduling Algorithm (Simulation)
Scaling of the Algorithm (Simulation)
Initiation of Porting to Sombrero Prototype
Testing
Conclusion
Future Work

3 Distributed Scheduling A distributed scheduling algorithm provides for sharing and better utilization of resources across the system. It allows threads in the distributed system to be scheduled among the different processors so that CPU usage is balanced.

4 Features of Sombrero Distributed scheduling in Sombrero takes advantage of the distributed SASOS features: The shared memory inherent to a distributed SASOS provides an excellent mechanism for distributing load information about the nodes in the system (information policy). The ability of threads to migrate in a simple manner across machines has a potentially far-reaching effect on the performance of the distributed scheduling mechanism.

5 Features of Sombrero (contd.) The granularity of migration is a thread, not a process. This allows the distributed scheduling algorithm to have a flexible selection policy (which determines the thread to be transferred to achieve load balancing). This feature also reduces the software complexity of the algorithm.

6 Goals
Platform for Distributed Scheduling
Simulation of Distributed Scheduling Algorithm
Scaling of the Algorithm (Simulation)
Initiation of Porting to Sombrero Prototype

7 Related Work
Load-Balancing Algorithms for Sprite
PVM
Condor
UNIX

8 Requirements A working prototype of Sombrero is needed that can manage extremely large data sets across a network in a distributed single address space. A functional prototype is needed which implements essential features such as protection domains, Sombrero thread support, token tracking support, etc. The prototype is under construction and not available as a development platform. Windows NT is used instead, since the prototype is being developed on it.

9 Architecture of Sombrero Nodes [diagram: each Sombrero node contains Local Thread Information, a Selection Policy, a Communication Thread, and a Distributed Scheduler; nodes consult a shared Load Table and exchange work via Thread Migration]

10 Sombrero Clusters [diagram: RMOCBs 0x1000 through 0x7000 grouped into Clusters I, II, and III, connected by Routers 0x1 and 0x11; Load Tables at 0x1000, 0x2000, and 0x5000] The Sombrero system is organized into hierarchies of clusters for scalable distributed scheduling.

11 Architecture of Sombrero Routers [diagram: a socket delivers requests to an I/O completion port, which is serviced by a pool of service threads]

12 Inter-node Communication Sombrero nodes communicate with each other through the routers. [diagram: RMOCBs 0x1000, 0x2000, and 0x3000 in Clusters I and II, connected by Routers 0x1 and 0x11]

13 Router Tables [diagram: Router 0x1, linking nodes A and B with the R3 side (nodes C and D); its table holds routing entries for A, B, and R3]

14 Router Tables (contd.) [diagram: Router 0x3, on the C and D side; its table holds routing entries for C, D, and R1]

15 Address Space Allocation This project implements an address space allocation mechanism to distribute the 2^64-byte address space among the nodes in the system. Example: consider a system of four Sombrero nodes (A, B, C, and D). The nodes come online for the very first time in the order A, B, C, D.

16 Address Space Allocation (contd.)
The address space allocated when A is initialized will be:
A: 0x0000000000000000 – 0xffffffffffffffff
The address space allocated when B is initialized will be:
A: 0x0000000000000000 – 0x7fffffffffffffff
B: 0x8000000000000000 – 0xffffffffffffffff

17 Address Space Allocation (contd.)
The address space allocated when C is initialized will be:
A: 0x0000000000000000 – 0x3fffffffffffffff
B: 0x8000000000000000 – 0xffffffffffffffff
C: 0x4000000000000000 – 0x7fffffffffffffff
The address space allocated when D is initialized will be:
A: 0x0000000000000000 – 0x3fffffffffffffff
B: 0x8000000000000000 – 0xffffffffffffffff
C: 0x4000000000000000 – 0x5fffffffffffffff
D: 0x6000000000000000 – 0x7fffffffffffffff
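The halving pattern in the example can be sketched as a simple region-splitting allocator. This is a reconstruction from the example above, not Sombrero's actual implementation; in particular, the assumption that each new node takes the upper half of a specific sponsor node's region (A sponsors B and C; C sponsors D) is inferred from the address values shown:

```python
# Sketch of the address-space halving implied by the example above.
# Regions are half-open [lo, hi) intervals, whereas the slides print
# inclusive upper bounds (e.g. 0x3fffffffffffffff instead of 2**62).

def split_region(regions, sponsor, newcomer):
    """Give the upper half of the sponsor's region to the newcomer."""
    lo, hi = regions[sponsor]
    mid = lo + (hi - lo) // 2
    regions[sponsor] = (lo, mid)
    regions[newcomer] = (mid, hi)

regions = {"A": (0x0, 1 << 64)}      # A comes online first and owns everything
split_region(regions, "A", "B")      # A: [0, 2**63), B: [2**63, 2**64)
split_region(regions, "A", "C")      # A: [0, 2**62), C: [2**62, 2**63)
split_region(regions, "C", "D")      # C and D split C's quarter between them
```

Replaying the four arrivals reproduces exactly the ranges listed on slides 16 and 17.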

18 Load Measurement A node's workload can be estimated based on some measurable parameters:
the total number of threads on the node at the time of load measurement
the instruction mixes of these threads (I/O bound or CPU bound)

19 Load Measurement (contd.)
Work Load = Σ_i (p_i × f_i)
where p_i is the processor utilization of thread i and f_i is a heuristic factor that adjusts the importance of the thread depending on how it is being used. The heuristic factor f should have a large value for I/O-intensive threads and a small value for CPU-intensive threads. The values of the heuristic factor can be empirically determined using a fully functional Sombrero prototype.
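The formula reduces to a one-line sum. The thread tuples and factor values below are illustrative only, since the slides leave the f values to be determined empirically:

```python
# Work Load = sum over threads of (processor utilization p * heuristic factor f).
def work_load(threads):
    """threads: iterable of (p, f) pairs, one per thread on the node."""
    return sum(p * f for p, f in threads)

# Illustrative values: an I/O-bound thread (low p, large f) and a
# CPU-bound thread (high p, small f) contribute comparable load.
load = work_load([(0.05, 8.0), (0.90, 1.0)])
```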

20 Load Measurement - Simulation In the simulation we assume that the processor utilization of all threads is the same; this is sufficient to demonstrate the correctness of the algorithm. The measure of load at the node level is the number of Sombrero threads. A threshold policy has been defined:
high: number of Sombrero threads ≥ HIGHLOAD
medium: number of Sombrero threads < HIGHLOAD and ≥ MEDIUMLOAD
low: number of Sombrero threads < MEDIUMLOAD
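The threshold policy maps directly to a small classifier. The MEDIUMLOAD and HIGHLOAD values below are illustrative placeholders; the slides do not fix them:

```python
MEDIUMLOAD = 10   # illustrative threshold values, not taken from the thesis
HIGHLOAD = 20

def node_state(num_threads):
    """Classify a node by its Sombrero thread count per the threshold policy."""
    if num_threads >= HIGHLOAD:
        return "high"
    if num_threads < MEDIUMLOAD:
        return "low"
    return "medium"   # MEDIUMLOAD <= num_threads < HIGHLOAD
```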

21 Load Tables Shared memory is used to distribute load information (in Sombrero, shared memory consistency is managed by the token tracking mechanism). One load table is needed for each cluster. Thresholds of load have been established to minimize the exchange of load information in the network: only threshold crossings are recorded in the load table.
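The "record only threshold crossings" rule can be sketched as follows. The load-table interface here is invented for illustration; a plain dict stands in for the shared-memory load table whose consistency Sombrero would manage via token tracking:

```python
# Sketch: publish a node's load state only when it crosses a threshold.

def report_load(load_table, node, new_state):
    """Record the state in the shared table only on a threshold crossing."""
    if load_table.get(node) != new_state:
        load_table[node] = new_state   # a crossing: one table update
        return True                    # load information was exchanged
    return False                       # no crossing: no network traffic

table = {}
report_load(table, "A", "medium")   # first report is always a crossing
report_load(table, "A", "medium")   # unchanged state: nothing recorded
report_load(table, "A", "high")     # crossing: recorded
```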

22 Distributed Scheduling Algorithm [diagram: when highly loaded nodes are in the minority, the Sender Initiated Algorithm is used; when lightly loaded nodes are in the minority, the Receiver Initiated Algorithm is used; medium loaded nodes are not considered]

23 Distributed Scheduling Algorithm The algorithm used is dynamic, i.e., sender initiated at lower loads and receiver initiated at higher loads. 1. Nodes loaded in the medium range do not participate in load balancing. 2. Load balancing is not done if the node belongs to the majority (the larger of the groups of highly and lightly loaded nodes).

24 Distributed Scheduling Algorithm 3. Load balancing is done if the node belongs to the minority (the smaller of the groups of highly and lightly loaded nodes).
If the node is heavily loaded, the algorithm is sender initiated: a lightly loaded node is chosen at random and the RGETTHREADS message protocol is followed for thread migration.
If the node is lightly loaded, the algorithm is receiver initiated: a highly loaded node is chosen at random and the GETTHREADS message protocol is followed for thread migration.
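The per-node decision in rules 1 to 3 can be sketched as below. The message names RGETTHREADS and GETTHREADS come from the slides; treating a tie between the two groups as "majority" is an assumption the slides do not settle:

```python
import random

def balance_action(my_state, high_nodes, low_nodes):
    """Return (protocol, target node) or None if this node stays passive."""
    if my_state == "medium":                           # rule 1: medium sits out
        return None
    if my_state == "high" and len(high_nodes) < len(low_nodes):
        # heavily loaded minority: sender initiated (rule 3)
        return ("RGETTHREADS", random.choice(low_nodes))
    if my_state == "low" and len(low_nodes) < len(high_nodes):
        # lightly loaded minority: receiver initiated (rule 3)
        return ("GETTHREADS", random.choice(high_nodes))
    return None                                        # rule 2: majority waits
```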

25 Scaling the Algorithm Aggregating the clusters provides scalability. Thresholds for clusters are defined as follows:
high: no cluster member is lightly loaded and at least one member is highly loaded
low: no cluster member is highly loaded and at least one member is lightly loaded
medium: all other cases, where load balancing can occur within the cluster members or all members of the cluster are medium loaded

26 Scaling the Algorithm 1. At any level of cluster, only the nodes belonging to the minority group at that level will be active. 2. Load balancing at an nth-level cluster will be attempted every (n × SOMECONSTANT) times the number of unsuccessful attempts at the node level. 3. A suitable nth-level target cluster is found through the corresponding load table and the TRANSFERREQUEST message protocol is followed for thread migration. [diagram: cluster hierarchy with levels n=1, 2, 3]
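The cluster-level thresholds, and one reading of the escalation rule, can be sketched as below. The constant k stands in for SOMECONSTANT, whose value the slides leave open:

```python
# Sketch of the cluster threshold definitions above, applied to a list
# of member states ("high" / "medium" / "low").

def cluster_state(member_states):
    if "low" not in member_states and "high" in member_states:
        return "high"    # no light members, at least one heavy member
    if "high" not in member_states and "low" in member_states:
        return "low"     # no heavy members, at least one light member
    return "medium"      # balancing is possible inside the cluster,
                         # or every member is medium loaded

def should_try_level(n, failed_node_attempts, k=3):
    """One reading of rule 2: escalate to an nth-level cluster after
    every (n * k) unsuccessful attempts at the node level."""
    return failed_node_attempts > 0 and failed_node_attempts % (n * k) == 0
```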

27 Testing: Eight-Node Cluster Test cases are written as [# of highly loaded nodes, # of medium loaded nodes, # of lightly loaded nodes].

28 Testing: Three Clusters [diagram: two-level cluster hierarchy, levels n=1 and n=2]

29 Testing: Six Clusters at Two Levels [diagram: three-level cluster hierarchy, levels n=1, 2, 3]

30 Conclusion The testing of distributed scheduling using the simulator verifies that the algorithm functions correctly. The number of messages is observed to grow in proportion to the number of heavily loaded nodes. The number of messages required for load balancing at the first level and above is the same if the ratio of heavily to lightly loaded nodes is kept constant at both levels.

31 Conclusion (contd.) Only one additional load table is required per additional cluster. Hence, the required number of messages is expected to increase by a small constant factor as the level of clustering increases. It can be concluded that the algorithm's complexity is O(n), where n is the number of highly loaded nodes.

32 Future Work
Porting of the Sombrero node communication code from NT to Sombrero
Changing the definition of load measurement to the more general formula
Reusing code from the Sombrero router
An adaptive cluster-forming algorithm

33 Acknowledgements Dr. Donald Miller Dr. Rida Bazzi Dr. Bruce Millard Mr. Alan Skousen Mr. Raghavendra Hebbalalu Mr. Ravikanth Nasika Mr. Tom Boyd
