Presentation is loading. Please wait.

Presentation is loading. Please wait.

Andrzej M. Goscinski School of Computing and Mathematics

Similar presentations

Presentation on theme: "Andrzej M. Goscinski School of Computing and Mathematics"— Presentation transcript:

1 Making Parallel Processing on Clusters Efficient, Transparent and Easy for Programmers
Andrzej M. Goscinski School of Computing and Mathematics Deakin University Joint work with Michael Hobbs. Jackie Silcock and Justin Rough

2 Overview and Aims Basic issues and solutions
Parallel processing: user expectations, clusters, phases Parallelism management Transparency Communication paradigms What to do? Related systems Cluster Execution environments Middleware Cluster operating systems GENESIS Architecture Services for parallelism management and transparency GENESIS programming interface Message passing DSM Primitives Easy to Use and Program Environment Performance Study Summary and Future Work

3 Parallel Processing: User Expectations
Affordable Supercomputers for a “poor man” Performance Good performance Ease of Use Free from creation and placement concerns Transparency Unaware of location of processes Ease of Programming Choice and easy use of communication paradigm

4 Parallel Processing: Clusters
Advantages: Cheap to build: commodity PCs, networks Widely available Idle during weekends Low utilization during working hours Disadvantages: Poor and difficult to use software (operating systems and runtime systems) User unfriendly Distribution of resources (CPUs and peripherals) Clusters are an ideal platform for the execution of parallel applications Many institutions (universities, banks, industries) move toward homogeneous non-dedicated clusters

5 Parallel Processing Phases
Three distinct phases: Initialization Execution Termination Researchers and manufacturers mainly concentrate on execution to achieve the best performance Ease of use of parallel systems and programmer’s time are neglected Application developers are discouraged as they have to program many activities, which are of an operating system nature

6 Parallelism Management
Present operating systems that manage clusters are not built to support parallel processing Reason: these operating systems do not provide services to manage parallelism Parallelism management is the management of parallel processes and computational resources Achieve high performance Use computational resources efficiently Make programming and use of parallel systems easy

7 Parallelism Management
Parallelism management in parallel programming tools, Distributed Shared Memory and enhanced operating system environments has been neglected left to the application developers Application developers must deal not only with parallel application development but also with the problems of initiation and control for the execution on the cluster Transparency and reliability (SSI) have been neglected – users do not see a cluster as a single powerful computer

8 Services for Parallelism Management on Clusters
Services for parallelism management and transparency Establishment of a virtual machine Mapping of processes to computers Parallel processes instantiation Data (including shared) distribution Initialisation of synchronization variables Coordination of parallel processes Dynamic load balancing

9 Transparency Users should see a cluster as a single powerful computer
Dimensions of parallel processing transparency Location transparency Process relation transparency Execution transparency Device transparency

10 Communication Paradigms
Two communication paradigms: Message Passing (MP) Explicit communication between processes of a parallel application Fast Difficult to use for programmers Distributed Shared Memory (DSM) Implicit communication between processes of a parallel application through shared memory objects Easy to use Demonstrates reduced performance Claim: Operating environments that offer MP and DSM should be provided as a part of a cluster operating system as they manage system resources

11 What to do? Affordable Clusters Performance Introduce special services Ease of Use Parallelism management Transparency Operating systems Ease of Programming Message passing and DSM Development of cluster operating systems supporting parallel processing Services of cluster operating systems: Distributed services for transparent communication and management of basic system resources Services for parallelism management and transparency

12 Related Systems Message Passing Systems
PVM A set of cooperating server processes and specialized libraries that support process communication, execution and synchronization A virtual machine must be set up by the user Provides transparent process creation and termination MPI Objective is to standardize and coordinate the direction of various message passing applications, tools and environments Provides limited process management functions to support parallel processing HARNESS Does not provide transparency Programmers are forced to specify computers, map processes to these computers Load imbalance is neglected

13 Related Systems DSM Systems
Research concentrates mainly on improving performance Ease of use has been neglected Munin Programmers must label different variables according to the consistency protocol they require The initialisation stage requires the application developer to define the number of computers to be used Programmers must create a thread on each computer, initialise shared data and create synchronization variables TreadMarks The application developer has a substantial input into initialisation of DSM processes Full transparency is not provided

14 Related Systems Execution Environments
Improvement to PVM, MPI and DSM approach of running on top of an operating system is through the enhancement of an operating system to support parallel processing Beowulf Exploits distributed process space to manage parallel processes Processes can be started on remote computers after logon operation into that computer was completed successfully It does not address resource allocation nor load balancing Transparent process migration is not provided

15 Related Systems Execution Environments
NOW Combines specialized libraries and server processes with enhancement to the kernel Enhancement: scheduling and communication kernel modules- GLUnix to provide network wide process, file and VM management Parallelism management service: process initialisation on any cluster computer, support semi-transparent start of parallel processes on multiple nodes (how to select nodes?), barriers, MPI MOSIX Provides enhanced and transparent communication and scheduling within the kernel Employs PVM to provide parallelism support (initial placement) Process migration transparently migrates processes Provides dynamic load balancing and data collection Remote communication is handled through the originating computer

16 Related Systems Summary
All systems but MOSIX are based on middleware – there is no trial to develop a comprehensive operating system to support parallel processing on clusters The solutions are performance driven – little work has been done on making them programmer friendly Problems from parallel processing point of view: Processes are created one at a time although primitives provided enable the user to create multiple processes These systems (with the exception of MOSIX) do not provide complete transparency Virtual machine is not set up automatically These systems do not provide load balancing

17 Cluster Execution Environments
Execution environments that support parallel processing on clusters can be developed using Middleware approach – at the application level Underware – at the kernel level

18 Middleware OR Application processes
User process M PVM software Library functions or separate software Operating System (Unix) DSM software OR Application processes Operating system

19 Middleware - summary Middleware allows programmers Middleware
to develop parallel application (PVM, MPI) execute parallel applications on clusters (Beowulf) employ shared memory based programming (Munin) achieve good execution performance take advantage of portability Middleware does not offer complete transparency reduces potential execution performance (services are duplicated) forces programmers to be involved in many time consuming and error prone activities that are of the operating system nature Conclusion: to provide parallelism management, offer transparency, make programming and use of a system easy develop the needed services at the operating system level

20 Cluster operating systems
Cluster is a special kind of a distributed system Cluster operating system supporting parallel processing should possess the features of a distributed operating system to deal with distributed resources and their management and hide distribution exploit additional services to manage parallelism for application and offer complete transparency provide an enhanced programming environment Three logical levels of a cluster operating system Basic distributed operating system Parallelism management and transparency system Programming environment

21 Logical architecture of a cluster operating system
Message Passing/PVM M PROGRAMMING ENVIRONMENT Communication Services DSM Parallelism Management System Enhanced Subset of a Distributed Operating System (Microkernel, Communication/File Management) Shared Memory

22 GENESIS Cluster Operating System
Proof of concept Client-server model, microkernel approach and object based approach (all entities have names) All basic resources: processor, main memory, network, interprocess communication, files are managed by relevant servers IPC - Message passing services basic communication paradigm cornerstone of the architecture provided by IPC Manager and local IPC component of microkernel IPC placement and relationship with other services designed to achieve high performance and transparency DSM provided by Space (memory) and IPC Managers

23 The GENESIS Architecture
Global Scheduler RHODOS Microkernel DSM System Execution Manager IPC Space Process File/Cache Network MP PVM Resource Discovery Migration Parallel Processes Parallelism Management System Kernel Servers

24 GENESIS Services for Parallelism Management and Transparency
Basic services that provide parallelism management and offer transparency: Establishment of a virtual machine Process creation Process duplication Process migration Global scheduling

25 Establishment of a Virtual Machine
Resource Discovery Server supports adaptive establishment of a virtual machine Resource Discovery Server Identifies Idle and lightly loaded computers Computer resources: e.g., processor model, memory size Computational load and available memory Communication patterns for each process Passes information to the Global Scheduling Server per Process Server Averaged over an entire cluster Virtual machine changes dynamically Some computers become overloaded or out of order Some computers become idle

26 Process Creation Requirements Three forms of process creation:
Multiple process creation – to create many instances of a process on a single or over many computers Scalability – must be scalable to many computers Complete transparency – must hide the location of all resources and processes Three forms of process creation: Single Multiple Group Creation is invoked when the Execution Manager receives a process create request from a parent process Execution Manager notifies Global Scheduler Global Scheduler sends location on which process should be created Execution Manager on selected computer manages process creation

27 Process Creation Single and Multiple Services
Single process creation service Similar to the services found in traditional systems supporting parallel processing Requires executable image to be downloaded from disk for each parallel process to be created Multiple process creation service Supports the concurrent instantiation of a number of processes on a given computer through one creation call When many computers are involved in multiple process creation, each computer is addressed in a sequential manner Executable image of a parallel child process must be downloaded separately for each computer involved – scalability problem

28 Process Creation Group
Group process creation combines multiple process creation and group communication Group process creation service allows multiple process to be created concurrently on many computers Single executable is downloaded from a file server using group communication

29 Group Process Creation Behavior
File Server Global Scheduler Exec Manager Parent Child 1 Computer 1 Child 2 Computer 2 Child n Computer n 9 1 2 3 4 5 8 6 7

30 Process Duplication Single Local and Remote
Parallel processes are instantiated on selected computers by employing process duplication supported by process migration Three forms of process duplication Single local and remote Multiple local and remote Group remote Single local and remote process duplication Duplication is invoked when the Execution Manager receives a twin request from a parent process Execution Manager notifies Global Scheduler Global Scheduler sends a location on which twin should be placed If this computer is remote process migration is employed

31 Process Duplication Multiple Local and Remote
Multiple local and remote process duplication is an enhancement of single process duplication Duplication is invoked when the Execution Manager receives a multiple duplication request from a parent process Execution Manager notifies Global Scheduler Global Scheduler sends a location on which twin should be placed If computer is local Process Manager and Space Manager are requested to duplicate multiple copies of process entries and memory spaces If computer is remote the parent process is migrated to this destination multiple copies of the parent process are duplicated the parent process on the remote computer is killed Child processes should be duplicated on many computers Remote process duplication is performed for each selected computer

32 Process Duplication Group Remote
When more than one remote computer is involved in process duplication the overall performance decreases Decrease is caused by migrating a parent process to each remote computer sequentially Performance is improved by employing group process migration Process Managers and Execution Managers each join a relevant group and use group communication The parent process is concurrently migrated to all selected remote computers involved in process duplication

33 Group Remote Process Duplication Behavior
Computer n Global Scheduler Exec Manager Child 1 Parent Computer 1 9 1 2 3 5M 5 4 6 7 Migration Child 2 Computer 2 Child n 8 10

34 Process Migration Designed to separate policy from mechanism
Process Migration Manager acts as the coordinator for migration of various resources that combine to form a process Migration of resources: memory, process entries, buffers is carried out by the Space, Process and IPC Managers, respectively Two forms of process migration: single and group Single process migration Global Scheduler provides “which” process to “where” computer Local Manager requests its remote peer to prepare for a process Local Migration Manager requests Space, Process and IPC Managers to migrate respective resources Remote Manager informs its local peer of successful migration Local Manager requests Space, Process and IPC Managers to delete the respective resources of the migrated process

35 Process Migration Behavior
3 1 2 6 5 7 Process Manager Global Scheduler Migration Source Computer Space IPC Destination Computer Process State Spaces IPC Buffers 4 Event

36 Group Process Migration
Enhancement of the single process migration Modifying the single communication between the peer Migration Managers, Process Managers, Space Managers and IPC Managers to that of group communication Global Scheduler provides “which” process to “where” computers Each server migrates their respective resources to multiple destination computers in a single message using group communication Parent process is duplicated on each remote computer At the end of successful migration the parent process on each remote computer is killed

37 Global Scheduling Makes policy decisions of which processes should be mapped to which computers Input provided by the Resource Discovery Manager Relies on mechanisms of Single, multiple an group process creation and duplication services Single and group process migration The server combines services of Static allocation – at the initial stage of parallel processing Dynamic load balancing – to react to load fluctuations Currently, the Global Scheduler is implemented as a centralized server

38 GENESIS Programming Interface
Designed and developed to provide both communication paradigms: Message passing Shared memory Message Passing/PVM M PROGRAMMING ENVIRONMENT Communication Services DSM Parallelism Management System Enhanced Subset of a Distributed Operating System (Microkernel, Communication/File Management) Shared Memory

39 Message Passing Basic Message Passing GENESIS PVM
Exploits basic interprocess communication concepts Transparent and reliable local and remote IPC Integral component of GENESIS Offers standard message passing and RPC primitives GENESIS PVM PVM added to provide a well known parallelism programming tool Ported from the UNIX based PVM Implemented within a ‘library’ in GENESIS Mapping of the standard PVM services onto the GENESIS services Performance improvement of PVM on GENESIS No additional “classic” PVM server processes required Direct interprocess communication model instead of the default model Load balancing provided

40 Architecture of PVM on Unix
Server User Task 1 libpvm Kernel Computer 1 User Task 2 Computer 2 TCP Connections UDP Datagrams

41 Architecture of PVM on GENESIS
Execution Manager Migration Global Scheduler IPC Network Computer 1 Computer 2 PVM Comms User PVM Parallel Processes libpvm Microkernel

42 Distributed Shared Memory
DSM is an integral component of the operating system Since DSM is a memory management function the DSM system is integrated into the Space Manager Shared memory used as though it were physically shared Easy to use shared memory Low overhead, improved performance Two consistency models supported: Sequential – implemented using invalidation model Release – implemented using write-update model Synchronization and coordination of processes Semaphores - owned by Space Manager on particular computer Gaining ownership is distributed and mutually exclusive Barriers used for coordination – their management is centralized

43 Distributed Shared Memory
IPC Manager Process DSM Space User DSM Parallel Processes Computer 1 Computer 2 Microkernel Network Shared Memory

44 GENESIS Primitives Execution
Two groups of primitives to support execution services for the provision of communication and coordination services MP PVM DSM proc-ncreate() pvm_spawn() proc-exit() pvm_exit()

45 GENESIS Primitives Communication and Coordination
MP PVM DSM send() pvm_send() read access recv() pvm_recv() write access pvm_pkbuf() wait() pvm_unpkbuf() signal() barrier() pvm_barrier()

46 Easy to Use and Program Environment
GENESIS system Provides and efficient and transparent environment for execution of parallel applications Offers transparency Relieves programmers from activities such as: Selection of computers for a virtual a machine for the given application Setting up a virtual machine Mapping processes to virtual machine Process instantiation using process creation and duplication supported by process migration Load balancing

47 Easy to Use and Program Environment
In the GENESIS system Location of the remote computer(s) of the cluster is selected automatically by Global Scheduler Users do not know process location Programming of parallel applications has been made easy by providing Message passing: standard and PVM Distributed Shared Memory Powerful primitives: implement sequences of operations and provide transparency process_ncreate(GROUP_CREATE,n, “child_prog”) Process instantiation using process creation and duplication supported by process migration Load balancing

48 Performance of Standard Parallel Applications
GENESIS System 13 Sun3/50 Workstations 12 Computation + 1 File Server 10 Mbit/sec shared Ethernet Influence of process instantiation on execution performance GENESIS PVM vs. Unix PVM Standard parallel applications Successive Over Relaxation Quicksort Traveling Salesman Problem

49 Influence of Process Instantiation on Execution Performance Parallel Simulation (5, 25, 50 Second Workload) Simulation - amount of work relates to the overall exec time Two parameters: Work load (5, 25, 50 Seconds) Number of workstations (1 ..12) Global scheduler & migration Speedups for #comp = #proc

50 GENESIS PVM vs. Unix PVM IPC Latency
Support for IPC provided by the PVM server in Unix was substituted with GENESIS operating system mechanisms To measure the time saved by removing the server, a simple PVM application that exchanges messages (1kbyte –100kbytes) was used Round-trip time (including data packing and unpacking) was measured

51 GENESIS PVM vs. Unix PVM Speedup
Application used to study the influence of process instantiation - amount of work relates to the overall exec time – was studied Parameters: Number of workstations GENESIS with and without load balancing

52 Successive Over Relaxation
Parallel applications developed based on algorithms of Rice University Rice superior cluster hardware: DEC station-5000/240 + fast ATM net For 8 computers – array size: Rice x 2048 elements with 101 iterations; GENESIS 128 x 128 elements with 10 iterations DSM: TreadMarks – 6.3; GENESIS – 4.4 PVM: Rice – 6.91; GENESIS – 5.1

53 Quicksort Parallel applications developed based on algorithms of Rice
Rice superior cluster hardware: DEC station-5000/240 + fast ATM net For 8 computers – array size: Rice x 1024 integers; GENESIS 256 x 256 integers DSM: TreadMarks – 5.3; GENESIS – 2.5 PVM: Rice – 6.79; GENESIS – 6.07

54 Traveling Salesman Problem
Parallel applications developed based on algorithms of Rice University Rice superior cluster hardware: DEC station-5000/240 + fast ATM net For 8 computers – 18 city tour; with the minimum threshold set to 13 cities DSM: TreadMarks – 4.74; GENESIS – 6.33 PVM: Rice – 5.63; GENESIS – 5.94

55 Summary Nondedicated clusters are commonly available
Force application developers to program operating system operations Do not offer transparency Application developers need a computer system that Processes applications efficiently Uses cluster resources well Allows to see cluster as a single powerful computer rather than as a set of connected computers Proposal: employ a cluster operating system Design: cluster operating system with three logical levels Distributed operating system Parallelism management and transparency system Programming environment

56 Summary GENESIS – designed and developed as a “proof of concept”
GENESIS is a system that satisfies user requirements GENESIS approach is unique Offers both message passing (MP and PVM) and DSM environment Services providing parallelism management are integral components of an operating system Provides a comprehensive environment to transparently manage system resources Programmers do not have to be involved in parallelism management Use of the cluster is has been made easy Complete transparency is offered Good performance results have been achieved

57 Future Work Port GENESIS to an Intel like platform
Use virtual memory to support DSM Offer reliable parallel computing services on clusters by employing Reliable group communication Checkpointing to offer fault tolerance

Download ppt "Andrzej M. Goscinski School of Computing and Mathematics"

Similar presentations

Ads by Google