Parallel Computing The Bad News –Hardware is not getting faster fast enough –Too many architectures –Existing architectures are too specific –Programs.


1 Parallel Computing
The Bad News
  –Hardware is not getting faster fast enough
  –Too many architectures
  –Existing architectures are too specific
  –Programs are closely tied to architecture
  –Software is being developed using a 1950s mentality

2 Computing Trends
Centralized systems are a thing of the past
  –Evolving towards cycle servers
Each user has their own computer
Workstations are networked
  –Typical LAN speeds are 100 Mb/s
For some users, a single workstation does not provide adequate computing power

3 A Solution
A virtual computing environment
  –Utilize existing software to build a programming model that can be used to develop distributed and parallel applications
  –Provide tools to create, debug, and execute applications on heterogeneous hardware
  –Let the software map high-level descriptions of the problems to available hardware
  –Programmer will no longer need to be concerned with low-level issues

4 Other Names
For many scientists, it is not uncommon to find problems that require weeks or months of computation to solve.
  –Scientists involved in this type of research need a computing environment that delivers large amounts of computational power over a long period of time
  –Such an environment is called a High Throughput Computing (HTC) environment
In contrast, High Performance Computing (HPC) environments deliver a tremendous amount of power over a short period of time.

5 Workstation Users
All VCE configurations include some workstations
Workstations are chronically underutilized
Workstation users can be classified as follows:
  –Casual Users
  –Sporadic Users
  –Frustrated Users
The VCE must help frustrated users without hurting casual and sporadic users

6 Other Considerations
The VCE must be cost effective
  –Use existing tools like NFS, ISIS, PVM, MPI whenever possible
  –Must not require tremendous amounts of processor power
The VCE must coexist with other software
  –Non-VCE applications should not be impacted by the VCE
The VCE must avoid kernel modes

7 User's View of the VCE
The Software Development Module (SDM) provides tools to build and annotate an application task graph
The Execution Module (EXM) compiles the application and dispatches the tasks

8 The VCE
[Diagram. Labels: Problem Specification, Design Stage, Coding Level, Compilation Manager, Runtime Manager, SDM, EXM]

9 Runtime Issues
Compilation Issues
  –Executables must be prepared to maximize scheduling flexibility
  –Compilations must be scheduled to maximize application performance and hardware utilization
  –Java?

10 Runtime Issues
Task Placement
  –The criteria for selecting machines to host tasks must consider both hardware utilization and application throughput
  –Hints supplied by the programmer might improve task placement decisions
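
One way to picture the placement criteria on this slide is a scoring function that combines spare capacity with an optional programmer hint. The following is only a minimal sketch under assumptions: the `Machine` fields, the scoring formula, and the `HINT_BONUS` value are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass

HINT_BONUS = 0.25  # assumed weight given to a programmer-supplied hint


@dataclass
class Machine:
    name: str
    cpu_load: float        # 0.0 (idle) to 1.0 (saturated)
    relative_speed: float  # 1.0 = baseline workstation


def placement_score(machine, hinted=False):
    """Spare capacity scaled by speed, plus a bonus if the programmer hinted at this host."""
    spare_capacity = (1.0 - machine.cpu_load) * machine.relative_speed
    return spare_capacity + (HINT_BONUS if hinted else 0.0)


def place_task(machines, preferred=None):
    """Return the machine with the highest score; `preferred` is a set of hinted host names."""
    preferred = preferred or set()
    return max(machines, key=lambda m: placement_score(m, m.name in preferred))


pool = [
    Machine("a", cpu_load=0.9, relative_speed=2.0),
    Machine("b", cpu_load=0.1, relative_speed=1.0),
    Machine("c", cpu_load=0.2, relative_speed=1.0),
]
print(place_task(pool).name)                   # lightly loaded machine wins
print(place_task(pool, preferred={"c"}).name)  # hint tips the decision
```

A real scheduler would also weigh application throughput across all tasks, not just one task at a time; the sketch only shows how a hint can shift an otherwise utilization-driven choice.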

11 Processor Utilization
Free Parallelism
  –Parallel applications with low efficiency benefit when run on idle machines
Anticipatory Processing
  –Use idle resources to perform work which may be useful if certain schedules are ultimately executed

12 Load Balancing
Central issue in the execution module
Good application throughput must be achieved without impacting interactive users
Many systems provide the ability to migrate tasks

13 Task Migration
Various migration strategies are possible
  –Redundant execution
  –Check-pointing
  –Dump and migrate
  –Recompilation
  –Byte coded tasks
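
Of the strategies listed, check-pointing is the easiest to illustrate in a few lines. The sketch below is an application-level illustration only, assuming the task's state fits in a small dictionary; the function names, the file format (`pickle`), and the `stop_after` parameter used to simulate an eviction are all hypothetical. Systems such as Condor checkpoint the whole process image instead.

```python
import os
import pickle


def save_checkpoint(state, path):
    """Write the state to a temporary file, then rename it so a crash never leaves a partial checkpoint."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp_path, path)


def run_with_checkpoints(total_steps, path, checkpoint_every=100, stop_after=None):
    """Sum the integers 0..total_steps-1, resuming from `path` if a checkpoint exists.

    `stop_after` simulates the task being evicted after that many steps;
    in that case the function returns None and in-memory progress is lost.
    """
    if os.path.exists(path):
        with open(path, "rb") as f:
            state = pickle.load(f)
    else:
        state = {"step": 0, "acc": 0}

    executed = 0
    while state["step"] < total_steps:
        if stop_after is not None and executed >= stop_after:
            return None  # evicted: only the last saved checkpoint survives
        state["acc"] += state["step"]
        state["step"] += 1
        executed += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(state, path)
    return state["acc"]
```

If the task is evicted after 250 steps with a checkpoint every 100 steps, the next run (possibly on another machine that can read the same file) resumes from step 200, so only 50 steps of work are repeated.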

14 Systems
Many systems are available which provide some form of a VCE
  –PVM
  –MPI
  –Beowulf
  –Condor
  –…

15 The Berkeley NOW Project

16 Condor
Condor is a software system that runs on a cluster of workstations to harness wasted CPU cycles.
  –A Condor pool consists of any number of machines, of possibly different architectures and operating systems, that are connected by a network
To monitor the status of the individual computers in the cluster, Condor "daemons" must run all the time.
  –One daemon is called the "master". Its only job is to make sure that the rest of the Condor daemons are running.

17 Idle Machines Only
Two other daemons run on every machine in the pool: startd and schedd
Startd monitors information about the machine that is used to decide if it is available to run a Condor job
  –Keyboard and mouse activity
  –Load on the CPU
  –Startd also notices when a user returns to a machine that is currently running a Condor job and removes the job
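
The availability decision described on this slide can be sketched as a pair of predicates over the monitored values. This is an illustration only: `MachineState`, the threshold constants, and both function names are hypothetical and are not Condor's actual policy expressions or configuration.

```python
from dataclasses import dataclass

# Assumed thresholds, chosen only for illustration.
IDLE_SECONDS = 15 * 60   # no keyboard/mouse input for this long counts as idle
MAX_LOAD = 0.3           # CPU load must be at or below this to accept a job
ACTIVITY_WINDOW = 5      # input within this many seconds means the user is back


@dataclass
class MachineState:
    seconds_since_input: float  # time since last keyboard or mouse activity
    load_average: float         # current CPU load
    running_guest_job: bool     # True if a pool job is running here


def is_available(state):
    """A machine can accept a job when nobody is typing and the CPU is quiet."""
    return (not state.running_guest_job
            and state.seconds_since_input >= IDLE_SECONDS
            and state.load_average <= MAX_LOAD)


def should_evict(state):
    """A running job is removed as soon as the owner touches the keyboard or mouse."""
    return state.running_guest_job and state.seconds_since_input < ACTIVITY_WINDOW
```

Eviction deliberately ignores CPU load here, because the guest job itself drives the load up while it runs.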

18 Condor Architecture

19 Condor Executables
Source code does not have to be modified in any way to be used in Condor
  –It must, however, be re-linked with the Condor libraries
Once re-linked, jobs gain two crucial abilities:
  –Checkpoint
  –Perform remote system calls
Condor also provides a mechanism to run binaries that have not been re-linked, which are called "vanilla" jobs

20 Condor Executables

21 Condor Tricks
Match Making
  –When a task is submitted to Condor, the system finds a machine that matches the resources required by the task
Condor uses check-pointing to migrate jobs
  –You only lose the computation that has been performed since the last checkpoint
Condor tasks move around to find the underutilized workstations
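
The match-making idea can be sketched as a filter over machine descriptions. This is a simplified illustration with plain dictionaries; the key names (`arch`, `opsys`, `memory_mb`, `min_memory_mb`, `idle`) and the tie-breaking rule are assumptions, and the sketch does not reproduce Condor's own job and machine description language.

```python
def matches(job, machine):
    """True when the machine is idle and satisfies every requirement of the job."""
    return (machine["idle"]
            and machine["arch"] == job["arch"]
            and machine["opsys"] == job["opsys"]
            and machine["memory_mb"] >= job["min_memory_mb"])


def find_match(job, machines):
    """Return a suitable machine, preferring the one with the most memory, or None."""
    candidates = [m for m in machines if matches(job, m)]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m["memory_mb"])


pool = [
    {"name": "ws1", "arch": "x86", "opsys": "linux", "memory_mb": 32, "idle": True},
    {"name": "ws2", "arch": "x86", "opsys": "linux", "memory_mb": 128, "idle": True},
    {"name": "ws3", "arch": "sparc", "opsys": "solaris", "memory_mb": 256, "idle": True},
    {"name": "ws4", "arch": "x86", "opsys": "linux", "memory_mb": 512, "idle": False},
]
job = {"arch": "x86", "opsys": "linux", "min_memory_mb": 64}
match = find_match(job, pool)
print(match["name"] if match else "no match; job stays queued")
```

When no machine qualifies, the job simply waits in the queue until a matching workstation becomes idle.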

22 Beowulf
The Beowulf parallel workstation is a single-user, multiple-computer system with direct-access keyboard and monitors.
Beowulf comprises:
  –16 motherboards with Intel x86 processors
  –256 MB of DRAM (16 MB per processor board)
  –16 hard disk drives and controllers
  –2 Ethernets and controllers per processor
  –2 high-resolution monitors with controllers and 1 keyboard
The Beowulf architecture is a fully COTS (Commodity Off The Shelf) configured system.

