

1 Sun Grid Engine

2 Grids Grids are collections of resources made available to customers. Compute grids make cycles available to customers from an access point, much like plugging into an electrical grid. Cluster grids: resources in one room. Campus grids: multiple clusters on one campus. Global grids: resources that cross administrative domains.

3 Grids Potentially (ideally?) you could completely outsource your HPC needs by buying time on a commercial grid. Running a big data center is tricky and takes expensive people. If you are, say, a small computer animation group working on an animated short, it might not make sense to set up a data center for six months of work. On the other hand, if you’re Pixar or Lucas, this is a core competency.

4 Sun Grid Engine SGE is a piece of software that matches jobs to compute resources. SGE also runs on OS X; this would be another fine project for someone to investigate.

5 SGE As we’ve seen, Sun Grid Engine can accept a batch job and give it to a compute node. SGE (base level) is open source (see http://gridengine.sunsource.net/). There are some other issues to cover: multiple queues, giving jobs only to nodes with the necessary resources, and queue manipulation.

6 SGE Users submit jobs; SGE keeps them in a holding area until resources become available, then sends them to an execution host. The results are reported back. There are four types of hosts: master, execution, administration, and submit. The master host runs the master daemon and the scheduling daemon. Execution hosts are where jobs are run. Administration hosts can manipulate the queues. There are a lot of knobs to twiddle on SGE.

7 SGE Imagine a bank that has five customers walk in. Four just want to deposit a check, and the fifth wants to set up a home loan. If the home loan customer happens to be first, and there is only one queue, the four with short transactions wait for a long time. What’s more, the home loan customer must have manager approval at some point in the process. So: set up two queues, one for long transactions and one as an express lane. The home loan queue specifies that the manager must be available. This reduces the median time spent in queue for the short-transaction customers, and reduces the variance of the waiting time.
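The bank analogy can be checked with a little arithmetic. The sketch below is a toy model, not anything SGE does: it assumes a loan takes 60 minutes, a deposit takes 2 minutes, and each line has its own teller (all three are assumptions made only for illustration).

```python
from statistics import median, pvariance

LOAN, DEPOSIT = 60, 2  # assumed service times in minutes


def waits(service_times):
    """Time each customer waits before being served in one FIFO line."""
    result, clock = [], 0
    for service in service_times:
        result.append(clock)
        clock += service
    return result


# One line, loan customer first: every deposit queues behind the loan.
single_all = waits([LOAN] + [DEPOSIT] * 4)
single_short = single_all[1:]  # waits of the four deposit customers

# Two lines, one teller each: the loan line holds one customer (wait 0),
# and the express line holds the four deposit customers.
express_short = waits([DEPOSIT] * 4)
two_all = waits([LOAN]) + express_short

print("median wait, short customers:", median(single_short), "vs", median(express_short))
print("variance of wait, all customers:", pvariance(single_all), "vs", pvariance(two_all))
```

Under these assumptions the short customers' median wait and the overall variance both drop sharply once the long transaction has its own line.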

8 SGE Queues There may be more than one queue; jobs are associated with queues. qconf -sql shows the list of defined queues. Why multiple queues? Some types of jobs may be very long or require specific resources, so users may submit jobs to queues optimized for those types of jobs. [Diagram labels: SGE Master, SGE Scheduler, queues Q1 and Q2, three Execution Hosts.]

9 Scheduler The scheduler (which assigns jobs to execution hosts) looks at several factors. Load parameters: how busy the execution hosts are by some measure. Consumable resources: memory, disk space, licenses, etc.; SGE keeps track of these and dispatches a job only if resources are available. Attributes: such as 64-bit, G5, etc.; these aren’t necessarily consumed, but may simply be a state. The scheduler may look at all these factors before assigning a job from the holding pool to an execution host.

10 Consumable Resources There are some finite resources in the cluster: CPU time, disk space, licenses, bandwidth. The available capacity for these is defined by the administrator; the scheduler examines available consumables when deciding what to run.
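The bookkeeping behind consumables can be pictured as a simple counter per resource. This is a minimal sketch under assumed names (`ConsumablePool`, `licenses`, `mem_gb` are all hypothetical), not SGE's actual implementation:

```python
class ConsumablePool:
    """Toy tracker for finite resources with administrator-defined capacity."""

    def __init__(self, capacity):
        self.available = dict(capacity)

    def can_run(self, request):
        # A job is eligible only if every requested amount is still free.
        return all(self.available.get(name, 0) >= amount
                   for name, amount in request.items())

    def dispatch(self, request):
        """Reserve the resources and return True, or return False if short."""
        if not self.can_run(request):
            return False
        for name, amount in request.items():
            self.available[name] -= amount
        return True

    def release(self, request):
        """Give the resources back when the job finishes."""
        for name, amount in request.items():
            self.available[name] += amount


pool = ConsumablePool({"licenses": 1, "mem_gb": 8})
first = pool.dispatch({"licenses": 1, "mem_gb": 4})   # takes the only license
second = pool.dispatch({"licenses": 1})               # must wait in the holding area
```

Once the first job finishes and calls `release`, the second request can be dispatched.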

11 Requestable Attributes On job submission you can request attributes or characteristics: at least X amount of memory, a license for software package Y, a 64-bit host, etc. In a production environment licenses can be a big deal. Circuit design software may cost thousands per node, so not every node on the cluster may have a license. The attributes can be related to the hosts or the queues. Attributes that are “requestable” can be mentioned in the qsub command, so jobs may require that attribute to run.

12 SGE You don’t need to submit a job to a specific queue; instead you can simply ask for certain resources, and SGE will pick a queue based on the requirement profile
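As a rough picture of matching a requirement profile to a queue, the sketch below picks the first queue whose attributes cover the request. The queue names, attribute names, and matching rule (equality for strings, "at least" for numbers) are assumptions for illustration; the real scheduler's rules are richer.

```python
def satisfies(offered, requested):
    """Numeric requests mean 'at least this much'; others must match exactly."""
    if isinstance(requested, (int, float)):
        return isinstance(offered, (int, float)) and offered >= requested
    return offered == requested


def pick_queue(queues, request):
    """Return the name of the first queue that meets every requested attribute."""
    for name, attrs in queues.items():
        if all(key in attrs and satisfies(attrs[key], value)
               for key, value in request.items()):
            return name
    return None  # no queue fits; the job stays in the holding area


# Hypothetical queue definitions.
QUEUES = {
    "express.q": {"arch": "glinux", "mem_gb": 2},
    "long.q": {"arch": "glinux", "mem_gb": 16},
}

print(pick_queue(QUEUES, {"arch": "glinux", "mem_gb": 8}))
```

The user states only what the job needs; which queue it lands in falls out of the match.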

13 Environment Variables When a job runs on a host, some environment variables are set: ARC, SGE_ROOT, SGE_STDOUT_PATH, HOME.

14 Dependencies Suppose you divide up a task into several subtasks. This can require sequencing: some subtasks may need to be finished before other subtasks can run. You can specify a list of jobs that must finish before this job runs.
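The sequencing idea amounts to ordering a dependency graph. Grid Engine's qsub expresses "wait for these jobs" with its -hold_jid option; the sketch below only illustrates the ordering itself, using Python's standard graphlib module (Python 3.9+) and made-up job names.

```python
from graphlib import TopologicalSorter

# Hypothetical subtasks: each job maps to the set of jobs that must finish first.
DEPENDENCIES = {
    "split": set(),
    "render_a": {"split"},
    "render_b": {"split"},
    "merge": {"render_a", "render_b"},
}


def run_order(dependencies):
    """Return one valid order in which the jobs could be run."""
    return list(TopologicalSorter(dependencies).static_order())


order = run_order(DEPENDENCIES)
print(order)
```

Any order it returns starts with `split` and ends with `merge`; the two render jobs are independent of each other and could run at the same time.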

15 Listing Attributes qconf -scl lists “complexes” of attributes. Typically this includes a complex for the queues and one for the hosts. qconf -sc host|queue lists the attributes for a complex:

#name      shortcut  type    value  relop  requestable  consumable  default
#--------------------------------------------------------------------------
arch       a         STRING  none   ==     YES          NO          none
num_proc   p         INT     1      ==     YES          NO          0
load_avg   la        DOUBLE  99.99  >=     NO           NO          0

16 Modifying Attributes qconf -mc [complex name] opens an editor that allows you to modify the complex settings.

17 Attributes Note that some attributes are “requestable”. This means that you can specify on the qsub command line that your job requires that attribute. qsub -l arch=glinux says the job requires a “glinux” host to run. qconf -se compute-0-0 shows the resources for a host.

18 Priorities By default jobs are handled in a FIFO manner: as they come in, the scheduler assigns them to a compatible queue for processing. qsub -p gives the job a priority that can override FIFO behavior. Use qstat and qdel to find and delete jobs in the holding area.
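One way to picture "FIFO unless a priority says otherwise" is a heap keyed on priority, with arrival order as the tie-breaker. This is a toy model with a hypothetical `HoldingArea` class, assuming larger numbers mean higher priority; it is not how the SGE scheduler is implemented.

```python
import heapq
from itertools import count


class HoldingArea:
    """Toy holding area: highest priority first, FIFO among equal priorities."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # monotonically increasing arrival stamp

    def submit(self, name, priority=0):
        # heapq is a min-heap, so negate priority to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._arrival), name))

    def next_job(self):
        """Pop the next job to dispatch, or None if nothing is waiting."""
        return heapq.heappop(self._heap)[2] if self._heap else None


area = HoldingArea()
area.submit("job_a")
area.submit("job_b")
area.submit("urgent", priority=10)
dispatch_order = [area.next_job() for _ in range(3)]
```

With no priorities given, jobs leave in the order they arrived; the `urgent` job jumps ahead of both.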

19 Checkpointing Sometimes on very long jobs it is worthwhile to be able to stop the job and restart it later. What are the issues involved here? Why use it? Starter, suspend, resume, terminate methods
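A minimal sketch of application-level checkpointing, assuming the job can describe its progress in a small JSON file: it saves state after every step and, on restart, resumes from the saved step instead of starting over. The function name, file layout, and the `stop_after` knob (which simulates an interruption) are all hypothetical.

```python
import json
import os


def run(total_steps, checkpoint_path, stop_after=None):
    """Sum 0..total_steps-1, saving progress so a restart can resume."""
    state = {"step": 0, "total": 0}
    if os.path.exists(checkpoint_path):
        # Resume from the last saved state rather than from scratch.
        with open(checkpoint_path) as f:
            state = json.load(f)

    done_this_run = 0
    while state["step"] < total_steps:
        if stop_after is not None and done_this_run >= stop_after:
            return None  # simulated interruption; checkpoint holds the progress
        state["total"] += state["step"]
        state["step"] += 1
        done_this_run += 1
        with open(checkpoint_path, "w") as f:
            json.dump(state, f)
    return state["total"]
```

Calling `run(10, path, stop_after=4)` stops early, and a later `run(10, path)` picks up at step 4. A real job would also have to deal with issues this sketch ignores, such as a crash in the middle of writing the checkpoint file.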

20 Hard & Soft Requirements A hard requirement must be present before the job is scheduled

