
1 Batch Queuing Systems

The Portable Batch System (PBS) and the Load Sharing Facility (LSF) queuing systems share much common functionality in running batch jobs. However, they differ in their implementation of the batch environment and in their user commands. Table 1 below provides a comparative list of command options to help users migrating from LSF (used on halem) to PBS (used on palm and discover).

2 Table 1: Syntax for frequently used options

Queuing System                 LSF (halem)                           PBS (palm and discover)
-----------------------------  ------------------------------------  -------------------------------
Resource Directive Sentinel    #BSUB                                 #PBS
# of Nodes/Processors          -n (nodes)                            On palm: -l ncpus= (processors)
                                                                     On discover: -l select= (nodes)
Wall Clock Limit               -W hh:mm                              -l walltime=hh:mm:ss
Queue                          -q                                    -q
Email notification             -B sends mail when job begins         -m b sends mail when job begins
                               -N sends job report when finished     -m e sends mail when job ends
Email address                  -u                                    -M
Initial Directory              (default = job submission directory)  (default = $HOME)
Job Name                       -J                                    -N
STDOUT                         -o                                    -o
STDERR                         -e                                    -e
STDERR & STDOUT to same file   (use -o without -e)                   -j oe (both to STDOUT)
                                                                     -j eo (both to STDERR)
Project to charge              -P                                    -W group_list=
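The wall clock row of Table 1 implies a small format change when migrating a script: LSF takes hh:mm while PBS takes hh:mm:ss. The following is a minimal sketch of that conversion, written in POSIX sh rather than the csh used in the example scripts. It assumes the LSF limit is always given as hh:mm, and the function name lsf_to_pbs_walltime is hypothetical, chosen only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of LSF or PBS): convert an LSF "-W hh:mm"
# limit into the "hh:mm:ss" form expected by PBS "-l walltime=".
# Assumes the argument always contains exactly one colon.
lsf_to_pbs_walltime() {
    hours=${1%%:*}      # text before the colon
    minutes=${1##*:}    # text after the colon
    # Drop one leading zero so printf does not treat "08" as octal.
    hours=${hours#0};     hours=${hours:-0}
    minutes=${minutes#0}; minutes=${minutes:-0}
    printf '%02d:%02d:00\n' "$hours" "$minutes"
}

# Example: the 6-hour limit used elsewhere in this presentation.
lsf_to_pbs_walltime 6:00
```

The result can be pasted after "#PBS -l walltime=" in the migrated script.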

3 Batch Job Management

The following table compares commonly used LSF and PBS commands for controlling and monitoring jobs.

Queuing System  LSF on halem  PBS on palm or discover
--------------  ------------  -----------------------
Submission      bsub          qsub
Deletion        bkill         qdel
Status          bjobs         qstat
Queue List      bqueues -l    qstat -Q
GUI monitor                   xpbsmon

Table 2: Frequently used job management commands (check the man page of each command for more information)

4 Environment Variables

Both LSF and PBS provide special environment variables, which simplify scripting and configuration of batch jobs.

Queuing System           LSF on halem  PBS on palm or discover
-----------------------  ------------  -----------------------
Processor List           $LSB_HOSTS    cat $PBS_NODEFILE
Directory of Submission  $LS_SUBCWD    $PBS_O_WORKDIR
Job Id                   $LSB_JOBID    $PBS_JOBID

Table 3: Useful environment variables
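While a migration is in progress, one script may need to run under either scheduler. The sketch below, in POSIX sh rather than csh, maps the job id and submission directory variables from Table 3 onto common names. JOB_ID, SUBMIT_DIR, and set_batch_vars are hypothetical names introduced for this example, and the fallback to the current directory outside a batch job is an assumption.

```shell
#!/bin/sh
# Sketch: expose the job id and submission directory under one set of names,
# whichever scheduler started the job. JOB_ID and SUBMIT_DIR are
# hypothetical names used only in this example.
set_batch_vars() {
    if [ -n "${PBS_JOBID:-}" ]; then
        # Running under PBS
        JOB_ID=$PBS_JOBID
        SUBMIT_DIR=${PBS_O_WORKDIR:-$PWD}
    elif [ -n "${LSB_JOBID:-}" ]; then
        # Running under LSF
        JOB_ID=$LSB_JOBID
        SUBMIT_DIR=${LS_SUBCWD:-$PWD}
    else
        # Not inside a batch job (assumed fallback)
        JOB_ID=none
        SUBMIT_DIR=$PWD
    fi
}

set_batch_vars
echo "Job $JOB_ID submitted from $SUBMIT_DIR"
```

The rest of the script can then use cd "$SUBMIT_DIR" in place of the scheduler-specific cd $LS_SUBCWD or cd $PBS_O_WORKDIR.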

5 Example Batch Scripts

The following simple LSF and PBS submission scripts show how the two batch systems request comparable resources and run the same parallel executable.

LSF example:

    #!/bin/csh
    #BSUB -n 4
    #BSUB -W 6:00
    #BSUB -q special_b
    #BSUB -J myJobName
    #BSUB -o out.o%J
    #BSUB -u my_email@gsfc.nasa.gov
    #BSUB -P k1234
    echo "Master Host: `hostname` "
    echo "Node List: $LSB_HOSTS "
    cd $LS_SUBCWD
    prun -n 16 ./mpihello

To submit the job, type:

    bsub < script_name

PBS example:

    #!/bin/csh
    # Resource request: keep only the line for your system.
    # On discover:
    #PBS -l select=4:ncpus=4
    # On palm:
    #PBS -l ncpus=16
    #PBS -l walltime=6:00:00
    #PBS -q general
    #PBS -N myJobName
    #PBS -j oe
    #PBS -m e -M my_email@gsfc.nasa.gov
    #PBS -W group_list=k1234
    echo "Master Host: $PBS_O_HOST"
    echo "Nodes:"; cat -n $PBS_NODEFILE
    cd $PBS_O_WORKDIR
    mpirun -np 16 ./mpihello

To submit the job, type:

    qsub script_name
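The PBS example prints the node file with cat -n. When only a summary is wanted, the file can be reduced to counts instead. This is a minimal sketch in POSIX sh, assuming $PBS_NODEFILE holds one host name per line; how many lines each host gets depends on the site configuration and the resource request. The function names count_entries and count_unique_hosts are hypothetical.

```shell
#!/bin/sh
# Sketch: summarise a PBS node file instead of listing every line.
# Assumes one host name per line; the number of lines per host
# depends on how the site and the job request are configured.

# Number of lines (entries) in the node file.
count_entries() {
    wc -l < "$1" | tr -d ' '
}

# Number of distinct host names in the node file.
count_unique_hosts() {
    sort -u "$1" | wc -l | tr -d ' '
}

# Only report when running inside a PBS job with a readable node file.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    echo "Entries: $(count_entries "$PBS_NODEFILE")"
    echo "Hosts:   $(count_unique_hosts "$PBS_NODEFILE")"
fi
```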

6 Interactive Batch

Both queuing systems can enter an interactive batch mode, commonly used for debugging, with the -Is (LSF) or -I (PBS) option. The other options are the same as previously shown, but are entered on a single command line. The commands for the two queuing systems are compared below.

LSF example (halem):

    % bsub -Is -Pk1234 -qspecial_b -W6:00 -n4 /usr/dlocal/bin/tcsh

When the requested processors are available, the interactive prompt will appear:

    bsub> cd $LS_SUBCWD
    bsub> prun -n 16 ./mpihello
    bsub> exit

PBS example (discover or palm):

On discover:

    % qsub -I -W group_list=k1234 -q general -l walltime=06:00:00,select=4:ncpus=4

or on palm:

    % qsub -I -W group_list=k1234 -q general -l walltime=06:00:00,ncpus=16

When the requested processors are available, the interactive prompt will appear:

    % cd $PBS_O_WORKDIR
    % mpirun -np 16 ./mpihello
    % exit

