Presentation is loading. Please wait.

Presentation is loading. Please wait.

Saguaro User Environment Fall 2013 Outline Get an account Linux Clusters Initial login Modules Compiling Batch System Job Monitoring System & Memory.

Similar presentations

Presentation on theme: "Saguaro User Environment Fall 2013 Outline Get an account Linux Clusters Initial login Modules Compiling Batch System Job Monitoring System & Memory."— Presentation transcript:


2 Saguaro User Environment Fall 2013

3 Outline Get an account Linux Clusters Initial login Modules Compiling Batch System Job Monitoring System & Memory Information Interactive mode Improving performance

4 Anti-outline Shell scripting Available software Installing software Debugging (parallel or otherwise) For any assistance with these – please contact us at any time:

5 Get an account - “Getting started” Faculty and academic staff get 10K hours every year. Students link to faculty account For assistance installing/using software on the cluster: Login via ssh, putty (Windows). ssh

6 Generic Cluster Architecture File Server PC+ internet PC+ Server PC Switch … Ethernet Myrinet, IB, Quadrics,... FCAL, SCSI,… (Adv. HPC System) Switch GigE, Infiniband

7 Initial Login Login with SSH ssh Do not run your programs on the login nodes Connects you to a login node Don’t overwrite ~/.ssh/authorized_keys – Feel free to add to it if you know how to use it – SSH is used for job start up on the compute nodes. Mistakes can prevent your jobs from running For X forwarding, use ssh –X

8 Packages (saguaro) Modules are used to setup your PATH and other environment variables They are used to setup environments for packages & compilers saguaro% module {lists options} saguaro% module avail {lists available packages} saguaro% module load {add one or more packages} saguaro% module unload {unload a package} saguaro% module list {lists loaded packages} saguaro% module purge {unloads all packages} Multiple compiler families available, so make sure you are consistent between libraries, source and run script!

9 MPP Platforms Local Memory Interconnect Processors Clusters are Distributed Memory platforms. Each processor has its own memory. Use MPI on these systems. … … …

10 SMP Platforms Shared Memory Banks Memory Interface Processors The Saguaro nodes are shared-memory platforms. Each processor has equal access to a common pool of shared memory. Saguaro queues have 8, 12, and 16 cores per node. One node has 32 cores. … … for(i; i

11 Compile (saguaro) After linking (via module command) Different compilers unfortunately means different flags saguaro% module load intel saguaro% icc –openmp –o program program.c saguaro% module purge saguaro% module load gcc/4.4.4 saguaro% gcc –fopenmp –o program program.c saguaro% gfortran –fopenmp –o program program.f Multiple compiler families available, so make sure you are consistent between libraries, source and run script!

12 Batch Submission Process internet Server Head C1C2C3C4 Submission: qsub job Queue: Job Script waits for resources on Server Master: Compute Node that executes the job script, launches ALL MPI processes Launch: contact each compute node to start executable (e.g. a.out) Launch mpirun Master Queue Compute Nodes mpiexec a.out

13 Torque and Moab and Gold Torque – The resource manager ( qstat ) Moab – The scheduler ( checkjob ) Gold – The accounting system ( mybalance )

14 Batch Systems Saguaro systems use Torque for batch queuing and Moab for scheduling Batch jobs are submitted on the front end and are subsequently executed on compute nodes as resources become available. “qsub scriptname”. Charged only for the time your job runs. Interactive jobs for keyboard input/GUI’s. You will be logged into a compute node. “qsub –I”. Charged for every second connected. Order of job execution depends on a variety of parameters: –Submission Time –Backfill Opportunities: small jobs may be back-filled while waiting for bigger jobs to complete –Fairshare Priority: users who have recently used a lot of compute resources will have a lower priority than those who are submitting new jobs –Advanced Reservations: jobs my be blocked in order to accommodate advanced reservations (for example, during maintenance windows) –Number of Actively Scheduled Jobs: there are limits on the maximum number of concurrent processors used by each user

15 Commonly Used TORQUE Commands qsubSubmit a job qstatCheck on the status of jobs qdelDelete running and queued jobs qholdSuspend jobs qrlsResume jobs qalterChange job information man pages for all of these commands

16 TORQUE Batch System VariablePurpose PBS_JOBIDBatch job id PBS_JOBNAMEUser-assigned (-J) name of the job PBS_TASKNUMNumber of slots/processes for a parallel job PBS_QUEUEName of the queue the job is running in PBS_NODEFILEFilename containing list of nodes PBS_O_WORKDIRJob working directory

17 Batch System Concerns Submission (need to know) –Required Resources –Run-time Environment –Directory of Submission –Directory of Execution –Files for stdout/stderr Return – Notification Job Monitoring Job Deletion –Queued Jobs –Running Jobs

18 TORQUE: OpenMP Job Script #!/bin/bash #PBS -l nodes=1:ppn=8 #PBS -N hello #PBS -j oe #PBS -o $PBS_JOBID #PBS -l walltime=00:15:00 module load openmpi/1.7.2-intel-13.0 cd $PBS_O_WORKDIR export OMP_NUM_THREADS=8./hello # of cores Job name Join stdout and stderr Execution commands Output file nameMax Run Time (15 minutes)

19 Batch Script Debugging Suggestions Echo issuing commands –(“set -x” or “set echo” for bash or csh). Avoid absolute pathnames –Use relative path names or environment variables ($HOME, $PBS_O_WORKDIR) Abort job when a critical command fails. Print environment –Include the "env" command if your batch job doesn't execute the same as in an interactive execution. Use “./” prefix for executing commands in the current directory –The dot means to look for commands in the present working directory. Not all systems include "." in your $PATH variable. (usage:./a.out). Track your CPU time

20 Job Monitoring (showq utility) saguaro% showq ACTIVE JOBS JOBID JOBNAME USERNAME STATE PROC REMAINING STARTTIME _90_96x6 vmcalo Running 64 18:09:19 Fri Jan 9 10:43: naf phaa406 Running 16 17:51:15 Fri Jan 9 10:25: N phaa406 Running 16 18:19:12 Fri Jan 9 10:53:46 23 Active jobs 504 of 556 Processors Active (90.65%) IDLE JOBS JOBID JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME poroe8 xgai Idle :00:00 Thu Jan 8 10:17: meshconv019 bbarth Idle 16 24:00:00 Fri Jan 9 16:24:18 3 Idle jobs BLOCKED JOBS JOBID JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME _90_96x6 vmcalo Deferred 64 24:00:00 Thu Jan 8 18:09: _90_96x6 vmcalo Deferred 64 24:00:00 Thu Jan 8 18:09:11 17 Blocked jobs Total Jobs: 43 Active Jobs: 23 Idle Jobs: 3 Blocked Jobs: 17

21 Job Monitoring (qstat command) saguaro$ qstat -s a job-ID prior name user state submit/start at queue slots NAMD xxxxxxxx r 01/09/ :13: tf7M.8 xxxxxxxx r 01/09/ :36: f7aM.7 xxxxxxxx r 01/09/ :33: ch.r32 xxxxxxxx r 01/09/ :56: NAMD xxxxxxxx qw 01/09/ :23: f7aM.8 xxxxxxxx hqw 01/09/ :03: tf7M.9 xxxxxxxx hqw 01/09/ :06: Basic qstat options: -u username Display jobs belonging to specified user (\* or ‘*’ for all) -r/-f Display extended job information Also check out the qmem and qpeek utilities.

22 TORQUE Job Manipulation/Monitoring To kill a running or queued job (takes ~30 seconds to complete): qdel qdel -p (Use when qdel alone won’t delete the job) To suspend a queued job: qhold To resume a suspended job: qrls To alter job information in queue: qalter To see more information on why a job is pending: checkjob -v (Moab) To see a historical summary of a job: qstat -f (TORQUE) To see available resources: showbf (Moab)

23 System Information Saguaro is a ~5K processor Dell Linux cluster, –>45 trillion floating point operations per second –>9TB aggregate RAM –>400TB aggregate disk –Infiniband Interconnect –Located in GWC 167 Saguaro is a true *cluster* architecture –Most nodes have 8 processors and 16GB of RAM. –Programs needing more resources *must* use parallel programming –Normal, single processor applications *do not go faster* on Saguaro

24 HardwareComponentsCharacteristics Compute Nodes Dell M600 Dell M610 Dell M620 Dell M710HD Dell R900 Dell R “Harpertown” Nodes 64 “Nehalem” Nodes 32 “Westmere” nodes 32 “WestmereEP” nodes 11 “Tigerton” nodes 1 SMP node w/32 procs, 1TB RAM 2.00/2.93 GHz 8MB/20MB Cache 16/96 GB Mem/node* 8/16 cores/node* SCRATCH File System I/O Nodes Dell Filers – DataDirect 6 I/O Nodes Lustre File System 415 TB Login and Service Nodes3 login & 2 datamover nodes Assorted service nodes 2.0Ghz, 8GB mem 2.67Ghz, 148GB Interconnect (MPI) InfiniBand 24-port leafs FDR and DDR core switches DDR - FDR Fat Tree Topology Ethernet (1GE/10GE ) OpenFlow and SDN switching 10G/E to campus research ring Saguaro Summary

25 Available File Systems Scratch Home Archive NFS Compute Node Lustre All Nodes Login nodes and desktop systems Data Compellent NetApp DDN

26 File System Access & Lifetime Table Mount pointUser Access Limit Lifetime /home (NetApp)50GB quotaProject /scratch (DDN)no quota30 days /archive (NetApp)Extendable5 years /data (Compellent)Extendable

27 Interactive Mode Useful commands – qsub –I –l nodes=2 Interactive mode, reserving 2 processors – qsub –I -X Interactive mode with X forwarding – screen Detach interactive jobs (reattach with –r) – watch qstat Watch job status

28 Improving Performance Saguaro is a heterogeneous system – know what hardware you are using: qstat -f more /proc/cpuinfo /proc/meminfo echo $PBS_NODELIST CPUPart #IB data rate CPUs per node GB per node Node ids WestmereEP5670QDR1296s63 SandyBridgeE5-2650FDR1664s61-s62 Harpertown5440DDR816s23-s42 Tigerton7340DDR1664fn1-fn12* Nehalem5570DDR824s54-s57 Westmere5650Serial1248s58-s59 { on node

29 TORQUE: OpenMP Job Script II #!/bin/bash #PBS -l nodes=1:ppn=8:nehalem #PBS -lwalltime=00:15:00 #PBS -m abe #PBS -M module load intel cd $PBS_O_WORKDIR export OMP_NUM_THREADS=8 export OMP_DYNAMIC=TRUE time./hello # of cores, architecture Send mail when job aborts/begins/ends address walltime

Download ppt "Saguaro User Environment Fall 2013 Outline Get an account Linux Clusters Initial login Modules Compiling Batch System Job Monitoring System & Memory."

Similar presentations

Ads by Google