1 Getting Started with HPC On Iceberg Michael Griffiths and Deniz Savas Corporate Information and Computing Services The University of Sheffield www.sheffield.ac.uk/wrgrid

2 Outline
Review of hardware and software
Accessing
Managing Jobs
Building Applications
Resources
Getting Help

3 Job dependencies with arrays (-hold_jid)
Job dependencies allow you to specify that one job should not run until another job completes. Job dependencies are useful in the following situations:
▬ In a two-step process, where the second step depends on the results of the first.
▬ Splitting one long job into two smaller jobs helps the queue scheduler work more efficiently.
▬ Resources can be allocated to each job separately; often one step requires more or less memory than the other.
▬ To avoid clogging the queue with a large number of jobs: job dependencies can limit the number of running jobs independently of the number of jobs submitted.

4 Job dependencies with arrays (-hold_jid)
Suppose you have two scripts, step1.sh and step2.sh. You can make step2.sh dependent on step1.sh as follows:
$ qsub step1.sh
▬ Your job 12357 ("step1.sh") has been submitted
$ qsub -hold_jid 12357 step2.sh
▬ Your job 12358 ("step2.sh") has been submitted
Or, by explicitly using the job name:
$ qsub -N myjob step1.sh
$ qsub -hold_jid myjob step2.sh
You could also capture the job id of step1 to use in the step2 submission:
$ step1id=`qsub -terse step1.sh`; qsub -hold_jid $step1id step2.sh
The -terse option causes qsub to display only the job id of the job being submitted, rather than the usual "Your job..." string.
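As a worked example, the commands above can be combined into a small submission wrapper. This is a minimal sketch: step1.sh and step2.sh are the scripts named above, and pipeline_submit.sh is a hypothetical wrapper name.
#!/bin/bash
# pipeline_submit.sh - submit a two-step pipeline where step2 waits for step1.
# Assumes step1.sh and step2.sh are valid SGE job scripts in the current directory.

# -terse makes qsub print only the job id, which we capture in a variable.
step1id=$(qsub -terse step1.sh)
echo "Submitted step 1 as job $step1id"

# step2 is held until step1 has completed.
step2id=$(qsub -terse -hold_jid "$step1id" step2.sh)
echo "Submitted step 2 as job $step2id (held until $step1id finishes)"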

5 Building Applications: Overview
The operating system on iceberg provides full facilities for:
– scientific code development,
– compilation and execution of programs.
The development environment includes:
– debugging tools provided by the Portland compiler suite,
– the Eclipse IDE.

6 Compilers
PGI, GNU and Intel C and Fortran compilers are installed on iceberg. The PGI (Portland Group) compilers are available to use as soon as you log into a worker node. A suitable module command is needed to access the Intel or GNU compilers. The following compiler modules can be loaded with the module add (or module load) command:
compilers/pgi
compilers/intel
compilers/gcc
The Java compiler and the Python development environment can also be made available by loading the following modules respectively:
apps/java
apps/python
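For example, a session switching to the Intel compilers might look like the following sketch; the exact module names and versions listed by module avail will vary.
# list the modules available on the current node
module avail

# load the Intel compiler module, then check which compiler is on the PATH
module add compilers/intel
which ifort
ifort -V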

7 Building Applications: Compilers
C and Fortran programs may be compiled using the PGI, Intel or GNU compilers. The commands that invoke these compilers are summarized in the following table:
Language        PGI Compiler    Intel Compiler    GNU Compiler
C               pgcc            icc               gcc
C++             pgCC            icpc              g++
FORTRAN 77      pgf77           ifort             g77
FORTRAN 90/95   pgf90           ifort             gfortran

8 Building Applications: Compilers
All of these commands take the filename containing the source to be compiled as an argument, followed by a list of optional parameters.
Example: pgcc myhelloworld.c -o hello
The file suffix usually determines how the syntax of the source file will be treated. For example, myprogram.f will be treated as fixed-format (FTN77-style) source, whereas myprogram.f90 will be assumed to be free-format (Fortran 90-style) by the compiler.
Most compilers have a -help or --help switch that lists the available compiler options, and a -V parameter that lists the version number of the compiler you are using.

9 Help and documentation on compilers
As well as the -help or --help parameters of the compiler commands, there are man (manual) pages available for these compilers on iceberg. For example: man pgcc, man icc, man gcc
Full documentation provided with the PGI and Intel compilers is accessible in your browser from any platform via the page:
http://www.shef.ac.uk/wrgrid/software/compilers

10 Building Applications: A few Compiler Options
Option        Effect
-c            Compile only, do not link.
-o exefile    Specifies a name for the resulting executable.
-g            Produce debugging information (no optimization).
-Mbounds      Check arrays for out-of-bounds access.
-fast         Full optimisation with function unrolling and code reordering.

11 Building Applications: Compiler Options
Option            Effect
-Mvect=sse2       Turn on streaming SIMD extensions (SSE) and SSE2 instructions. SSE2 instructions operate on 64-bit floating-point data.
-Mvect=prefetch   Generate prefetch instructions.
-tp k8-64         Specify the target processor type to be an Opteron processor running a 64-bit system.
-g77libs          Link-time option allowing object files generated by g77 to be linked into programs (n.b. may cause problems with parallel libraries).
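As an illustration, the options from the two tables above might be combined on a single PGI compile line as in the following sketch; myprogram.f90 is a hypothetical source file.
# optimised build targeting an Opteron, with SSE2 vectorisation
pgf90 -fast -Mvect=sse2 -tp k8-64 -o myprogram myprogram.f90

# separate debug build with bounds checking and no optimisation
pgf90 -g -Mbounds -o myprogram_debug myprogram.f90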

12 Building Applications: Sequential Fortran
Assuming that the Fortran program source code is contained in the file mycode.f90, to compile using the Portland Group compiler type:
pgf90 mycode.f90
In this case the executable will be output into the file a.out. To run it, issue ./a.out at the UNIX prompt.
To add some optimization when using the Portland Group compiler, the -fast flag may be used. Also, -o may be used to specify the name of the compiled executable, i.e.:
pgf90 -o mycode -fast mycode.f90
The resulting executable will have the name mycode and will have been optimized by the compiler.

13 Building Applications: Sequential C
Assuming that the program source code is contained in the file mycode.c, to compile using the Portland C compiler, type:
pgcc -o mycode mycode.c
In this case, the executable will be output into the file mycode, which can be run by typing its name at the command prompt: ./mycode

14 Memory Issues
Programs requiring more than 2 Gigabytes of memory for their data (i.e. using very large arrays, etc.) may run into difficulties due to addressing issues when pointers cannot hold such large address values.
It is also advisable that variables that store and use array indices have a sufficient number of bytes allocated to them. For example, it is not wise to use short int (C) or integer*2 (Fortran) for variables holding array indices; such variables should be declared as long int or integer*4.
To avoid such problems:
when using the PGI compilers, use the option -mcmodel=medium
when using the Intel compilers, use the options -mcmodel=medium -shared-intel
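For example, a large-memory Fortran build might be compiled as in the following sketch; bigarrays.f90 is a hypothetical source file.
# PGI: allow static data larger than 2 GB
pgf90 -mcmodel=medium -o bigarrays bigarrays.f90

# Intel: equivalent build, linking the shared Intel runtime libraries
ifort -mcmodel=medium -shared-intel -o bigarrays bigarrays.f90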

15 Setting other resource limits: ulimit
ulimit provides control over the resources available to processes:
▬ ulimit -a   report all current resource limits
▬ ulimit -s XXXXX   set the maximum stack size
Sometimes it is necessary to set the hard limit, e.g.
▬ ulimit -sH XXXXXX
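A typical session raising the stack limit before an OpenMP run might look like this sketch; the value 16384 is purely illustrative.
# show current limits (the stack size is reported in kilobytes)
ulimit -a

# raise the soft stack limit to 16 MB for this shell and its children
ulimit -s 16384

# raise the hard limit as well, if permitted
ulimit -sH 16384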

16 Useful Links for Memory Issues
64-bit programming memory issues
▬ http://www.ualberta.ca/CNS/RESEARCH/LinuxClusters/64-bit.html
Understanding Memory
▬ http://www.ualberta.ca/CNS/RESEARCH/LinuxClusters/mem.html

17 Checkpointing Jobs
Simplest method for checkpointing
▬ Ensure that applications save configurations at regular intervals so that jobs may be restarted (if necessary) using these configuration files.
Using the BLCR checkpointing environment
▬ BLCR commands
▬ Using BLCR checkpointing with an SGE job
Help on checkpointing
▬ https://upc-bugs.lbl.gov/blcr/doc/html/BLCR_Users_Guide.html

18 Checkpointing jobs: Using BLCR
BLCR commands relating to checkpointing: cr_run, cr_checkpoint, cr_restart
Set an environment variable to avoid an error:
export LIBCR_DISABLE_NSCD=1
Start running the code under the control of the checkpoint system:
cr_run myexecutable [parameters]
Find out its process id (PID):
ps | grep myexecutable
Checkpoint it and write its state into a file:
cr_checkpoint -f checkpoint.file PID
If and when the executable fails, crashes, runs out of time etc., it can now be restarted from the checkpoint file you specified:
cr_restart checkpoint.file
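The steps above might be combined into a small helper script along the following lines. This is only a sketch: myexecutable and checkpoint.file are the placeholder names used above, and the one-hour interval is purely illustrative.
#!/bin/bash
# Minimal BLCR checkpointing sketch: start a program under cr_run,
# then take a checkpoint of it after a fixed interval.

export LIBCR_DISABLE_NSCD=1

# start the program in the background under BLCR control and record its PID
cr_run ./myexecutable &
pid=$!

# let it run for an hour, then write its state to a checkpoint file
sleep 3600
cr_checkpoint -f checkpoint.file $pid

# later, the run can be resumed with: cr_restart checkpoint.file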

19 Using BLCR checkpointing with an SGE Job
A checkpoint environment called blcr has been set up; it is accessible using the test queue cstest.q. An example of a checkpointing job script would look something like:
#!/bin/bash
#$ -l h_rt=168:00:00
#$ -c sx
#$ -ckpt blcr
cr_run ./executable >> output.file
The -c hh:mm:ss option tells SGE to checkpoint at the specified time interval. The -c sx option tells SGE to checkpoint if the queue is suspended, or if the execution daemon is killed.

20 Restart a checkpointed job
Create a new job script with the same options, but use the cr_restart command to resume the job:
#!/bin/bash
#$ -l h_rt=165:50:00
[..... any other normal options...]
#$ -ckpt blcr
#$ -c sx
cr_restart checkpoint.[jobId].[Pid]
replacing [jobId] and [Pid] with the values for your checkpoint file. Each time the job ends, a new checkpoint file will be generated, and you can then use the new checkpoint file to resubmit the job.

21 Getting Help
The wrgrid website https://www.sheffield.ac.uk/wrgrid
▬ How to use https://www.sheffield.ac.uk/wrgrid/using
▬ Software https://www.sheffield.ac.uk/wrgrid/software
▬ Data Management https://www.sheffield.ac.uk/wrgrid/data
▬ FAQs https://www.sheffield.ac.uk/wrgrid/questionanswer
▬ News and Events https://www.sheffield.ac.uk/wrgrid/events
▬ Training https://www.sheffield.ac.uk/wrgrid/training
Contacts https://www.sheffield.ac.uk/wrgrid/contacts
▬ CICS Helpdesk
▬ Iceberg admins

22 Building Applications 8: Debugging
The Portland Group debugger is a symbolic debugger for Fortran, C and C++ programs.
It allows the control of program execution using:
– breakpoints,
– single stepping,
and enables the state of a program to be checked by examination of:
– variables
– and memory locations.

23 Building Applications 9: Debugging
The PGDBG debugger is invoked using the pgdbg command as follows:
pgdbg arguments program arg1 arg2 ... argn
– arguments may be any of the pgdbg command line arguments,
– program is the name of the target program being debugged,
– arg1, arg2, ... argn are the arguments to the program.
To get help from pgdbg use: pgdbg -help

24 Building Applications 10: Debugging
PGDBG GUI
– invoked by default using the command pgdbg.
– Note that in order to use the debugging tools, applications must be compiled with the -g switch, thus enabling the generation of symbolic debugger information.
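Putting the last three slides together, a typical debugging session might start like the following sketch; mycode.f90 and input.dat are hypothetical file names.
# build with symbolic debug information and no optimisation
pgf90 -g -o mycode mycode.f90

# launch the PGI debugger on the executable, passing the program its argument
pgdbg ./mycode input.dat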

25 Building Applications 11: Profiling
The PGPROF profiler enables the profiling of:
– single-process programs,
– multi-process MPI or SMP OpenMP programs, or
– programs compiled with the -Mconcur option.
The generated profiling information enables the identification of the portions of the application that will benefit most from performance tuning.
Profiling generally involves three stages:
– compilation
– execution
– analysis (using the profiler)

26 Building Applications 12: Profiling
To use profiling it is necessary to compile your program with the options indicated in the table below:
Option         Effect
-Mprof=func    Insert calls to produce function-level pgprof output.
-Mprof=lines   Insert calls to produce line-level pgprof output.
-Mprof=mpi     Link in an MPI profile library that intercepts MPI calls to record message sizes and count message sends and receives, e.g. -Mprof=mpi,func.
-pg            Enable sample-based profiling.

27 Building Applications 13: Profiling
The PGI profiler is executed using the command:
pgprof [options] [datafile]
– datafile is a pgprof.out file generated by the program execution.
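An end-to-end profiling run might therefore look like the following sketch; mycode.f90 is a hypothetical source file.
# compile with line-level profiling instrumentation
pgf90 -Mprof=lines -o mycode mycode.f90

# run the program; this writes profile data to pgprof.out
./mycode

# analyse the results in the PGPROF profiler
pgprof pgprof.out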

28 Shared Memory applications using OpenMP
Fortran and C programs containing OpenMP compiler directives can be compiled to take advantage of parallel processing on iceberg.
The OpenMP programming model uses a thread model, whereby a number of instances ("threads") of a program run simultaneously, communicating with each other when necessary via memory that is shared by all the threads. Although any given processor can run multiple threads of the same program via the operating system's multi-tasking ability, it is more efficient to allocate one thread per processor in a shared memory machine.
On iceberg we have the following types of compute node:
nodes with 2 dual-core AMD processors (2*2 = 4 cores)
nodes with 2 quad-core AMD processors (2*4 = 8 cores)
nodes with 2 six-core Intel processors (2*6 = 12 cores)
Therefore it is usually advisable to restrict OpenMP jobs to about 12 threads when using iceberg.

29 Shared Memory Applications: Compiling OpenMP applications
Source code that contains OpenMP directives ($OMP pragmas) for parallel programming can be compiled using the following flags:
– PGI C, C++, Fortran 77 or Fortran 90:
pgf77, pgf90, pgcc or pgCC -mp [other options] filename
– Intel C/C++, Fortran:
ifort, icc or icpc -openmp [other options] filename
– GNU C/C++, Fortran:
gcc or gfortran -fopenmp [other options] filename
Note that compiling the source code does not require working within a job using the openmp environment. Only the execution of an OpenMP parallel executable requires such an environment, which is requested by passing the -pe openmp flag to the qsub or qsh commands.

30 Shared Memory Applications: Specifying the Required Number of Threads
The number of parallel execution threads at run time is controlled by setting the environment variable OMP_NUM_THREADS to the appropriate value.
For the bash or sh shell (which is the default shell on iceberg), use:
export OMP_NUM_THREADS=6
If you are using the csh or tcsh shell, use:
setenv OMP_NUM_THREADS 6
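For example, a complete compile-and-run sequence for an OpenMP code might look like this sketch (bash syntax, with myomp.c as a hypothetical source file):
# compile an OpenMP C program with the PGI compiler
pgcc -mp -o myomp myomp.c

# request 6 threads and run the executable
export OMP_NUM_THREADS=6
./myomp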

31 Shared Memory Applications: Starting an OpenMP interactive job
Short interactive jobs that use OpenMP parallel programming are allowed. Although up to 48-way parallel jobs can theoretically be run in this way, due to the high utilisation of the cluster we recommend that you do not exceed 12-way jobs. Here is an example of starting a 12-way interactive job:
qsh -pe openmp 12   or   qrsh -pe openmp 12
and in the new shell that starts, type:
export OMP_NUM_THREADS=12
Alternatively, the effect of these two commands can be achieved via the -v parameter, e.g.:
qsh -pe openmp 12 -v OMP_NUM_THREADS=12
The number of threads to use can later be redefined in the same job, for example to experiment with hyper-threading.
Important note: although the number of processors required is specified with the -pe option, it is still necessary to ensure that the OMP_NUM_THREADS environment variable is set to the correct value.

32 Shared Memory Applications: Submitting an OpenMP Job to Sun Grid Engine
The job is submitted to a special parallel environment that ensures the job occupies the required number of slots. Using the SGE command qsub, the openmp parallel environment is requested using the -pe option as follows:
qsub -pe openmp 12 -v OMP_NUM_THREADS=12 myjobfile.sh
Alternatively, the following job script, job.sh, is submitted using qsub job.sh, where job.sh is:
#!/bin/bash
#$ -cwd
#$ -pe openmp 12
#$ -v OMP_NUM_THREADS=12
./executable

33 Parallel Programming with MPI: Introduction
Iceberg is designed with the aim of running MPI (Message Passing Interface) parallel jobs, and the Sun Grid Engine is able to handle MPI jobs.
In a message passing parallel program each process executes the same binary code but:
– executes a different path through the code
– this is SPMD (single program multiple data) execution.
Iceberg uses:
– the openmpi-ib and mvapich2-ib implementations provided over InfiniBand (Qlogic/ConnectX), using the IB fast interconnect at 32 Gigabits/second.

34 MPI Tutorials
From an iceberg worker node, execute the following command:
tar -zxvf /usr/local/courses/intrompi.tgz
The directory which this creates contains some sample MPI applications which you may compile and run.

35 Iceberg Hardware for Parallel Computing
23 Sun X2200 nodes (parallel nodes), each with 8 cores and 32 GB of RAM
Switch
o Qlogic SilverStorm 9024 24-port InfiniBand switch
Cards
o ConnectX IB HCA card, single port 20Gb/s InfiniBand, PCIe 2.0

36 InfiniBand specifications
High data rates of up to 1880 MBits/s
ConnectX IB HCA card, single port 16Gb/s InfiniBand
Low latency of ~1 µs (Gigabit Ethernet is of the order of 100 µs)
SilverStorm 24-port InfiniBand DDR switch

37 Set The Correct Environment for MPI
For batch jobs the environment is normally set in the Makefile or job script.
See the script file mpienv.sh (in the intrompi directory).
Set the correct environment by pasting it into the user's .bashrc file, or set the environment by typing: source mpienv.sh
Use modules:
– show the available MPI environments and compilers: module avail
– load a module, e.g. module add mpi/intel/openmpi/1.6.4
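A typical sequence for setting up the MPI environment with modules might look like this sketch; the module name is the example given above, and the names listed by module avail on your node may differ.
# see which MPI and compiler modules are installed
module avail

# load an OpenMPI build for the Intel compilers
module add mpi/intel/openmpi/1.6.4

# confirm that the MPI compiler wrappers are now on the PATH
which mpicc mpif90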

38 Environments for MPI on Iceberg
OpenMPI with gigabit ethernet
OpenMPI with InfiniBand
MVAPICH2 with InfiniBand, launched with:
mpirun_rsh -rsh -np $NSLOTS -hostfile $TMPDIR/machines ./executable
NOTE: there are environments for the GNU, PGI and Intel compilers.

39 Parallel Programming with MPI 2: Hello MPI World!
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank;  /* my rank in MPI_COMM_WORLD */
    int size;  /* size of MPI_COMM_WORLD */

    /* Always initialise MPI by this call before using any MPI functions. */
    MPI_Init(&argc, &argv);

    /* Find out how many processes are taking part in the computations. */
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Get the rank of the current process. */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
        printf("Hello MPI world from C!\n");
    printf("There are %d processes in my world, and I have rank %d\n", size, rank);

    MPI_Finalize();
    return 0;
}

40 Parallel Programming with MPI: Output from Hello MPI World!
When run on 4 processors the MPI Hello World program produces output such as the following (the ordering of the lines may vary between runs):
Hello MPI world from C!
There are 4 processes in my world, and I have rank 2
There are 4 processes in my world, and I have rank 0
There are 4 processes in my world, and I have rank 3
There are 4 processes in my world, and I have rank 1

41 Parallel Programming with MPI: Compiling MPI Applications Using InfiniBand
To compile C, C++, Fortran 77 or Fortran 90 MPI code using the Portland compiler, type one of:
mpif77 [compiler options] filename
mpif90 [compiler options] filename
mpicc [compiler options] filename
mpiCC [compiler options] filename
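For example, the Hello MPI World program from slide 39 might be built and submitted as in the following sketch; hello_mpi.c and hello_mpi.sh are hypothetical names for the source file and for a job script along the lines of slide 44.
# compile the example with the MPI C compiler wrapper
mpicc -o hello_mpi hello_mpi.c

# submit it on 4 slots via the InfiniBand OpenMPI parallel environment,
# where hello_mpi.sh runs ./hello_mpi through mpirun
qsub -pe openmpi-ib 4 hello_mpi.sh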

42 Parallel Programming with MPI: Compiling MPI Applications Using Gigabit Ethernet on X2200s
To compile C, C++, Fortran 77 or Fortran 90 MPI code using the Portland compiler with OpenMPI, type:
export MPI_HOME="/usr/local/packages5/openmpi-pgi/bin"
$MPI_HOME/mpif77 [compiler options] filename
$MPI_HOME/mpif90 [compiler options] filename
$MPI_HOME/mpicc [compiler options] filename
$MPI_HOME/mpiCC [compiler options] filename

43 Parallel Programming with MPI: Submitting an MPI Job to Sun Grid Engine
To submit an MPI job to Sun Grid Engine:
– use the openmpi-ib parallel environment,
– this ensures that the job occupies the required number of slots.
Using the SGE command qsub, the openmpi-ib parallel environment is requested using the -pe option as follows:
qsub -pe openmpi-ib 4 myjobfile.sh

44 Parallel Programming with MPI: Sun Grid Engine MPI Job Script
The following job script, job.sh, is submitted using qsub job.sh, where job.sh is:
#!/bin/sh
#$ -cwd
#$ -pe openmpi-ib 4
# SGE_HOME is used to locate the SGE MPI execution script
#$ -v SGE_HOME=/usr/local/sge6_2
/usr/mpi/pgi/openmpi-1.2.8/bin/mpirun ./mpiexecutable

45 Parallel Programming with MPI: Sun Grid Engine MVAPICH2 Job Script
To use the MVAPICH2 launcher directly, the job is submitted using qsub in the same way, but the script file job.sh is:
#!/bin/sh
#$ -cwd
#$ -pe mvapich2-ib 4
# MPIR_HOME is passed in from the submitting environment
#$ -v MPIR_HOME=/usr/mpi/pgi/mvapich2-1.2p1
$MPIR_HOME/bin/mpirun_rsh -rsh -np 4 -hostfile $TMPDIR/machines ./mpiexecutable
replacing [jobId] and [Pid] as appropriate is not needed here; simply replace ./mpiexecutable with the name of your MPI executable.

46 Parallel Programming with MPI: Sun Grid Engine OpenMPI (Gigabit Ethernet) Job Script
To use the Gigabit Ethernet OpenMPI launcher directly, the job is submitted using qsub in the same way, but the script file job.sh is:
#!/bin/sh
#$ -cwd
#$ -pe ompigige 4
# MPIR_HOME is passed in from the submitting environment
#$ -v MPIR_HOME=/usr/local/packages5/openmpi-pgi
$MPIR_HOME/bin/mpirun -np 4 -machinefile $TMPDIR/machines ./mpiexecutable

47 Parallel Programming with MPI 10: Extra Notes
The number of slots required and the parallel environment must be specified using -pe openmpi-ib NSLOTS.
The job must be executed using the correct PGI/Intel/GNU implementation of mpirun.
Note also:
– the number of processes is specified using -np NSLOTS,
– specify the location of the machinefile used for your parallel job; this will be located in a temporary area on the node that SGE submits the job to.

48 Parallel Programming with MPI 10: Pros and Cons
The downside of message passing codes is that they are harder to write than scalar or shared memory codes.
– The system bus on a modern CPU can pass in excess of 4 Gbits/sec between memory and the CPU.
– A fast ethernet between PCs may only pass up to 200 Mbits/sec between machines over a single ethernet cable, and this can be a potential bottleneck when passing data between compute nodes.
The solution to this problem for a high performance cluster such as iceberg is to use a high performance network, such as the 16 Gbit/sec interconnect provided by InfiniBand.
– The availability of such high performance networking makes a scalable parallel machine possible.

