Slide 1: TTU High Performance Computing User Training: Part 2
Srirangam Addepalli and David Chaffin, Ph.D.

Advanced Session Outline:
- Cluster Architecture
- File System and Storage

Lectures with Labs:
- Advanced Batch Jobs
- Compilers/Libraries/Optimization
- Compiling/Running Parallel Jobs
- Grid Computing

Slide 2: HPCC Clusters
- hrothgar: 128 dual-processor 64-bit Xeons, 3.2 GHz, 4 GB memory, InfiniBand and Gigabit Ethernet, CentOS 4.3 (Red Hat)
- community cluster: 64 nodes, part of hrothgar, same configuration except no InfiniBand; owned by faculty members and controlled by batch queues
- minigar: 20 nodes, 3.6 GHz, InfiniBand, for development; opening soon
- Physics grid machine on order: some nodes will be available
- poseidon: Opteron, 3 nodes, PathScale compilers
- several retired, test, and grid systems

Slide 3: Cluster Performance
Main factors:
1. Individual node performance, of course. SPECfp_rate2000 (www.spec.org) matches our applications well. The newest dual-core systems have 2x the cores and ~1.5x the performance per core, for about 3x the performance per node vs. hrothgar.
2. Fabric latency (delay time of one message, in microseconds: IB = 6, GigE = 40)
3. Fabric bandwidth (MB/s: IB = 600, GigE = 60)
Intel has the better CPU right now; AMD has better shared-memory performance. Overall they are about equal.

Slide 4: Cluster Architecture
An application example where the system is limited by interconnect performance: GROMACS, measured as simulation time completed per unit of real time.
- hrothgar, 8 nodes, Gigabit Ethernet: ~1200 ns/day
- hrothgar, 8 nodes, InfiniBand: ~2800 ns/day
Current dual-core systems have 3x the serial throughput of hrothgar, and quad-core systems are coming next year. They need more interconnect bandwidth: in the future, Gigabit Ethernet will be suitable only for serial jobs.

Slide 5: Cluster Usage
- ssh to hrothgar
- scp files to hrothgar
- compile on hrothgar
- run on the compute nodes (only), using the LSF batch system (only)
- example files: /home/shared/examples/
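
A minimal end-to-end session following these steps might look like the sketch below; the hostname, file names, and user name are placeholders, not files from the examples directory.

    # from your workstation: copy a source file over and log in
    scp serial.c elvis@hrothgar.hpcc.ttu.edu:
    ssh elvis@hrothgar.hpcc.ttu.edu

    # on hrothgar: compile on the head node
    icc -O serial.c -o serial

    # run on the compute nodes (only) through LSF, then watch the job
    bsub -o serial.%J.out -e serial.%J.err ./serial
    bjobs -w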

Slide 6: Some Useful LSF Commands
- bjobs -w           (-w for wide shows the full node name)
- bjobs -l [job#]    (-l for long shows everything)
- bqueues [-l]       shows queues [everything]
- bhist [job#]       job history
- bpeek [job#]       stdout/stderr stored by LSF
- bkill job#         kill it

    bash-3.00$ /home/shared/bin/check-hosts-batch.sh
    hrothgar, 2 free=0 nodes, 0 cpus
    hrothgar, 1 free=3 nodes, 3 cpus
    hrothgar, 0 free=125 nodes
    hrothgar, offline=0 nodes
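
A typical sequence after submitting a job might look like the following; the job ID 1234 and the script name are illustrative only.

    bsub < myjob.sh     # submit; LSF prints the job number
    bjobs -w            # is it pending or running, and on which nodes?
    bpeek 1234          # look at the stdout/stderr collected so far
    bhist 1234          # what has the scheduler done with the job?
    bkill 1234          # kill it if something went wrong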

Slide 7: Batch Queues on hrothgar

    bqueues
    QUEUE_NAME     PRIO STATUS  MAX JL/U JL/P JL/H NJOBS PEND  RUN
    short            35 Open     56   56    -    -     0    0    0
    parallel         35 Open    224   40    -    -   108    0  108
    serial           30 Open    156   60    -    -   204  140   64
    parallel_long    25 Open    256   64    -    -    16    0   16
    idle             20 Open    256  256    -    -   100    0   55

Every 30 seconds the scheduler cycles through the queued jobs. A job starts if:
(1) nodes are available (free, or running only idle-queue jobs)
(2) the user's CPUs are below the per-user queue limit (the "bqueues" JL/U column)
(3) the queue's CPUs are below the total queue limit (the "bqueues" MAX column)
(4) it is in the highest-priority queue (short, parallel, serial, parallel_long, idle)
(5) fair share: the user with the smallest current usage goes first
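
To target a particular queue and see its limits, the queue name can be given explicitly; the queue names below come from the table above, and the 8-CPU request is just an example.

    bqueues -l parallel                     # full description of the parallel queue
    bsub -q parallel -n 8 < mpi-basic.sh    # submit an 8-CPU job to the parallel queue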

Slide 8: Unix/Linux Compiling: Common Features

    [compiler] [options] [source files] [linker options]

(PathScale is available only on poseidon.)
- C compilers: gcc, icc, pathcc
- C++: g++, icpc, pathCC
- Fortran: g77, ifort, pathf90
- Options: -O (optimize), -o outputfilename
- Source files: new.f or *.f or *.c
- Linker options: to link with libx.a or libx.so in /home/elvis/lib, use -L/home/elvis/lib -lx
- Many programs need -lm or -pthread
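
Putting those pieces together, a hypothetical compile of new.c that calls the math library and links a user library might look like this (the file and library names are taken from the examples above, not a real build):

    # optimize, name the executable "new", search /home/elvis/lib for libx,
    # and link the math library
    icc -O new.c -o new -L/home/elvis/lib -lx -lm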

Slide 9: MPI Compile: PATH

    . /home/shared/examples/new-bashrc          [using bash]
    source /home/shared/examples/new-cshrc      [using tcsh]

    hrothgar:dchaffin:dchaffin $ echo $PATH
    /sbin:/bin:/usr/bin:/usr/sbin:/usr/X11R6/bin:\
    /usr/share/bin:/opt/rocks/bin:/opt/rocks/sbin:\
    /opt/lsfhpc/6.2/linux2.6-glibc2.3-x86_64/bin:\
    /opt/intel/fce/9.0/bin:/opt/intel/cce/9.0/bin:\
    /share/apps/mpich/IB-icc-ifort-64/bin:\
    /opt/lsfhpc/6.2/linux2.6-glibc2.3-x86_64/bin

MPICH builds exist for IB or GE, with icc or gcc or pathcc, and ifort or g77 or pathf90.
mpicc/mpif77/mpif90/mpiCC must match mpirun!
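
One way to check that the compiler wrapper and mpirun come from the same MPICH build is to ask the shell where it finds them; the expected output below assumes the IB/icc/ifort build is first in PATH, as in the listing above.

    which mpicc mpirun
    # both should resolve to the same build directory, e.g.
    #   /share/apps/mpich/IB-icc-ifort-64/bin/mpicc
    #   /share/apps/mpich/IB-icc-ifort-64/bin/mpirun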

Slide 10: MPI Compile/Run

    cp /home/shared/examples/mpi-basic.sh .
    cp /home/shared/examples/cpi.c .
    /opt/mpich/gnu/bin/mpicc cpi.c                    [or]
    /share/apps/mpich/IB-icc-ifort-64/bin/mpicc cpi.c
    vi mpi-basic.sh

In mpi-basic.sh: adjust the ptile, comment out the mpirun that you are not using (either IB or the default), and optionally change the executable name.

    bsub < mpi-basic.sh

produces:
- job#.out       LSF output
- job#.pgm.out   mpirun output
- job#.err       LSF stderr
- job#.pgm.err   mpirun stderr
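
The real mpi-basic.sh is in the examples directory; purely as a rough idea of its shape, a generic LSF script for an MPICH job might look like the sketch below. The #BSUB values, host-file handling, and output names are assumptions, not the script's actual contents.

    #!/bin/bash
    #BSUB -J cpi                    # job name
    #BSUB -n 8                      # total MPI processes
    #BSUB -R "span[ptile=2]"        # processes per node (the "ptile")
    #BSUB -o %J.out                 # LSF stdout
    #BSUB -e %J.err                 # LSF stderr
    # LSF lists the allocated hosts in $LSB_HOSTS; turn it into an MPICH machinefile
    echo $LSB_HOSTS | tr ' ' '\n' > hosts.$LSB_JOBID
    # use the mpirun that matches the mpicc used to build a.out (IB build shown)
    /share/apps/mpich/IB-icc-ifort-64/bin/mpirun -np 8 -machinefile hosts.$LSB_JOBID \
        ./a.out > $LSB_JOBID.pgm.out 2> $LSB_JOBID.pgm.err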

Slide 11: Exercise/Homework
Run the MPI benchmark on InfiniBand, Ethernet, and shared memory. Compare latency and bandwidth. Research and briefly discuss the reasons for the performance:
- hardware bandwidth (look it up)
- software layers (OS, interrupts, MPI, one-sided copy, two-sided copy)

Hardware:
- Topspin InfiniBand SDR, PCI-X
- Xeon Nocona shared memory
- Intel Gigabit, on board

Program: /home/shared/examples/mpilc.c or equivalent
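
One possible way to set up the three runs is sketched below; the GE build path, queue choice, and ptile values are assumptions, and the job script still has to be edited to launch the benchmark binary.

    # InfiniBand: IB build, 2 processes on 2 different nodes (ptile=1)
    /share/apps/mpich/IB-icc-ifort-64/bin/mpicc -O /home/shared/examples/mpilc.c -o mpilc.ib
    bsub -q parallel -n 2 -R "span[ptile=1]" < mpi-basic.sh

    # Gigabit Ethernet: GE/default build (path is illustrative), again across 2 nodes
    /opt/mpich/gnu/bin/mpicc -O /home/shared/examples/mpilc.c -o mpilc.ge
    bsub -q parallel -n 2 -R "span[ptile=1]" < mpi-basic.sh

    # Shared memory: both processes on one node (ptile=2), so messages never cross the fabric
    bsub -q parallel -n 2 -R "span[ptile=2]" < mpi-basic.sh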

