1 Getting Started on Topsail
Charles Davis, ITS Research Computing, April 8, 2009

2 Outline
 History of Topsail
 Structure of Topsail
 File Systems on Topsail
 Compiling on Topsail
 Topsail and LSF

3 Initial Topsail Cluster
 Initially a 1,040-CPU Dell Linux cluster: 520 dual-socket, single-core nodes
 Infiniband interconnect
 Intended for capability research
 Housed in the ITS Franklin machine room
 Fast and efficient for large computational jobs

4 Topsail Upgrade 1
 Topsail upgraded to 4,160 CPUs: blades replaced with dual-socket, quad-core blades
 Intel Xeon 5345 (Clovertown) processors: quad-core, 8 CPUs/node
 More processors, but lower individual processor speed (was 3.6 GHz, now 2.33 GHz)
 Decreased energy usage and reduced cooling requirements
 Summary: slower clock speed, better memory bandwidth, less heat
  - Benchmarks tend to run at the same speed per core
  - Topsail shows a net ~4x improvement
  - Of course, this number is VERY application dependent

5 Topsail: Upgraded Blades
 52 chassis: basis of node names
  - Each holds 10 blades, for 520 blades total
  - Nodes = cmp-chassis#-blade#
 Old compute blades: Dell PowerEdge 1855
  - 2 single-core Intel Xeon EM64T 3.6 GHz processors
  - 800 MHz FSB, 2 MB L2 cache per socket
  - Intel NetBurst microarchitecture
 New compute blades: Dell PowerEdge 1955
  - 2 quad-core Intel 2.33 GHz processors
  - 1333 MHz FSB, 4 MB L2 cache per socket
  - Intel Core 2 microarchitecture

6 Topsail Upgrade 2
 Most recent Topsail upgrade
 Refreshed much of the infrastructure
 Improved IBRIX filesystem
 Replaced and improved Infiniband cabling
 Moved the cluster to the ITS-Manning building (better cooling and UPS)

7 Current Topsail Architecture
 Login node: 8 CPUs @ 2.3 GHz Intel EM64T, 12 GB memory
 Compute nodes: 4,160 CPUs @ 2.3 GHz Intel EM64T, 12 GB memory
 Shared disk: 39 TB IBRIX parallel file system
 Interconnect: Infiniband 4x SDR
 64-bit Linux operating system

8 Multi-Core Computing
 Processor structure on Topsail:
  - 500+ nodes
  - 2 sockets/node
  - 1 processor/socket
  - 4 cores/processor (quad-core)
  - 8 cores/node
 http://www.tomshardware.com/2006/12/06/quad-core-xeon-clovertown-rolls-into-dp-servers/page3.html
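The core counts above multiply out to the cluster's total CPU count, and can be sanity-checked directly in the shell (the 520-blade figure comes from the chassis slide earlier):

```shell
# 2 sockets/node x 4 cores/socket = cores per node
cores_per_node=$((2 * 4))
echo "$cores_per_node"              # prints 8

# 520 blades x 8 cores/node = total CPU count for the cluster
echo $((520 * cores_per_node))      # prints 4160
```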

9 Multi-Core Computing
 The trend in high-performance computing is toward multi-core or many-core computing.
 More cores at slower clock speeds for less heat.
 Dual- and quad-core processors are now becoming common.
 Soon 64+ core processors will be common, and these may be heterogeneous!

10 The Heat Problem (taken from Jack Dongarra, UT)

11 More Parallelism (taken from Jack Dongarra, UT)

12 Infiniband Connections
 Connections come in single (SDR), double (DDR), and quad (QDR) data rates. Topsail is SDR.
 Single data rate is 2.5 Gbit/s in each direction per link.
 Links can be aggregated 1x, 4x, or 12x. Topsail is 4x.
 Links use 8b/10b encoding: 10 bits carry 8 bits of data, so the useful data transmission rate is four-fifths of the raw rate. Thus single, double, and quad data rates carry 2, 4, or 8 Gbit/s of data per 1x link, respectively.
 The data rate for Topsail is therefore 8 Gbit/s (4x SDR).
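The encoding arithmetic above works out as follows (integer math in tenths of a Gbit/s, to avoid floating point in the shell):

```shell
# 4x aggregation of 2.5 Gbit/s SDR links, in tenths of Gbit/s
raw_tenths=$((4 * 25))                  # 100 tenths = 10 Gbit/s raw
# 8b/10b encoding: 8 data bits per 10 raw bits
data_tenths=$((raw_tenths * 8 / 10))    # 80 tenths = 8 Gbit/s of data
echo "raw: $((raw_tenths / 10)) Gbit/s, data: $((data_tenths / 10)) Gbit/s"
```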

13 Topsail Network Topology

14 Infiniband Benchmarks
 Point-to-point (PTP) intranode communication on Topsail for various MPI send types
 Peak bandwidth: 1288 MB/s
 Minimum latency (1-way): 3.6 μs

15 Infiniband Benchmarks
 Scaled aggregate bandwidth for MPI broadcast on Topsail
 Note the good scaling throughout the tested range (24-1536 cores)

16 Login to Topsail
 Use ssh to connect: ssh topsail.unc.edu
 On Windows, use SSH Secure Shell
 For interactive programs with X Windows display forwarding:
  - ssh -X topsail.unc.edu
  - ssh -Y topsail.unc.edu
 Off-campus users (i.e., domains outside unc.edu) must use a VPN connection

17 Topsail File Systems
 39 TB IBRIX parallel file system
 Split into home and scratch space
  - Home: /ifs1/home/my_onyen
  - Scratch: /ifs1/scr/my_onyen
 Mass storage
 Only home is backed up

18 File System Limits
 500 GB total limit per user
 Home: 5 GB limit, for backups
 Scratch: no limit except the 500 GB total
  - Not backed up
  - Periodically cleaned
 No installed packages/programs

19 Compiling on Topsail
 Modules
 Serial programming
  - Intel Compiler Suite for Fortran 77, Fortran 90, C, and C++ (recommended by Research Computing)
  - GNU
 Parallel programming
  - MPI
  - OpenMP
    Must use the Intel Compiler Suite
    Compiler flag: -openmp
    Must set OMP_NUM_THREADS in the submission script
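A minimal sketch of the OpenMP workflow described above. The compile line requires the Intel compiler to be loaded via the module system, so it is shown commented out here; myprog is a placeholder name:

```shell
# In the submission script, size the thread count to the node
# (8 cores per node on Topsail):
export OMP_NUM_THREADS=8

# Hypothetical compile step (needs the Intel module loaded first):
# icc -openmp -o myprog myprog.c

echo "$OMP_NUM_THREADS"    # confirm the setting the program will see
```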

20 Compiling: Modules
 Module commands:
  - module – list the available module commands
  - module avail – list available modules
  - module add – load a module for the current session
  - module list – list the modules currently loaded
  - module clear – unload modules from the current session
 Add modules permanently using your startup files

21 Available Compilers
 Intel: ifort, icc, icpc
 GNU: gcc, g++, gfortran
 Libraries: BLAS/LAPACK
 MPI: mpicc/mpiCC, mpif77/mpif90
 mpixx is just a wrapper around the Intel or GNU compiler
  - Adds the location of MPI libraries and include files
  - Provided as a convenience

22 Test MPI Compile
 Copy cpi.c to your scratch directory: cp /ifs1/scr/cdavis/Topsail/cpi.c /ifs1/scr/my_onyen/.
 Add the Intel MPI module: module load hpc/mvapich-intel
 Confirm the module is loaded: which mpicc
 Compile the code: mpicc -o cpi cpi.c

23 MPI/OpenMP Training
 Courses are taught throughout the year by Research Computing: http://learnit.unc.edu/workshops
 Next courses:
  - MPI – Spring
  - OpenMP – Spring

24 Running Programs on Topsail
 Upon ssh to Topsail, you are on the login node.
 Programs SHOULD NOT be run on the login node.
 Submit programs to the 4,160 compute CPUs instead.
 Submit jobs using the Load Sharing Facility (LSF).

25 Job Scheduling Systems
 Allocate compute nodes to job submissions based on user priority, requested resources, execution time, etc.
 Many types of schedulers:
  - Load Sharing Facility (LSF) – used by Topsail
  - IBM LoadLeveler
  - Portable Batch System (PBS)
  - Sun Grid Engine (SGE)

26 Load Sharing Facility (LSF)
[Diagram: a bsub job submission flows from the submission host (LIM, batch API) through the master host (MLIM, MBD, queue) to an execution host (SBD, child SBD, LIM, RES, user job), with load information gathered from the other hosts.]
 LIM – Load Information Manager
 MLIM – Master LIM
 MBD – Master Batch Daemon
 SBD – Slave Batch Daemon
 RES – Remote Execution Server

27 Submitting a Job to LSF
 For a compiled MPI job: bsub -n <number_of_cpus> -o out.%J -e err.%J -a mvapich mpirun ./mycode
 bsub – the LSF command that submits a job to the compute nodes
 bsub -o and bsub -e: job output and errors are saved to files in the submission directory

28 Queue System on Topsail
 Topsail uses queues to distribute jobs.
 Specify a queue with -q in bsub: bsub -q week ...
 No -q specified = default queue (week)
 Queues vary by job size and required run time
 See the list of queues with: bqueues

29 Topsail Queues

Queue     Time Limit   Jobs/User   CPU Range
int       2 hrs        128         ---
debug     2 hrs        128         ---
day       24 hrs       1024        4 – 1024
week      1 week       512         4 – 256
month     1 month      128         4 – 128
512cpu    4 days       1024        128 – 512
128cpu    4 days       512         32 – 128
32cpu     2 days       1024        4 – 32
chunk     4 days       512         Batch Jobs

30 Submission Scripts
 It is easier to write a submission script that can be edited for each job submission.
 Example script file, run.hpl:
#BSUB -n <number_of_cpus>
#BSUB -e err.%J
#BSUB -o out.%J
#BSUB -a mvapich
mpirun ./mycode
 Submit with: bsub < run.hpl
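A script like the one above can be generated and checked from the shell as follows. This is a sketch: -n 4 is an example CPU count, the -q week line is added here following the queue slide, and mycode is a placeholder for your executable:

```shell
# Write an example LSF submission script
cat > run.hpl <<'EOF'
#BSUB -n 4
#BSUB -q week
#BSUB -e err.%J
#BSUB -o out.%J
#BSUB -a mvapich
mpirun ./mycode
EOF

# On Topsail the job would then be submitted with: bsub < run.hpl
grep -c '^#BSUB' run.hpl    # prints 5: one per directive
```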

31 More bsub Options
 bsub -x – NO LONGER USED!
  - Requested exclusive use of a node
 bsub -n 4 -R "span[ptile=4]"
  - Forces all 4 processors onto the same node
  - Similar to -x
 bsub -J job_name – name the job
 See the man pages for a complete description: man bsub

32 Performance Test
 Gromacs MD simulation of bulk water
 Simulation setups:
  - Case 1: -n 8 -R "span[ptile=1]"
  - Case 2: -n 8 -R "span[ptile=8]"
 Simulation times (1 ns MD):
  - Case 1: 1445 sec
  - Case 2: 1255 sec
 Using only 1 node improved speed by 13%

33 Following a Job After Submission
 bjobs (bjobs -l JobID) – shows the current status of the job
 bhist (bhist -l JobID) – more detailed information on the job's history
 bkill (bkill -r JobID) – ends the job prematurely

34 Submit a Test MPI Job
 Submit the test MPI program on Topsail: bsub -q week -n 4 -o out.%J -e err.%J -a mvapich mpirun ./cpi
 Follow the submission with: bjobs
 Output is stored in the out.%J file

35 Pre-Compiled Programs on Topsail
 Some applications are precompiled for all users in /ifs1/apps: Amber, Gaussian, Gromacs, NetCDF, NWChem
 Add an application to your path using the module commands:
  - module avail – shows the available applications
  - module add – adds a specific application
 Once the module command is used, the executable is on your path

36 Test Gaussian Job on Topsail
 Add the Gaussian application to your path: module add apps/gaussian-03e01, then confirm with module list
 Copy the input com file: cp /ifs1/scr/cdavis/water.com .
 Check that the executable has been added to your path: echo $PATH
 Submit the job: bsub -q week -n 4 -e err.%J -o out.%J g03 water.com

37 Common Error 1
 If the job dies immediately, check the err.%J file.
 The err.%J file has the error: Can't read MPIRUN_HOST
 Problem: MPI environment settings were not correctly applied on the compute node.
 Solution: include mpirun in the bsub command.

38 Common Error 2
 The job dies immediately after submission and the err.%J file is blank.
 Problem: ssh passwords and keys were not correctly set up at the initial login to Topsail.
 Solution:
cd ~/.ssh/
mv id_rsa id_rsa-orig
mv id_rsa.pub id_rsa.pub-orig
  - Log out of Topsail, then log back in and accept all defaults.
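The rename step can be rehearsed safely first. This sketch practices in a throwaway directory with empty stand-in files; the real fix operates on the actual key pair in ~/.ssh:

```shell
# Rehearse the fix in a scratch directory rather than the real ~/.ssh
mkdir -p sshfix
touch sshfix/id_rsa sshfix/id_rsa.pub       # stand-ins for the broken keys
mv sshfix/id_rsa sshfix/id_rsa-orig
mv sshfix/id_rsa.pub sshfix/id_rsa.pub-orig
ls sshfix                                   # only the -orig copies remain
```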

39 Interactive Jobs
 To run long shell scripts on Topsail, use the int queue.
 bsub -q int -Ip /bin/bash
  - This bsub command provides a prompt on a compute node.
  - Programs or shell scripts can then be run interactively from the compute node.
 The Totalview debugger can also be run interactively on Topsail.

40 Further Help with Topsail
 More details can be found in the Getting Started on Topsail help document: http://help.unc.edu/?id=6214
 For assistance with Topsail, contact the ITS Research Computing group: research@unc.edu
 For immediate assistance, see the manual pages on Topsail: man

