Getting Started on Topsail
Charles Davis, ITS Research Computing
April 8, 2009

2 Outline
 • History of Topsail
 • Structure of Topsail
 • File Systems on Topsail
 • Compiling on Topsail
 • Topsail and LSF

3 Initial Topsail Cluster
 • Initially a 1,040-CPU Dell Linux cluster: 520 dual-socket, single-core nodes
 • Infiniband interconnect
 • Intended for capability research
 • Housed in the ITS Franklin machine room
 • Fast and efficient for large computational jobs

4 Topsail Upgrade 1
 • Topsail upgraded to 4,160 CPUs: blades replaced with dual-socket, quad-core boards
 • Intel Xeon 5345 (Clovertown) processors: quad-core, 8 CPUs/node
 • Increased the number of processors but decreased individual processor speed (was 3.6 GHz, now 2.33 GHz)
 • Decreased energy usage and the resources needed for cooling
 • Summary: slower clock speed, better memory bandwidth, less heat
   • Benchmarks tend to run at the same speed per core
   • Topsail shows a net ~4x improvement
   • Of course, this number is VERY application dependent

5 Topsail – Upgraded Blades
 • 52 chassis: the basis of node names
   • Each holds 10 blades -> 520 blades total
   • Nodes = cmp-chassis#-blade#
 • Old compute blades: Dell PowerEdge
   • Single-core Intel Xeon EM64T 3.6 GHz processors
   • 800 MHz FSB
   • 2 MB L2 cache per socket
   • Intel NetBurst microarchitecture
 • New compute blades: Dell PowerEdge
   • Quad-core Intel 2.33 GHz processors
   • 1333 MHz FSB
   • 4 MB L2 cache per socket
   • Intel Core 2 microarchitecture

6 Topsail Upgrade 2
 • Most recent Topsail upgrade
 • Refreshed much of the infrastructure
 • Improved IBRIX file system
 • Replaced and improved Infiniband cabling
 • Moved the cluster to the ITS-Manning building
   • Better cooling and UPS

7 Current Topsail Architecture
 • Login node: Intel EM64T, 12 GB memory
 • Compute nodes: 4,160 CPUs, 2.33 GHz Intel EM64T, 12 GB memory per node
 • Shared disk: 39 TB IBRIX parallel file system
 • Interconnect: Infiniband 4x SDR
 • 64-bit Linux operating system

8 Multi-Core Computing
 • Processor structure on Topsail:
   • 500+ nodes
   • 2 sockets/node
   • 1 processor/socket
   • 4 cores/processor (quad-core)
   • 8 cores/node
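These per-node numbers account for the cluster totals quoted on the earlier slides:

   52 chassis × 10 blades/chassis = 520 nodes
   520 nodes × 2 sockets/node × 4 cores/socket = 4,160 cores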

9 Multi-Core Computing
 • The trend in high-performance computing is toward multi-core or many-core computing.
 • More cores at slower clock speeds for less heat.
 • Dual- and quad-core processors are now becoming common.
 • Soon 64+ core processors will be common
   • And these may be heterogeneous!

10 The Heat Problem
 (Figure taken from Jack Dongarra, UT)

11 More Parallelism
 (Figure taken from Jack Dongarra, UT)

12 Infiniband Connections
 • Connections come in single (SDR), double (DDR), and quad (QDR) data rates. Topsail is SDR.
 • Single data rate is 2.5 Gbit/s in each direction per link.
 • Links can be aggregated: 1x, 4x, 12x. Topsail is 4x.
 • Links use 8B/10B encoding (10 bits carry 8 bits of data), so the useful data rate is four-fifths of the raw rate. Thus single, double, and quad data rates carry 2, 4, or 8 Gbit/s per link, respectively.
 • The usable data rate for Topsail is therefore 8 Gbit/s (4x SDR).
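The 8 Gbit/s figure follows directly from the per-link rate and the encoding overhead:

   4 links × 2.5 Gbit/s per link = 10 Gbit/s raw
   10 Gbit/s × 8/10 (8B/10B encoding) = 8 Gbit/s usable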

13 Topsail Network Topology

14 Infiniband Benchmarks
 • Point-to-point (PTP) intranode communication on Topsail for various MPI send types
 • Peak bandwidth: 1288 MB/s
 • Minimum latency (1-way): 3.6 µs

15 Infiniband Benchmarks
 • Scaled aggregate bandwidth for MPI broadcast on Topsail
 • Note the good scaling throughout the tested range of core counts

16 Login to Topsail
 • Use ssh to connect: ssh topsail.unc.edu
 • On Windows, use SSH Secure Shell
 • For interactive programs with X Windows display:
   • ssh -X topsail.unc.edu
   • ssh -Y topsail.unc.edu
 • Off-campus users (i.e., domains outside of unc.edu) must use a VPN connection

17 Topsail File Systems
 • 39 TB IBRIX parallel file system
 • Split into home and scratch space
   • Home: /ifs1/home/my_onyen
   • Scratch: /ifs1/scr/my_onyen
 • Mass storage
 • Only home is backed up

18 File System Limits
 • 500 GB total limit per user
 • Home: 5 GB limit for backups
 • Scratch: no limit except the 500 GB total
   • Not backed up
   • Periodically cleaned
 • No installed packages/programs

19 Compiling on Topsail
 • Modules
 • Serial programming
   • Intel Compiler Suite for Fortran 77, Fortran 90, C, and C++ (recommended by Research Computing)
   • GNU
 • Parallel programming
   • MPI
   • OpenMP
     • Must use the Intel Compiler Suite
     • Compiler flag: -openmp
     • Must set OMP_NUM_THREADS in the submission script
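As a minimal sketch, assuming an OpenMP source file named hello_omp.c and an 8-thread run (one node's worth of cores), the build and the matching environment setting might look like this:

   # compile with the Intel C compiler; -openmp enables OpenMP support
   icc -openmp -o hello_omp hello_omp.c
   # in the LSF submission script, set the thread count before running
   export OMP_NUM_THREADS=8
   ./hello_omp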

20 Compiling Modules
 • Module commands:
   • module – list commands
   • module avail – list available modules
   • module add – add a module temporarily
   • module list – list modules currently in use
   • module clear – remove modules temporarily
 • Add modules using your startup files
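A typical module session might look like the following (the module name is the MVAPICH/Intel module used in the MPI compile example later in these slides):

   module avail                   # list the modules that are available
   module add hpc/mvapich-intel   # load the Intel + MVAPICH environment for this session
   module list                    # confirm which modules are currently loaded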

21 Available Compilers
 • Intel – ifort, icc, icpc
 • GNU – gcc, g++, gfortran
 • Libraries – BLAS/LAPACK
 • MPI: mpicc/mpiCC, mpif77/mpif90
   • mpixx is just a wrapper around the Intel or GNU compiler
   • Adds the location of the MPI libraries and include files
   • Provided as a convenience
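For example, a serial build and an MPI build of the same (illustrative) source file differ only in which command is invoked; the wrapper supplies the MPI paths:

   icc   -O2 -o mycode mycode.c   # serial build with the Intel C compiler
   mpicc -O2 -o mycode mycode.c   # MPI build; mpicc adds the MPI include and library paths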

22 Test MPI Compile
 • Copy cpi.c to your scratch directory:
   cp /ifs1/scr/cdavis/Topsail/cpi.c /ifs1/scr/my_onyen/.
 • Add the Intel MPI module:
   module load hpc/mvapich-intel
 • Confirm the module is loaded:
   which mpicc
 • Compile the code:
   mpicc -o cpi cpi.c

23 MPI/OpenMP Training
 • Courses are taught throughout the year by Research Computing
 • Next courses:
   • MPI – Spring
   • OpenMP – Spring

24 Running Programs on Topsail
 • When you ssh to Topsail, you land on the login node.
 • Programs SHOULD NOT be run on the login node.
 • Submit programs to the compute nodes (4,160 CPUs).
 • Submit jobs using the Load Sharing Facility (LSF).

25 Job Scheduling Systems
 • Allocate compute nodes to job submissions based on user priority, requested resources, execution time, etc.
 • Many types of schedulers:
   • Load Sharing Facility (LSF) – used by Topsail
   • IBM LoadLeveler
   • Portable Batch System (PBS)
   • Sun Grid Engine (SGE)

26 Load Sharing Facility (LSF)
 [Diagram: a job submitted with bsub flows from the submission host (LIM, batch API) through the master host's queue (MLIM, MBD) to an execution host (SBD, child SBD, LIM, RES, user job); load information is gathered from the other hosts]
 • LIM – Load Information Manager
 • MLIM – Master LIM
 • MBD – Master Batch Daemon
 • SBD – Slave Batch Daemon
 • RES – Remote Execution Server

27 Submitting a Job to LSF
 • For a compiled MPI job:
   bsub -n " " -o out.%J -e err.%J -a mvapich mpirun ./mycode
 • bsub – the LSF command that submits the job to the compute nodes
 • bsub -o and bsub -e
   • Job output and errors are saved to files in the submission directory
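A concrete version of this command, assuming a request for 8 cores and an executable named ./mycode:

   bsub -n 8 -o out.%J -e err.%J -a mvapich mpirun ./mycode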

28 Queue System on Topsail
 • Topsail uses queues to distribute jobs.
 • Specify the queue with -q in bsub: bsub -q week …
 • If no -q is specified, the default queue (week) is used.
 • Queues vary in job size and allowed run time.
 • See a listing of the queues with: bqueues

29 Topsail Queues

 Queue   Time Limit   Jobs/User   CPU Range
 int     2 hrs
 debug   2 hrs
 day     24 hrs       1024        4 – 1024
 week    1 week       512         4 – 256
 month   1 month      128         4 –
 cpu     4 days                   –
 cpu     4 days       512         32 –
 cpu     2 days       1024        4 – 32
 chunk   4 days       512         Batch Jobs

30 Submission Scripts
 • It is easier to write a submission script that can be edited for each job.
 • Example script file – run.hpl:
   #BSUB -n " "
   #BSUB -e err.%J
   #BSUB -o out.%J
   #BSUB -a mvapich
   mpirun ./mycode
 • Submit with: bsub < run.hpl
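The same script can be extended with a queue and a job name (both options appear on the neighboring slides); the core count and job name here are illustrative:

   #BSUB -q week
   #BSUB -J mpi_test
   #BSUB -n 8
   #BSUB -e err.%J
   #BSUB -o out.%J
   #BSUB -a mvapich
   mpirun ./mycode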

31 More bsub Options
 • bsub -x – NO LONGER USE!!
   • Requests exclusive use of a node
 • bsub -n 4 -R span[ptile=4]
   • Forces all 4 processors to be on the same node
   • Similar to -x
 • bsub -J job_name
 • See the man pages for a complete description: man bsub
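For instance, to pack an 8-core MPI job onto a single node (the same ptile idea used in the performance test on the next slide):

   bsub -n 8 -R "span[ptile=8]" -o out.%J -e err.%J -a mvapich mpirun ./mycode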

32 Performance Test
 • Gromacs MD simulation of bulk water
 • Simulation setups:
   • Case 1: -n 8 -R span[ptile=1]
   • Case 2: -n 8 -R span[ptile=8]
 • Simulation times (1 ns MD):
   • Case 1: 1445 sec
   • Case 2: 1255 sec
 • Using only 1 node improved the speed by 13%

33 Following a Job After Submission
 • bjobs
   • bjobs -l JobID
   • Shows the current status of the job
 • bhist
   • bhist -l JobID
   • More detailed information on the job's history
 • bkill
   • bkill -r JobID
   • Ends a job prematurely
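A typical sequence, using a hypothetical job ID of 12345:

   bjobs -l 12345    # current status of the job
   bhist -l 12345    # detailed history of the job
   bkill -r 12345    # end the job prematurely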

34 Submit a Test MPI Job
 • Submit the test MPI program on Topsail:
   bsub -q week -n 4 -o out.%J -e err.%J -a mvapich mpirun ./cpi
 • Follow the submission with: bjobs
 • Output is stored in the out.%J file

35 Pre-Compiled Programs on Topsail
 • Some applications are precompiled for all users in /ifs1/apps:
   • Amber, Gaussian, Gromacs, NetCDF, NWChem
 • Add an application to your path using the module commands:
   • module avail – shows the available applications
   • module add – adds a specific application
 • Once the module command is used, the executable is on your path

36 Test a Gaussian Job on Topsail
 • Add the Gaussian application to your path:
   module add apps/gaussian-03e01
   module list
 • Copy the input .com file:
   cp /ifs1/scr/cdavis/water.com .
 • Check that the executable has been added to your path:
   echo $PATH
 • Submit the job:
   bsub -q week -n 4 -e err.%J -o out.%J g03 water.com

37 Common Error 1
 • If a job dies immediately, check the err.%J file
 • err.%J contains the error: Can't read MPIRUN_HOST
 • Problem: the MPI environment settings were not applied correctly on the compute node
 • Solution: include mpirun in the bsub command

38 Common Error 2
 • The job dies immediately after submission
 • The err.%J file is blank
 • Problem: ssh passwords and keys were not set up correctly at the initial login to Topsail
 • Solution:
   cd ~/.ssh/
   mv id_rsa id_rsa-orig
   mv id_rsa.pub id_rsa.pub-orig
   Log out of Topsail
   Log back in to Topsail and accept all defaults

39 Interactive Jobs
 • To run long shell scripts on Topsail, use the int queue
 • bsub -q int -Ip /bin/bash
   • This bsub command provides a prompt on a compute node
   • You can run a program or shell script interactively from the compute node
 • The Totalview debugger can also be run interactively on Topsail
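For example, to get a shell on a compute node and run a script there interactively (the script name is illustrative):

   bsub -q int -Ip /bin/bash   # opens a prompt on a compute node
   ./my_analysis.sh            # run the script at that prompt rather than on the login node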

40 Further Help with Topsail
 • More details on using Topsail can be found in the "Getting Started on Topsail" help document
 • For assistance with Topsail, please contact the ITS Research Computing group
 • For immediate assistance, see the manual pages on Topsail: man