
NREL is a national laboratory of the U.S. Department of Energy, Office of Energy Efficiency and Renewable Energy, operated by the Alliance for Sustainable Energy, LLC.
Peregrine New User Training
Ilene Carpenter, 6/3/2015

Outline
Introduction to Peregrine and Gyrfalcon
Connecting to Peregrine
Transferring files
Filesystem overview
Running jobs on Peregrine
Introduction to modules
Scripting tools on Peregrine

Peregrine System Architecture
InfiniBand cluster with multiple node types:
– 4 login nodes
– Service nodes (various types)
– 1440 compute nodes of multiple types
Each node has at least 2 Intel Xeon processors and runs the Linux operating system.
Has an NFS file system and Lustre parallel file systems with 2.25 PB of storage.

Peregrine has several types of compute nodes:
– Dual 8-core Intel Xeon processors (16 cores), 32 GB of memory
– Dual 12-core Intel Xeon processors (24 cores), 32 GB of memory
– Dual 12-core Intel Xeon processors (24 cores), 64 GB of memory
– Dual 8-core Intel Xeon processors (16 cores), 256 GB of memory
– Dual 8-core Intel Xeon processors (16 cores), 32 GB of memory + 2 Intel Xeon Phi coprocessors

Gyrfalcon: Long-term data storage
Over 3 PB of storage.
Sometimes called Mass Storage or the Mass Storage System (MSS).
Files are stored on disk and tape; migration between them is transparent to users.
Two copies are kept automatically, but the system is not backed up: if you delete a file, both copies get deleted!
Quotas apply to both the number of files and the total storage used.

Long-term storage: the /mss file system
Each user has a directory in /mss/users/
Projects may request allocations of space shared by the project members.
– Files will be in /mss/projects/
The /mss file system is mounted on the Peregrine login nodes but not on the compute nodes.
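For example, archiving a results directory to MSS might look like the following sketch; it must be run on a login node, and the username jdoe and the directory names are placeholders.
# Run on a Peregrine login node, where /mss is mounted (jdoe is a placeholder username)
cp -r results /mss/users/jdoe/
# rsync also works and can resume an interrupted copy
rsync -av results/ /mss/users/jdoe/results/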

Connecting to Peregrine
ssh to peregrine.nrel.gov:
%ssh -Y
Windows users need to install a program that allows one to ssh, such as PuTTY. Mac users can use Terminal or X11.
For more information see to-hpc-systems
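As a concrete sketch, with a placeholder username jdoe and X11 forwarding enabled:
# Log in to a Peregrine login node (jdoe is a placeholder username)
ssh -Y jdoe@peregrine.nrel.gov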

Linux CLI and shells
When you connect, you will be in a "shell" on a login node. A shell is a command-line interface to the operating system on the node. The default is the bash shell.
If you are new to command-line interfaces and HPC systems, see the instructions online at g-started-for-users-new-to-high-performance-computing

Peregrine File Systems
"home" file system: /home/
– Store your scripts, programs and other small files here.
– Backed up.
– 40 GB per user.
– Use the homequota.check script to check how much space you have used.
"nopt" file system: /nopt
– The location of applications, libraries and tools.
"projects" file system: /projects/
– A parallel file system for collaboration among project members. Useful for storing large input files and programs.
"scratch" file system: /scratch
– A high-performance parallel file system to be used by jobs doing lots of file I/O.
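A minimal sketch of checking your home quota and staging run data on /scratch; the per-user directory layout /scratch/$USER and the project name myproject are assumptions for illustration only.
homequota.check                                               # report usage against the 40 GB /home quota
mkdir -p /scratch/$USER/run01                                 # assumed per-user scratch layout; adjust to local policy
cp /projects/myproject/inputs/big.dat /scratch/$USER/run01/   # myproject is a hypothetical project handle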

How to Transfer Files
Laptop to/from Peregrine:
– Mac OS X and Linux: scp, sftp, rsync in a terminal session
– Windows: WinSCP
– Any system: Globus Online
Peregrine to/from Gyrfalcon:
– cp, mv, rsync from a login node (which can access both /mss and all of the Peregrine file systems)
From a computer at another computer center to Peregrine:
– Globus Online
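For instance, from a Mac or Linux laptop the transfers might look like this sketch; jdoe, the file names and the paths are placeholders.
# Push an input archive from the laptop to your Peregrine home directory
scp input.tar.gz jdoe@peregrine.nrel.gov:~/
# Pull a results directory back from /scratch, skipping files that were already copied
rsync -av jdoe@peregrine.nrel.gov:/scratch/jdoe/run01/ ./run01/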

Running jobs on Peregrine
Peregrine uses Moab for job scheduling and workload management and Torque for resource management.
Node scheduling is exclusive (jobs don't share nodes).
Use the qsub command:
%qsub
– The batch file contains options in the PBS/Torque job submission language, preceded by #PBS, to specify resource limits such as the number of nodes and the wall time limit.
qsub -V exports all environment variables to the batch job
qsub -l resource_list
qsub -I requests an interactive session
qsub -q short puts a job in the "short" queue
Use man qsub for more options.
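A minimal sketch of submitting a job, assuming a batch script named my_job.pbs (a hypothetical file name):
# Submit a batch script; resource requests are read from its #PBS lines
qsub my_job.pbs
# The same requests can instead be given on the command line
qsub -l nodes=2,walltime=4:00:00 -q short my_job.pbs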

Allocations
Only jobs associated with a valid project allocation will run.
Use -A either as an option to the qsub command or within the job script:
qsub -A CSC000 -l nodes=1,walltime=0:45:00 -q short
asks for 1 node for 45 minutes from the short queue and tells the system that the CSC000 project allocation should be used.

Interactive Jobs
You can use compute nodes for interactive work:
– execute commands and scripts interactively
– run applications with GUIs (such as Matlab, COMSOL, etc.)
You request an interactive "job" with the -I option to the qsub command:
qsub -I -q -A
The same resource limits apply to interactive jobs as to non-interactive jobs. These depend on the queue you submit your interactive job to.
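For example, a one-hour interactive session on one debug node might be requested as in the sketch below; CSC000 is the example allocation handle used earlier and should be replaced with your own project.
# Request one node interactively from the debug queue for one hour
qsub -I -q debug -A CSC000 -l nodes=1,walltime=1:00:00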

Asking for particular node types
All nodes assigned to your job will be the same type.
Use the -l feature=X option to request specific node types.
– X can be "16core", "24core", "64GB", "256GB", "phi"
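For example, to ask for the 24-core nodes (my_job.pbs, the node count and the wall time are placeholders):
# Request four 24-core nodes for two hours under the example allocation CSC000
qsub -A CSC000 -l nodes=4:ppn=24,walltime=2:00:00 -l feature=24core my_job.pbs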

Job Queues
debug – For quick access to nodes for debugging
short – For jobs that take less than 4 hours
batch – For jobs that take 2 days or less
large – For jobs that use at least 16 nodes
long – For jobs that take up to 10 days (by request)
bigmem – For jobs that need a lot of memory per node (by request)
phi – For jobs that use the Phi coprocessors

Debug queue
No production work allowed.
Maximum run time is 1 hour.
Max of two jobs per user.
Max of two nodes per job.
The queue has 2 of each type of node.
Submit with qsub -q debug

Short queue
For short production jobs.
Maximum run time is 4 hours.
Up to 8 nodes per user.
Up to 8 nodes per job.
Has nodes of each type.
Submit with qsub -q short

Batch queue
This is the default queue.
Max runtime of 2 days.
Max of 296 nodes per user.
Max of 288 nodes per job.
Has 740 nodes of the following types:
– 16-core, 32 GB nodes
– 24-core, 32 GB nodes
– 24-core, 64 GB nodes

Large queue
For jobs that use at least 16 nodes.
Maximum run time is 1 day.
Maximum number of nodes per user is 202.
Maximum number of nodes per job is 202.
Has 202 nodes with 24 cores and 32 GB of memory.
Submit with qsub -q large

Long queue
For jobs that take more than 2 days to run.
Maximum run time is 10 days.
Access by request only; must have a justified need.
Maximum number of nodes per user is 120.
Maximum number of nodes per job is 120.
Has nodes with 24 cores and 32 GB of memory, plus 160 nodes with 16 cores and 32 GB of memory.
Submit with qsub -q long

Bigmem queue
By request only; must have a justified need.
Maximum run time is 10 days.
Maximum number of nodes per user is 46.
Maximum number of nodes per job is 46.
Has nodes with 24 cores and 64 GB of memory, and 52 nodes with 16 cores and 256 GB of memory.

Phi queue
Intended for jobs that will use the Intel Xeon Phi coprocessors.
– Jobs may use both the Phi and Xeon cores simultaneously.
Maximum run time is 2 days.
Maximum number of nodes per user is 32.
Maximum number of nodes per job is 32.
Has nodes with 16 Xeon cores, 32 GB of memory and 2 Xeon Phi coprocessors.
These nodes run a slightly different software stack than nodes without Phi coprocessors.

Checking job status
qstat will show the state of your job (queued, running, etc.)
checkjob -v will give you information about why your job isn't running yet
shownodes shows what nodes are potentially available for running your jobs
To get information about your job after it ran, use showhist.moab.pl
– Shows submit time, start time, end time, exit code, node list and other useful information.
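A short sketch of the typical workflow; 123456 stands in for a real job ID, and passing the job ID to showhist.moab.pl is an assumption about that site script's usage.
qstat -u $USER            # list your queued and running jobs
checkjob -v 123456        # ask Moab why job 123456 is not running yet
shownodes                 # see which nodes are potentially available
showhist.moab.pl 123456   # after the job ends: submit/start/end times, exit code, node list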

Sample serial job script
#!/bin/bash
#PBS -l walltime=4:00:00
#PBS -l nodes=1
#PBS -N test1
#PBS -A CSC001
cd $PBS_O_WORKDIR
./a.out

Sample MPI job script
#!/bin/bash
#PBS -l walltime=4:00:00
#PBS -l nodes=4:ppn=16
#PBS -l feature=16core
#PBS -N test1
#PBS -A CSC001
cd $PBS_O_WORKDIR
mpirun -np 64 /path/to/executable
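The -np 64 above must match the request of 4 nodes with 16 cores per node. A sketch of a variant, not part of the original slide, that derives the rank count from the node file Torque provides so the two cannot drift apart:
# $PBS_NODEFILE lists one line per requested processor slot (nodes x ppn)
NP=$(wc -l < $PBS_NODEFILE)
mpirun -np $NP /path/to/executable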

Sample script with multiple serial jobs
#!/bin/bash
#PBS -l walltime=00:10:00
#PBS -l nodes=1:ppn=24
#PBS -N wait_test
#PBS -o std.out
#PBS -e std.err
#PBS -A hpc-apps
cd $PBS_O_WORKDIR
JOBNAME=waitTest
# Run 8 jobs
N_JOB=8
for ((i=1; i<=$N_JOB; i++))
do
  mkdir $JOBNAME.run$i
  cd $JOBNAME.run$i
  echo "10*10^$i" | bc > input
  time ../pi_test > log &
  cd ..
done
# Wait for all background runs to finish
wait
echo
echo "All done. Checking results:"
grep "PI" $JOBNAME.*/log

Introduction to modules
modules is a utility that allows users to easily change their software environment. It lets a system provide multiple versions of software and makes installed applications easy to use.
By default, two modules are loaded when you log in to Peregrine. These set up your environment to use the Intel compiler suite and the Intel MPI library.
The module list command shows what is currently loaded:
~]$ module list
Currently Loaded Modulefiles:
  1) comp-intel/   2) impi-intel/
The module avail command shows what modules are available for you to use.
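A typical sequence looks like the following sketch; comp-intel is the module name shown above, and the exact names and versions available on the system will vary.
module avail               # list every module installed on the system
module load comp-intel     # load (or re-load) the Intel compiler module
module list                # confirm what is loaded now
module unload comp-intel   # remove it from the environment again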

Why Scripting?
Productivity
Easily tuned to a domain
Performance where needed

Scripting Tools on Peregrine
Shells: bash, csh, dash, ksh, ksh93, sh, tcsh, zsh, tclsh
Perl, Octave, Lua, Lisp (Emacs, Guile), GNUplot, Tclx, Java, Ruby
IDL, Python, R, .NET (Mono)
SQL (Postgres, SQLite, MySQL)
MATLAB, GAMS

HPC website, help
To report a problem, send to

Brief Introduction to Xeon Phi
The Phi chips in Peregrine are "coprocessors":
– attached to the Xeon processors via PCIe
– special Linux OS with limited capabilities
– different instruction set from the Xeon
Each Phi coprocessor has a peak performance of ~1 TFLOP (double precision).

Phi/MIC Architecture
~60 cores
– Each core has 4 hardware threads
– Each core is slower than a regular Xeon core
– 512-bit vectors (SIMD)
8 GB of memory
The Phi is designed for highly parallel, well-vectorized applications.

Applications must scale to a high level of task parallelism! (Chart taken from the Colfax Developer Boot Camp slides.)

Will your application benefit from the MIC architecture? (Chart taken from the Colfax Developer Boot Camp slides.)

For more information, see the book by Jim Jeffers and James Reinders, Intel Xeon Phi Coprocessor High Performance Programming.