Track 1: Cluster and Grid Computing

NBCR Summer Institute
Session 1.1: Introduction to Cluster and Grid Computing
July 31, 2007
Wes Goodman
9/18/2018 © 2007 UC Regents

Cluster Pioneers
In the mid-1990s, the Network of Workstations (NOW) project (UC Berkeley) and the Beowulf Project (NASA) asked the question: can you build a high-performance machine from commodity components?
NOW pioneered the vision for clusters of commodity processors
  David Culler
  SunOS/SPARC
  First generation of Myrinet
  GLUnix (Global Layer Unix) execution environment
Beowulf popularized the notion and made it very affordable
  Thomas Sterling & Donald Becker
  Linux

Types of Clusters
Highly Available (HA)
  Generally small, fewer than 8 nodes
  Redundant components
  Multiple communication paths
Visualization Clusters
  Each node drives a display
  OpenGL machines
Computing (HPC Clusters)
  AKA Beowulf

Definition: Beowulf
Collection of commodity PCs running an open-source operating system with a commodity network
The network is usually Ethernet, although clusters with non-commodity networks are sometimes called Beowulfs
Has come to mean any Linux cluster

HPC Cluster Architecture
[Figure: cluster architecture diagram. Labels: frontend node; nodes; public Ethernet; private Ethernet network; application network (optional); power distribution (net-addressable units as an option).]
The application network could be: Myrinet, IB (InfiniBand), Gigabit Ethernet

Clusters now Dominate High-End Computing

The Light Side of Clusters
Clusters are phenomenal price/performance computational engines …
  Mainstream tools for a variety of scientific fields
  Expanded from HPC to
    High availability
    Visualization
Benefits come from
  Using inexpensive commodity servers
  Open source software
  Large and expanding community of developers
Notes:
  HA: small, redundant components, multiple communication paths
  Vis: each node drives a display, OpenGL machines
  Price/performance: why clusters dominate HPC

The Dark Side of Clusters
While clusters are phenomenal price/performance computational engines …
  They can be hard to manage without experience
  High-performance I/O is still unsolved
  The effort of finding out where something has failed increases at least linearly with cluster size
  Not cost-effective if every cluster "burns" a person just for care and feeding
  The programming environment could be vastly improved
  Technology is changing very rapidly. Scaling up is becoming commonplace ( nodes)

Most Critical Problems with Clusters
The largest problem in clusters is software skew
  Software skew is when the software configuration on some nodes differs from that on others
  Small differences (minor version numbers on libraries) can cripple a parallel program
The second most important problem is the lack of adequate job control of the parallel process
  Signal propagation
  Cleanup
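Software skew can be spotted by comparing per-node package inventories. The following is a minimal sketch, not a tool from this course: the function name `find_skew` and the three-column "node package version" inventory format are assumptions made for illustration, and the host and package names are made up.

```shell
#!/bin/sh
# find_skew: read "node package version" lines on stdin and print every
# package that appears with more than one distinct version.
# (Hypothetical helper; the input format is an assumption.)
find_skew() {
    awk '
        {
            key = $2 SUBSEP $3            # package + version pair
            if (!(key in seen)) {         # count each distinct version once
                seen[key] = 1
                versions[$2]++
            }
        }
        END {
            for (pkg in versions)
                if (versions[pkg] > 1) print pkg
        }
    ' | sort
}

# Made-up inventory for two nodes: mpich differs, glibc does not.
find_skew <<'EOF'
compute-0-0 glibc 2.3.4
compute-0-1 glibc 2.3.4
compute-0-0 mpich 1.2.7
compute-0-1 mpich 1.2.6
EOF
```

Any package name printed by the function is installed at different versions on different nodes and is a candidate cause of a failing parallel program.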

Top 3 Problems with Software Packages
Software installation works only in interactive mode
  Needs significant work by the end user
Often rational default settings are not available
  Extremely time-consuming to provide values
  Should be provided by package developers but …
Package is required to be installed on a running system
  Means a multi-step operation: install + update
  Intermediate state can be insecure

Cluster Usage Session: NBCR clusters introduction
July 31, 2007
Wes Goodman

Where to start
National Biomedical Computation Resource (NBCR)
How to get an account:
  Familiarize yourself with the account policy
  Subscribe to NBCR-support mailing list
  Subscribe to NBCR-announce mailing list

Where to get help
For support, email support@nbcr.net
User services web page:
  Access to training sessions on Wiki
  Tools/downloads
  Documentation
  Cluster monitoring access
  User guides at Wiki

Generate public/private keypair
For Linux:
    % ssh-keygen -t dsa
Generally, it's best to accept the default locations
Enter a strong passphrase to encrypt your private key

Remote login
For login use ssh (not rsh or telnet)
    % ssh
  or
    % ssh -l accname puzzle.nbcr.net
You may have to specify your private key location
    % ssh -i /path/to/private/key puzzle.nbcr.net
On first login, passphrase-protect your private key
    % ssh-keygen -p
You may now ssh to either kryptonite or oolite
Available clusters:
    kryptonite.nbcr.net
    oolite.nbcr.net

Keys management with Agent
Login on a cluster
    % ssh
Start an agent
    % eval `ssh-agent`
Add identities to your agent (ssh-add takes the private key file, not the .pub file)
    % ssh-add
  or
    % ssh-add ~/.ssh/mykeys/my_special_key
Verify that identities are added
    % ssh-add -l
    1024 e9:a6:59:89:f0:f1:87:8e:88:54 /Users/nadya/.ssh/id_dsa (DSA)      (OK)
    Could not open connection to your authentication agent      (ERROR!)
You can now execute any command on any node
    % cluster-fork ps -u$USER
    % ssh c0-0

Introduction to Sun Grid Engine
What is a grid?
  A collection of computing resources that perform tasks
  A grid node can be a compute server, data collector, visualisation terminal, ...
SGE is resource management software
  Accepts jobs submitted by users
  Schedules them for execution on appropriate systems based on resource management policies
  You can submit hundreds of jobs without worrying about where they will run

What is SGE?
Two versions of SGE:
Sun Grid Engine (on Rocks clusters)
  Distributed under an open source license
  From sunsource.net
Sun N1 Grid Engine
  N1 stack is available at no cost
  Paid support from Sun

Job Management
Not recommended to run jobs directly!
Use the installed load scheduler
  Sun Grid Engine
    Load management tool for HETEROGENEOUS distributed computing environments
  PBS/Torque
    More sophisticated scheduling
Why?
  You can submit multiple jobs and have them queued (and go home!)
  Fair share
  Allows other people to use the cluster too! (for Myrinet MPI jobs)

Host Roles
Master Host
  Controls overall cluster activity
  Frontend, head node
  Runs the master daemon, sge_qmaster, which controls queues, jobs, status, and user access permissions
  Also runs the scheduler: sge_schedd
Execution Host
  Executes SGE jobs
  Runs the execution daemon: sge_execd
  Runs jobs on its host
  Forwards system status/info to sge_qmaster

Host Roles continued
Submit Host
  Allowed to submit and control batch jobs only
  No daemon is required to run on this type of host
Administration Host
  Usually the SGE administrator's console

Job Management
Your administrator must set up a global default queue (all.q)
More fine-tuned queues can be set up depending on the cluster/user community
  short.q, long.q, weekend.q, fluent.0.q, fluent.1.q
As a user, you only need to know how to
  Submit your jobs (serial or MPI)
  Monitor your jobs
  Get the results

Some SGE Commands
Command   Description
qconf     SGE's cluster, queue, etc. configuration
qmod      Modify queue status: enabled or suspended
qacct     Extract accounting information from the cluster
qalter    Change the attributes of submitted but pending jobs
qdel      Job deletion
qhold     Hold back submitted jobs from execution
qhost     Show status information about SGE hosts
qmon      X Window System Motif interface
qrsh      SGE queue-based rsh facility
qselect   List queues matching selection criteria
qsh       Open an interactive shell on a low-loaded host
qstat     Status listing of jobs and queues
qsub      Command-line interface to submit jobs to SGE
qtcsh     SGE queue-based tcsh facility
qtcsh and qsh are extended command shells that can transparently distribute execution of programs/applications to the least loaded hosts via SGE.

$ qhost
    $ qhost
    HOSTNAME  ARCH  NPROC  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
[Output listing: one "global" row followed by nine compute hosts with ARCH lx26-amd; host names and numeric column values did not survive extraction.]

QMON
GUI for SGE administration/submission
Requires either Linux/Unix on your desktop or an X emulator (Hummingbird) on your Windows PC.

Submitting Jobs
Command line (qsub) & graphical (qmon)
Job types: Standard, Batch, Array, Interactive, Parallel
SGE schedules jobs based on
  Job priorities
    User -> FIFO
    Admin -> can affect with priority settings
  Equal-Share-Scheduling
    Scheduler -> user_sort setting
    Prevents a single user from hogging the queues
    Recommended!!!

$ qsub
Output/error by default in home directory
Look in /opt/gridengine/examples/jobs
    % qsub simple.sh
    your job 224 ("simple.sh") has been submitted
    % cd ~
    % more simple.sh.e224
    % more simple.sh.o224
    Wed Aug 9 14:56:16 PDT 2006
    Wed Aug 9 14:56:36 PDT 2006
Use qstat to check job status
Script contents:
    #!/bin/sh
    date
    sleep 10
    hostname

Submit autodock job
    % qsub adsub.sh
    your job 225 ("adsub.sh") has been submitted
Script contents:
    #!/bin/sh
    # request Bourne shell as shell for job
    #$ -S /bin/sh
    # work from current dir and put stderr/stdout here
    #$ -cwd
    ulimit -s unlimited
    autodock3 -p test.dpf -l test.dlg
    status=$?
    if [ "$status" = "0" ] ; then
        echo "successful completion $status"
    else
        echo "error running autodock3"
    fi

$ qconf
Show all the queues
    % qconf -sql
Show the given queue
    % qconf -sq all.q
Show command usage
    % qconf -help
Show complex attributes
    % qconf -sc

Advanced Submit
Advanced or batch jobs == shell scripts
Can be as complicated as you want, or even an application!
    #!/bin/bash
    #
    # compiles my program every time, creates the executable, and runs it!
    # change to my working directory
    cd TEST
    # compile the job
    f77 flow.f -o flow -lm -latlas
    # run the job
    ./flow myinput.dat

Requestable Attributes
Users submit jobs by specifying a job requirement profile of the hosts or of the queues
SGE will match the job requirements and run the job on suitable hosts
Attributes
  Disk space
  CPU
  Memory
  Software (Fluent license)
  OS

Attributes continued
Relop
  Relational operation used to compute whether a queue meets a user request
Requestable
  Whether the attribute can be specified by the user (e.g. in qsub)
Consumable
  Manages limited resources, e.g. licence or CPU

#name     shortcut  type    value  relop  requestable  consumable  defs
arch      a         STRING  none   ==     YES          NO          none
num_proc  p         INT            ==     YES          NO
load_avg  la        DOUBLE         >=     NO           NO
slots     s         INT            <=     YES          YES

Multiple resource requests to -l are separated by commas:
    % qsub -l arch=glinux,load_avg=0.01 myjob.sh

Attributes continued
By default, all requests are hard
Hard requests are checked first, followed by soft
If a hard request is not satisfied, the job is not run
For soft requests, SGE attempts to run the job on the "best" fit
Important resources
  mt - memory total
  mf - memory free
  s - processor slots
  st - total swap
How to request specific memory/swap space/CPU?
    % qsub -soft -l mt=250K,st=100K,mf=300G simple.sh
    % qsub -hard -l mt=250K,st=100K,mf=300G simple.sh

Array Jobs
Parameterized and repeated execution of the same program (in a script) is ideal for the array job facility
SGE provides an efficient implementation of array jobs
Handles computations as an array of independent tasks joined into a single job
Can be monitored and controlled as a whole, by individual task, or by subset of tasks

$ qsub
Submitting an Array Job from the command line
    % qsub -l h_cpu=0:45:0 -t 2-10:2 render.sh data.in
-l option requests a hard CPU time limit of 45 minutes
-t option defines the task index range
  2-10:2 specifies 2,4,6,8,10
Each task uses $SGE_TASK_ID to find out whether it is task 2, 4, 6, 8 or 10
  To find its input record
  As a seed for a random number generator
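The way a task index range and $SGE_TASK_ID fit together can be tried outside SGE. This is a minimal sketch under stated assumptions: `expand_task_range` and `pick_record` are hypothetical helper names, the record file is generated on the fly, and the task ID falls back to 4 when the script is not running under SGE.

```shell
#!/bin/bash
# expand_task_range FIRST LAST STEP
# List the task IDs that "-t FIRST-LAST:STEP" would create (hypothetical helper).
expand_task_range() {
    # Unquoted on purpose so the IDs are joined with single spaces.
    echo $(seq "$1" "$3" "$2")
}

# pick_record FILE ID
# Print line number ID of FILE, i.e. the input record for that task.
pick_record() {
    sed -n "${2}p" "$1"
}

# SGE sets SGE_TASK_ID inside each array task; use 4 when run by hand.
task_id=${SGE_TASK_ID:-4}

# Made-up input file with ten records, one per line.
records=$(mktemp)
printf 'frame-%02d\n' 1 2 3 4 5 6 7 8 9 10 > "$records"

echo "tasks created by -t 2-10:2: $(expand_task_range 2 10 2)"
echo "task $task_id reads record: $(pick_record "$records" "$task_id")"
rm -f "$records"
```

The same task ID could equally be passed to the program as a random seed, so each task produces a different but reproducible run.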

Job cleanup
Use SGE command
    % qdel <job_id>
Use Rocks command
    % cluster-fork killall <your_executable_name>

SGE submit script
Script contents
    #!/bin/tcsh
    #$ -S /bin/tcsh
    setenv MPI /opt/mpich/gnu/bin
    …
    $MPI/mpirun -machinefile machines -np $NSLOTS appname
Make it executable
    $ chmod +x runprog.sh

Submit file options
    # meet given resource request
    #$ -l h_rt=600
    # specify interpreting shell for the job
    #$ -S /bin/sh
    # use path for standard output of the job
    #$ -o /your/path
    # execute from current dir
    #$ -cwd
    # run on 32 processes in mpich PE
    #$ -pe mpich 32
    # Export all environment variables
    #$ -V
    # Export these environment variables
    #$ -v MPI_ROOT,FOOBAR=BAR
See "man qsub" for more options

SGE Hands-On: Rocks 4.1
Submitting using qsub
What does a qsub script look like?
    cat sleep.sh
    #!/bin/bash
    #
    #$ -cwd
    #$ -j y
    #$ -S /bin/bash
    date
    sleep 10

SGE Hands-On
So what does this do?
  Prints the date
  Sleeps for 10 seconds
What about all these options?
  SGE options are prefaced with #$
  -cwd: execute the job from the current working directory
  -j y: merge stderr into stdout
  -S /bin/bash: specify the interpreting shell for the job to be bash
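As a rough analogy for what -j y does, the sketch below uses plain shell redirection outside SGE. It is an illustration only: the function `job` and the file names are hypothetical, and SGE itself chooses the real output file names.

```shell
#!/bin/bash
# Stand-in for a job script: one line to stdout, one line to stderr.
job() {
    echo "result line"
    echo "warning line" >&2
}

outdir=$(mktemp -d)

# Without -j y: the two streams end up in two separate files.
job > "$outdir/job.out" 2> "$outdir/job.err"

# With -j y: stderr is merged into the stdout file, similar to 2>&1.
job > "$outdir/job.merged" 2>&1

# Show how many lines each file received.
wc -l "$outdir/job.out" "$outdir/job.err" "$outdir/job.merged"
rm -rf "$outdir"
```

Merging keeps error messages next to the output that preceded them, which often makes a failed run easier to read.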

SGE Hands-On
Let's do something more complicated!
Run linpack in parallel!
    cat linpack.sh
    #!/bin/bash
    #
    #$ -cwd
    #$ -j y
    #$ -S /bin/bash
    MPI_DIR=/opt/mpich/gnu
    $MPI_DIR/bin/mpirun -np $NSLOTS -machinefile $TMP/machines \
        /opt/hpl/gnu/bin/xhpl
Submit the job using 'qsub -pe mpich N linpack.sh'
  N is the number of processes/slots to allocate to MPI

SGE Hands-On
What other SGE options are available?
-o/-e: Redirect stdout and stderr
-l: Walltime
  ex: -l h_rt=24:00:00 #24 hour run
  There are resource limits for walltime
  Also, queues: -l short / -l medium / -l long
    16 / 24 / 48 hour walltime respectively
Notification
  -M

SGE Hands-On
Additional options:
-R y/n
  Resource reservation
  Up to 20 reservations supported
-N foo
  Sets the name of the job to foo
-hold_jid job_id
  Holds current job execution until job 'job_id' is done
  Useful for sequencing jobs

NAMD Hands-On
Want to make a qsub script. Why?
  Will make sure our job doesn't run on the frontend
  Instead, the job will run on a compute node
  Frees up CPU cycles on the frontend to keep the cluster responsive

NAMD Hands-On
Get tutorial data from the NAMD website
This is already present on kryptonite (namd-tutorial-files is a directory, so copy it recursively)
    cp -r ~wes/Track1/namd-tutorial-files ~/<username>/.

NAMD Hands-On
namd_qsub.sh
    #!/bin/bash
    #
    #$ -cwd
    #$ -S /bin/bash
    #$ -o namdqsub.out
    #$ -e namdqsub.err
    NAMD_DIR=/home/install/usr/apps/NAMD
    $NAMD_DIR/namd2 namd-tutorial-files/1-2-sphere/ubq_ws_eq.conf

AutoDock Hands-On
Copy files I've set up
    cp -r /share/apps/track1/autodockhandson ~/<username>/.
Edit submit-VS-screen.csh
  Replace instances of: /share/apps/track1/
  with: /home/nbcr07/<username>/

AutoDock Hands-On
Examining the submission file:
    #!/bin/bash
    #$ -N sim4520
    #$ -o sim4520_krel1-nowats.std.out
    #$ -e sim4520_krel1-nowats.std.err
    #$ -t 1-119
    #$ -S /bin/bash
    #$ -cwd
    #$ -m e
    export STACK_SIZE="unlimited"
    # pick the directory whose position in the listing matches this task's index
    TASK=`ls /share/apps/track1/autodock_handson/Sim_45208/Sim_42508_dockings | head -n $SGE_TASK_ID | tail -1`
    cd /share/apps/track1/autodock_handson/Sim_45208/Sim_42508_dockings/${TASK}
    F=`ls *.dpf`
    NAME=`basename $F .dpf`
    /home/install/usr/apps/autodock4/bin/autodock4 -p ${NAME}.dpf -l ${NAME}.dlg

AutoDock Hands-On
This is an array job
  Notice the -t flag. This tells SGE the task index range for the instances
  We can access the index of the current task with $SGE_TASK_ID
Some other cool tricks:
  -N: specifies the job name
  -m e: mail notification when the job ends
  Setting STACK_SIZE="unlimited" is a fix for autodock4 segmentation faults
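The submission file maps each task index to one docking directory by listing the directories and keeping the entry at that position. A minimal sketch of that selection, assuming a throwaway directory with made-up ligand names; `nth_entry` is a hypothetical helper, and the task ID falls back to 2 when not running under SGE:

```shell
#!/bin/bash
# nth_entry DIR N
# Print the N-th name in DIR, in the order ls lists it (hypothetical helper).
nth_entry() {
    ls "$1" | head -n "$2" | tail -n 1
}

# Throwaway directory tree standing in for the docking directories.
workdir=$(mktemp -d)
mkdir "$workdir/ligand_a" "$workdir/ligand_b" "$workdir/ligand_c"

# SGE sets SGE_TASK_ID inside each array task; use 2 when run by hand.
task_id=${SGE_TASK_ID:-2}

echo "task $task_id would dock: $(nth_entry "$workdir" "$task_id")"
rm -rf "$workdir"
```

For this to cover every directory exactly once, the -t range has to match the number of entries in the listing.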
