Koç University High Performance Computing Labs Hattusas & Gordion.


1 Koç University High Performance Computing Labs Hattusas & Gordion

2 General Information There are two clusters in the High Performance Computing Labs. The first cluster, Hattusas, consists of 32 nodes, each with one CPU. Hattusas will be used by both the Science and Engineering Faculties. The other cluster, Gordion, consists of 8 nodes, each with one CPU. Gordion will be used only by the Engineering Faculty, for special-purpose projects. Red Hat Linux 9 is installed on both clusters.

3 The Hardware of Hattusas
The hardware of each Hattusas node is:
CPU: 1 Intel Pentium 4 2.4 GHz
Memory:
Storage:
Network:
CD-ROM:
The hardware of the Hattusas server is:
CPU: 2 Intel Pentium 4 2.4 GHz
Memory:
Storage:
Network:
CD-ROM:

4 Hattusas Installation Hattusas was installed with OSCAR version 3.0, the latest version of OSCAR. “OSCAR version 3.0 is a snapshot of the best known methods for building, programming, and using clusters. It consists of a fully integrated and easy to install software bundle designed for high performance cluster computing. Everything needed to install, build, maintain, and use a modest sized Linux cluster is included in the suite, making it unnecessary to download or even install any individual software packages on your cluster.” http://oscar.openclustergroup.org

5 The Software of Hattusas The following software is installed on Hattusas automatically by OSCAR:
C3 - http://www.csm.ornl.gov/torc/C3/
LAM/MPI - http://www.lam-mpi.org/
Maui PBS Scheduler - http://supercluster.org/maui/
MPICH - http://www-unix.mcs.anl.gov/mpi/mpich/
OpenPBS - http://www.openpbs.org/
OpenSSH - http://www.openssh.com/
OpenSSL - http://www.openssl.org/
PVM - http://www.csm.ornl.gov/pvm/
System Installation Suite - http://www.sisuite.org/
Older OSCAR version: LUI - http://oss.software.ibm.com/developerworks/projects/lui/

6 Partitions on Hattusas
Partition sizes in GB:
Hattusas: Home 70, Projects 70, Others 70, Scratch 70
Node: Others 20, Projects 20, Home 70
The home partition on the nodes is mounted from the Hattusas server. Consequently, you can read from and write to your account from both the server and the nodes. If your program generates large temporary files, you can use the projects partition on the nodes. It is writable, and each node has its own projects partition with a capacity of 20 GB.
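
As an illustration of keeping large temporary files on a node-local partition, the sketch below creates a separate working directory for each job. The mount point of the projects partition is not stated in these slides, so the path /projects is an assumption, and make_job_scratch is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: create a per-user, per-job directory under a base path
# and print its name. PBS_JOBID is set by PBS inside a running job; outside a
# job the shell's process id is used instead.
make_job_scratch() {
    base=$1
    dir="$base/${USER:-unknown}/${PBS_JOBID:-interactive.$$}"
    mkdir -p "$dir" && printf '%s\n' "$dir"
}

# Example use inside a job script (the /projects path is assumed):
#   scratch=$(make_job_scratch /projects)
#   cd "$scratch" && ./subrun
```

Naming the directory after the job identifier keeps two jobs of the same user on one node from overwriting each other's temporary files.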

7 How Hattusas is controlled OpenPBS is responsible for job management on Hattusas. “OpenPBS is the original version of the Portable Batch System. It is a flexible batch queueing system developed for NASA in the early to mid-1990s. It operates on networked, multi-platform UNIX environments.” http://www.openpbs.org/ A detailed guide for OpenPBS (Portable Batch System Release 2.3) can be found at: http://www.chl.chalmers.se/~eb/pbs/files/v2.3_admin.pdf

8 Queue Structure of Hattusas There are two kinds of queues: routing queues and execution queues. On Hattusas, one routing queue named submitq and 8 execution queues are defined. Jobs are submitted only to the routing queue. The routing queue looks at the resource requirements of each job and, according to those requirements, sends it to the matching execution queue. The aim of this structure is to optimize the usage of the cluster, and the 8 execution queues are designed for that purpose. Note that the queue structure as a whole is not first in, first out. Every queue has its own priority, and a job takes the priority of the queue it goes to. Within each individual queue, however, jobs are handled first in, first out.
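
The ordering described above can be sketched with a small sort. This is only an illustration of "priority between queues, first in, first out inside a queue"; it is not how the scheduler on Hattusas is implemented. The job names and priority values are made up, and it is assumed that a larger number means a higher priority.

```shell
#!/bin/sh
# Hypothetical illustration: each input line is
#   <job id> <priority of its queue> <arrival number>
# Jobs are ordered by queue priority (highest first), then by arrival order.
order_jobs() {
    sort -k2,2nr -k3,3n | awk '{ print $1 }'
}

printf '%s\n' \
    'jobA 10 1' \
    'jobB 50 2' \
    'jobC 10 3' \
    'jobD 50 4' | order_jobs
```

Here jobB and jobD come first because their queue has the higher priority, and within each queue the earlier arrival stays ahead.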

9 How to Create an Account on Hattusas Accounts are created by CIT. After your account has been created, you will be able to use the Hattusas cluster.

10 How to Login to Hattusas The Hattusas server is open to the network, so you can connect from anywhere. You can connect to Hattusas with either ssh or telnet. Inside the school you do not need to know the IP address of Hattusas, because the name of the server is known automatically by all terminals. For instance: ssh hattusas.eng.hpc.ku.edu.tr To be able to connect to Hattusas, you need only an account on the Hattusas server.

11 How to Use the Cluster To be able to use Hattusas, you need only an account on Hattusas. After your account has been created, you will be able to use the cluster. During the account creation process, your username is added to all the nodes. Consequently, you will be able to pass from Hattusas to its nodes by rsh or ssh (which can be a requirement for certain programs).

12 OpenPBS The Portable Batch System, PBS, is a batch job and computer system resource management package. While you are using the cluster, you will need OpenPBS to control your jobs. PBS consists of four major components: commands, the job Server, the job executor, and the job Scheduler.
Commands: PBS supplies both command line commands and a graphical interface.
Job Server: The Server’s main function is to provide the basic batch services such as receiving/creating a batch job, modifying the job, protecting the job against system crashes, and running the job.
Job Executor: The job executor is the daemon which actually places the job into execution.
Job Scheduler: The Job Scheduler is another daemon which contains the site’s policy controlling which job is run and where and when it is run.

13 OpenPBS Currently, PBS provides two GUIs: xpbs and xpbsmon. xpbs provides a user-friendly point-and-click interface to the PBS commands. The xpbs(1) man page provides full information on configuring and running xpbs. xpbsmon is the node monitoring GUI for PBS. It displays information about the execution hosts in a PBS environment graphically. Its view of a PBS environment consists of a list of sites where each site runs one or more Servers, and each Server runs jobs on one or more execution hosts (nodes).

14 How to submit Jobs There are two ways to submit your job. The first is through a script and the second is through the command line. Before submitting, you should know the resources that your job requires, because the scheduling system of Hattusas needs the job requirements to put your job in the correct queue. Consequently, you should specify the job requirements when you submit your job; otherwise your job won’t begin execution.

15 How to submit Jobs The first way is through a script. The first line is standard for any shell script: it names the shell that executes the commands. The rest of the script consists of resource requirements, job attributes and the executable name. All PBS directives for resources and job attributes in a shell script start with #PBS. The executable can have arguments too. Example of a PBS job script that runs the executable named `subrun':
#! /bin/sh
#PBS -l walltime=2:00:00
#PBS -l mem=800mb
#PBS -l ncpus=1
#PBS -j oe
cd /homes/agarwal/release/workdir
./subrun

16 How to submit Jobs To submit the script to PBS, you use the qsub command. For instance, if the script were called myscript you'd submit it using qsub:
qsub myscript
3212.hattusas.eng.hpc.ku.edu.tr
The second line is the job identifier returned by PBS, and indicates that the script has been accepted.
Notes: You must specify the number of CPUs your job needs (-l ncpus=) and the maximum wall clock time (-l walltime=); the memory (-l mem=) is optional. If you do not specify any values, your job will not be accepted by PBS.
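
Since qsub prints the job identifier on standard output, a script can capture it for later use. A minimal sketch follows, assuming identifiers of the form number.server as in the example above; job_number is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: strip the server suffix from a PBS job identifier,
# e.g. 3212.hattusas.eng.hpc.ku.edu.tr becomes 3212.
job_number() {
    printf '%s\n' "${1%%.*}"
}

# Typical use on the cluster (not run here):
#   job_id=$(qsub myscript)
#   qstat -f "$(job_number "$job_id")"
```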

17 How to submit Jobs
Checking the Status of PBS Jobs
The qstat command is for checking the PBS job status. If you want to display full or long information about the job whose id is 3212, use:
[test@hattusas test]$ qstat -f 3212
Deleting a Job from the Queue
The qdel command is used to delete any job from the queue. Suppose you want to delete a job with the job id 3212, then use:
[test@hattusas test]$ qdel 3212
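
The same commands can be used from a script, for example to wait until a job has left the queue. The sketch below assumes that qstat exits with a non-zero status once the server no longer knows the job; wait_for_job and the default 30 second polling interval are illustrative choices, not part of PBS.

```shell
#!/bin/sh
# Hypothetical helper: poll qstat until the job is no longer listed.
#   $1  job identifier
#   $2  seconds between polls (assumed default: 30)
wait_for_job() {
    job_id=$1
    interval=${2:-30}
    while qstat "$job_id" >/dev/null 2>&1; do
        sleep "$interval"
    done
}

# Typical use on the cluster (not run here):
#   wait_for_job 3212 && echo "job 3212 has left the queue"
```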

18 How to submit Jobs The second way is to submit through the command line using the qsub command, passing the options directly instead of writing them in a script. [test@hattusas test]$ qsub -V -l nodes=muse21 -N myjob Here myjob is the specified job name. The same check and delete commands are valid with this method as well.
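
When no script file is named, qsub reads the job script from standard input, so a one-line job can be submitted without creating a file. The sketch below wraps a command in a small script and pipes it to qsub. The function name submit_command and the resource values are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: submit a single command as a PBS job.
#   $1       job name
#   $2 ...   command to run, with its arguments
submit_command() {
    name=$1
    shift
    # The generated script changes to the submission directory, then runs
    # the command. $PBS_O_WORKDIR is expanded later, inside the job.
    printf '#!/bin/sh\ncd "$PBS_O_WORKDIR"\n%s\n' "$*" |
        qsub -N "$name" -l ncpus=1 -l walltime=1:00:00
}

# Typical use on the cluster (not run here):
#   submit_command myjob ./subrun
```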

19 A Sample PBS Batch Submission Script
#!/bin/csh
#
# file: pbs.template
#
# purpose: template for PBS (Portable Batch System) script
#
# remarks: a line beginning with # is a comment;
#          a line beginning with #PBS is a pbs command;
#          assume (upper/lower) case to be sensitive;
#
# use: submit job with
#      qsub pbs.template
#
# job name (default is name of pbs script file)
#PBS -N myjob
#
# resource limits: number of CPUs to be used
#PBS -l ncpus=25
#
# resource limits: amount of memory to be used
#PBS -l mem=213mb

20 A Sample PBS Batch Submission Script
# resource limits: max. wall clock time during which job can be running
#PBS -l walltime=3:20:00
#
# path/filename for standard output
#PBS -o mypath/my.out
#
# path/filename for standard error
#PBS -e mypath/my.err
#
# queue name, one of {submit, special express}
# The default queue, "submit", need not be specified
#PBS -q submit
#
# group account (for example, g12345) to be charged
#PBS -W group_list=g12345
#
# files to be copied to execution server before script processing starts
# usage: -W stagein=local-filename@remotehost:remote-filename
#PBS -W stagein=my.input@msa01-h:runs/input/my.input
#
# files to be copied from execution server after script processing
# usage: -W stageout=local-filename@remotehost:remote-filename
#PBS -W stageout=my.output@msa01-h:runs/output/my.output

21 A Sample PBS Batch Submission Script
# start job only after MMDDhhmm, where M=Month, D=Day, h=hour, m=minute
# e.g., July 4th, 14:30
#PBS -a 07041430
#
# send me mail when job begins
#PBS -m b
# send me mail when job ends
#PBS -m e
# send me mail when job aborts (with an error)
#PBS -m a
# if you want more than one message, you must group flags on one line,
# otherwise, only the last flag selected executes:
#PBS -mba
#
# do not rerun this job if it fails
#PBS -r n
#
# export all my environment variables to the job
#PBS -V

22 A Sample PBS Batch Submission Script
# Using PBS - Environment Variables
# When a batch job starts execution, a number of environment variables are
# predefined, which include:
#
#   Variables defined on the execution host.
#   Variables exported from the submission host with
#     -v (selected variables) and -V (all variables).
#   Variables defined by PBS.
#
# The following reflect the environment where the user ran qsub:
#   PBS_O_HOST       The host where you ran the qsub command.
#   PBS_O_LOGNAME    Your user ID where you ran qsub.
#   PBS_O_HOME       Your home directory where you ran qsub.
#   PBS_O_WORKDIR    The working directory where you ran qsub.
#
# These reflect the environment where the job is executing:
#   PBS_ENVIRONMENT  Set to PBS_BATCH to indicate the job is a batch job, or
#                    to PBS_INTERACTIVE to indicate the job is a PBS interactive job.
#   PBS_O_QUEUE      The original queue you submitted to.
#   PBS_QUEUE        The queue the job is executing from.
#   PBS_JOBID        The job's PBS identifier.
#   PBS_JOBNAME      The job's name.
###
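
A job script can use the variables listed above to log where it is running and to return to the directory it was submitted from. The sketch below is one possible way to do this; describe_job is a hypothetical helper, and the fallback values only make the script usable outside PBS for testing.

```shell
#!/bin/sh
# Hypothetical helper: print a one-line summary of the current PBS job.
# Each variable falls back to a placeholder when the script runs outside PBS.
describe_job() {
    printf 'job %s (%s) from %s, queue %s\n' \
        "${PBS_JOBID:-none}" \
        "${PBS_JOBNAME:-none}" \
        "${PBS_O_HOST:-localhost}" \
        "${PBS_QUEUE:-none}"
}

# Change to the directory where qsub was run; outside PBS, stay where we are.
cd "${PBS_O_WORKDIR:-$PWD}" || exit 1
describe_job
```

Writing this line at the top of the job's output makes it easier to match an output file to the job that produced it.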

23 END

