Quick Tutorial on MPICH for NIC-Cluster CS 387 Class Notes.

1 Quick Tutorial on MPICH for NIC-Cluster CS 387 Class Notes

2 NIC CLUSTER OVERVIEW
• Start page:
• Node Allocation and Usage policy:
The Shared NIC Cluster Hardware and Software
• The NIC cluster had 64-bit nodes with an Ethernet network and eventually an InfiniBand interconnect, with the following standard software suite:
  The Torque/PBS scheduler.
  Compilers: GCC, Intel-9, Intel-10, and Intel-11 Compiler Suites.
  Applications and Libraries listed at
• InfiniBand offers point-to-point bidirectional serial links intended for the connection of processors with high-speed peripherals such as disks. InfiniBand also offers multicast operations.

3 Cluster pictures

4 PBS Job Scripts
• NIC cluster uses PBS (Portable Batch System)
• Why?
• Improves overall system efficiency
• Fair access to all users since it maintains a scheduling policy
• Provides protection against dead nodes

5 How PBS works
• User writes a batch script for the job and submits it to PBS with the qsub command.
• PBS places the job into a queue based on its resource requests and runs the job when those resources become available.
• The job runs until it either completes or exceeds one of its resource request limits.
• PBS copies the job’s output into the directory from which the job was submitted and optionally notifies the user via email that the job has ended.

6 Step 1: Login
• Off-campus machine: connect to campus using MST VPN, then
    > ssh
• On-campus machine (VPN is not required)
    > ssh
• Use the following to set up your MPI path correctly:
    $ module load openmpi/gnu
• To make this your default, run
    $ savemodules
Visit to get information about OpenMP

7 No DFS Support
• DFS files are not directly accessible at the cluster
• Use the sftp command to transfer any files from your DFS space (S: drive), e.g.
    > sftp
    > get X.c
    > quit
• You may also use WinSCP in Windows or Fugu from OS X.

8 Step 2: Compile MPICH Programs
Syntax:
    C:   mpicc -o hello hello.c
    C++: mpiCC -o hello hello.cpp
(hello, the name given after -o, is the executable file)
Note: Before compilation, make sure the MPICH library path is set, or use the export command as below:
    export PATH=/opt/mpich/gnu/bin:$PATH

9 Step 3: Write PBS batch script file
Ex1: A simple script file (pbs_script). A job named “HELLO” requests 8 nodes and at most 15 minutes of runtime.
    #!/bin/bash
    #PBS -N HELLO
    #PBS -l walltime=0:15:00
    #PBS -l nodes=8
    #PBS -q
    mpirun -n 8 /nethome/users/ercal/MPI/hello

10 Some PBS Directive options
    -N jobname (name the job “jobname”)
    -q (the cluster address to send the job to)
    -e errfile (redirect standard error to a file named errfile)
    -o outfile (redirect standard output to a file named outfile)
    -j oe (combine standard output and standard error)
    -l walltime=N (request a walltime of N in the form hh:mm:ss)
    -l cput=N (request N seconds of CPU time, or N in the form hh:mm:ss)
    -l mem=N[KMG][BW] (request a total of N {kilo|mega|giga}{bytes|words} of memory on all requested processors together)
    -l nodes=N:ppn=M (request N nodes with M processors per node)
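
To see how several of these directives combine, here is a minimal sketch that prints a PBS header from four parameters. The function name pbs_header and its argument order are hypothetical (it is not a PBS command), and only options from the list above are used.

```shell
#!/bin/sh
# pbs_header is a hypothetical helper for illustration, not part of PBS.
# Arguments: job name, walltime (hh:mm:ss), number of nodes, processors per node.
pbs_header() {
    name=$1; walltime=$2; nodes=$3; ppn=$4
    printf '#!/bin/bash\n'
    printf '#PBS -N %s\n' "$name"
    printf '#PBS -l walltime=%s\n' "$walltime"
    printf '#PBS -l nodes=%s:ppn=%s\n' "$nodes" "$ppn"
    printf '#PBS -j oe\n'   # combine standard output and standard error
}

# Print a header for a job named HELLO: 15 minutes, 4 nodes, 2 processors each.
pbs_header HELLO 0:15:00 4 2
```

The printed lines can be pasted at the top of a batch script, followed by the mpirun command for the job.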

11 Step 3.1: Submit a Job
• Use PBS command qsub
Syntax:  qsub pbs-job-filename
Example: > qsub pbs_script
returns the message (555 is the job ID that PBS automatically assigns to your job)

12 Result after job completion
• An error file and an output file are created. The names are usually of the form:
    jobfilename.o(jobid)
    jobfilename.e(jobid)
  Ex: simplejob.e555: contains STDERR
      simplejob.o555: contains STDOUT
• With -j oe, standard output and standard error are combined into one file.

13 Ex2: Another sample batch script (pbs_script)
    #PBS -N hello
    #PBS -l mem=200mb
    #PBS -l walltime=0:15:00
    #PBS -l nodes=2:ppn=2
    #PBS -j oe
    #PBS -m abe
    mpirun -n 8 /nethome/users/ercal/MPI/hello
This job “hello” requests 15 minutes of wall-time, 2 nodes using 2 processors each (4 processors), and 200MB of memory (100MB per node; 50MB per processor). Also, the output and error are written to one file, and -m abe sends mail when the job aborts, begins, and ends. How many processes are created?
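
The arithmetic behind Ex2 can be checked with a few lines of shell. This is only a sketch: the variable names are made up for illustration, and the values are copied from the script above. It compares the processors requested through PBS with the process count passed to mpirun -n; how the extra processes are placed depends on the MPI installation.

```shell
#!/bin/sh
# Values copied from the Ex2 script; the variable names are illustrative.
nodes=2        # from: #PBS -l nodes=2:ppn=2
ppn=2
mem_mb=200     # from: #PBS -l mem=200mb
nprocs=8       # from: mpirun -n 8

processors=$((nodes * ppn))                 # processors allocated by PBS
mem_per_node=$((mem_mb / nodes))            # MB available per node
mem_per_proc=$((mem_mb / processors))       # MB available per processor
procs_per_processor=$((nprocs / processors))

echo "processors allocated:    $processors"
echo "memory per node (MB):    $mem_per_node"
echo "memory per processor:    $mem_per_proc"
echo "MPI processes requested: $nprocs ($procs_per_processor per processor)"
```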

14 Tools
• qstat (jobid)
• qstat -u (userid)
• qstat -a
This command returns the status of specified job(s).
qdel (jobid)
This command deletes a job.
size executable_file_name
gives output in the following format:
       text    data     bss     dec     hex filename
    1650312   71928 6044136 7766376  768168 hello
(This can help to check memory requirements before submitting a job)
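
As a small sketch of how the size output can be used, the helper below adds the text, data, and bss columns, which together make up the dec column. The name static_footprint is hypothetical, and the sample input is the line shown on the slide rather than output from a real run.

```shell
#!/bin/sh
# static_footprint is a hypothetical helper: it reads `size` output on stdin
# and prints text + data + bss (in bytes) from the first data row.
static_footprint() {
    awk 'NR == 2 { print $1 + $2 + $3 }'
}

# Sample input copied from the slide; prints 7766376, matching the dec column.
printf '%s\n' \
    '   text    data     bss     dec     hex filename' \
    '1650312   71928 6044136 7766376  768168 hello' | static_footprint
```

On the cluster the same helper would be fed directly, as in: size hello | static_footprint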

15 Tips for programming in MPICH
• Use compiler flags for faster and cleaner code. Some of them are:
  • -O2 (moderate optimization)
  • -funroll-loops (enables loop unrolling optimizations)
  • -Wall (enables all common warnings)
  • -ansi (enables ANSI C/C++ compliance)
  • -pedantic (enables strictness of language compliance)
• Avoid using pointers in your program, unless absolutely necessary.

16 Tips (cont.)
• No scanf allowed in MPICH (fscanf is allowed). Instead, pass input to your program using the command line arguments argc and argv in the mpirun command of the PBS script.
  Example: mpirun -n 8 /nethome/users/ercal/MPI/cpi …arguments…
• To give your job a high priority, set wall-time ≤ 15 minutes (#PBS -l walltime=0:15:00) and number of nodes ≤ 32 (#PBS -l nodes=16:ppn=2)
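
The wall-time part of the priority tip can be checked before submitting. The sketch below converts an hh:mm:ss value to seconds and compares it with the 15-minute limit quoted above. Both function names are hypothetical, and the node-count part of the rule is not covered.

```shell
#!/bin/sh
# Convert a walltime in hh:mm:ss form to seconds.
# awk reads fields such as "08" as decimal, so leading zeros are safe.
walltime_seconds() {
    printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Succeed when the walltime is at most 15 minutes (900 seconds),
# the limit quoted in the tip above.
within_priority_walltime() {
    [ "$(walltime_seconds "$1")" -le 900 ]
}

if within_priority_walltime 0:15:00; then
    echo "0:15:00 is within the 15-minute limit"
fi
```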
