Illinois Campus Cluster Program User Forum April 24, 2012 NCSA Room 1030 10:00 AM - 11:00 AM.


1 Illinois Campus Cluster Program User Forum April 24, 2012 NCSA Room 1030 10:00 AM - 11:00 AM

2 Welcome and Agenda
Welcome
General Updates
– Upgrades to the Illinois Campus Cluster Program webpage
– Governance
– Status of the second instance of the Illinois Campus Cluster Program, Golub
"Cluster Basics - Submitting arrays of serial Mathematica jobs," an overview presented by Jonathan Manton
Open Floor for User Discussion
Adjourn

3 Illinois Campus Cluster Program Webpage Update
User Discussion Forum
System Status
Usage/Queue Info
Exchange calendar for events/outages
SSH terminal (Java)
Globus Online interface

4 Governance Update
Executive Governance Committee membership finalized:
– Jonathan Greenberg, Asst. Prof., Geography (Investor Forum chair)
– Mark Neubauer, Asst. Prof., Physics (Investor Forum representative)
– Eric Shaffer, Asst. Dir., Computational Science and Engineering (Investor Forum representative)
– Patty Jones, Assoc. Dir. for Research, Beckman (OVCR representative)
– Chuck Thompson, Asst. Dean and Dir., Engineering IT Shared Services (IT Governance Research Committee representative)
– Neal Merchen, Assoc. Dean for Research, ACES (Office of the Chancellor representative)
– Charley Kline, IT Architect, CITES (CIO's Office representative)
– John Towns, Dir. of Collaborative eScience Programs, NCSA (ex officio, manager of Illinois Campus Cluster operations)
First meeting scheduled for May 1st
Next Investor Forum being planned

5 Golub Status
All hardware is on site as of 4/8/2013
Worked through a power design issue with Dell
During the last PM, Torque/Moab was migrated over to Golub
End-game schedule being developed

6 Cluster Basics
Submitting arrays of serial Mathematica jobs
Jonathan Manton

7 Topics
Intended Audience
Introduction
Conceptual Overview
Job Preparation
PBS File Preparation
Submitting and Monitoring
Post-Processing
Recap

8 Intended Audience
Intended audience is novice users
Also applicable to experienced users who have never used PBS job arrays
If this is not new to you, please provide feedback to improve this tutorial

9 Introduction
This tutorial covers submitting large numbers of identical serial (vs. parallel) jobs to the cluster
Each job does the same work, but on different input parameters or data sets
Procedure developed for Mathematica jobs, but can be easily adapted for MATLAB, C, etc.
This presentation is based on a post from the user forum

10 Conceptual Overview
Create a program that takes arguments from the command line to determine what work to do
Prepare a text file with the appropriate input parameters, each line representing one run of the program
Use the PBS job array functionality to launch many jobs, each with different input parameters
If necessary, consolidate the data at the end
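The four steps above can be rehearsed locally with plain shell, with no scheduler involved. In this sketch the loop index stands in for the PBS array index, and a one-line `echo` stands in for the real program; the file names (`inputs`, `data*`, `outputs.csv`) are illustrative:

```shell
#!/bin/bash
# Local simulation of the job-array pattern (no PBS needed).
# 'inputs' holds one set of space-separated arguments per line;
# the loop index plays the role of the PBS array index.
printf '3 5\n7 11\n' > inputs                 # hypothetical parameters
nlines=$(wc -l < inputs)
for idx in $(seq 1 "$nlines"); do
    params=$(sed -n "${idx}p" inputs)         # line for this "array index"
    set -- $params                            # split into $1, $2, ...
    echo "run $idx: args=$1,$2" > "data$1-$2" # stand-in for the real program
done
cat data* > outputs.csv                       # consolidate at the end
```

Each iteration writes its own uniquely named output file, just as each array job does on the cluster, so the runs never clobber one another.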

11 Job Preparation Mathematica Script

#!/usr/local/Mathematica-8.0/bin/math -script
(* tell the shell to interpret the rest of this file using Mathematica *)
(* grab the command line arguments and put them in Mathematica variables *)
i = ToExpression[$CommandLine[[4]]];
j = ToExpression[$CommandLine[[5]]];
(* multiply the prime before the first argument by the prime after the second *)
f[x_, y_] := NextPrime[x, -1] * NextPrime[y, 1]
(* format result using comma-separated values, to make post-processing easy *)
fmt[x_, y_, result_] := StringJoin[ToString[x], ",", ToString[y], ",", ToString[result]];
(* print to standard output *)
Print[StandardForm[fmt[i, j, f[i, j]]]];

12 Job Preparation Script Permissions and Input File
Need to set the Mathematica script file to be executable:
chmod a+x myscriptname.m
Input file has one line per program run, arguments separated by spaces
Example input text file for an array of 5 jobs:
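The example file itself did not survive the transcript. As a hypothetical stand-in, a five-line input file (one program run per line, two space-separated arguments per run) could be created like this:

```shell
# Create a hypothetical input file for an array of 5 jobs; each line holds
# the two space-separated arguments for one run of the script.
cat > inputs <<'EOF'
10 20
30 40
50 60
70 80
90 100
EOF
wc -l < inputs   # one line per job
```

The actual values are whatever parameter sweep your program needs; only the one-line-per-run, space-separated layout matters to the PBS script.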

13 PBS File Preparation
The arguments at the top of the PBS file:
– Tell the scheduler not to waste resources and to schedule many (12) jobs per node
– Define the size of the job array and the starting and ending job array indices
The shell script at the bottom of the PBS file:
– Figures out which job array index is running
– Grabs the appropriate line from the input data file
– Executes the program with those arguments
– Saves output data in a filename made unique using those arguments
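The bottom-half steps (find the index, grab the line, split the fields) can be tried interactively before submitting anything. Here PBS_ARRAYID is set by hand; inside a real array job, Torque sets it for you (file name `inputs` and its contents are illustrative):

```shell
# Simulate the line-grabbing logic outside the scheduler.
printf 'a b\nc d\ne f\n' > inputs
PBS_ARRAYID=2                               # Torque sets this inside an array job
PARAMS=$(sed -n "${PBS_ARRAYID}p" inputs)   # grab the second line of the file
ST1=$(echo "$PARAMS" | cut -d" " -f1)       # first space-separated field
ST2=$(echo "$PARAMS" | cut -d" " -f2)       # second field
echo "$ST1 $ST2"                            # prints: c d
```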

14 PBS File Preparation Example (1 of 2)

#!/bin/bash
#
# This batch script is an example of running lots of copies of a serial
# job using a single input file with one line per set of input arguments.
#
# Many of our "casual" users want to just run a lot of jobs, not necessarily
# one job with lots of cores.
#
#PBS -l walltime=00:10:00
##
## these tell the scheduler to use one core per job, and to schedule multiple
## jobs per node. On the secondary queue at least, if you don't have the
## naccesspolicy=singleuser argument, you will schedule one job per *node*,
## wasting 11 out of 12 cores for serial jobs
#PBS -l nodes=1:ppn=1
#PBS -l naccesspolicy=singleuser
##
#PBS -N primepair_test
#PBS -q secondary
##
## This says to run with job array indices 3 through 7. These indices are used
## below to get the right lines from the input file
#PBS -t 3-7
##
#PBS -j oe

# CONTINUED NEXT SLIDE…

15 PBS File Preparation Example (2 of 2)

…CONTINUED FROM PREVIOUS SLIDE

## grab the job id from an environment variable and create a directory for the
## data output
export JOBID=`echo "$PBS_JOBID" | cut -d"[" -f1`
mkdir $PBS_O_WORKDIR/"$JOBID"
cd $PBS_O_WORKDIR/"$JOBID"

module load mathematica

## grab the appropriate line from the input file. Put that in a shell variable
## named "PARAMS"
export PARAMS=`cat $HOME/mexample/inputs | sed -n ${PBS_ARRAYID}p`

## grab the arguments, using the linux "cut" command to get the right field
## modify to match the number of arguments you have for your program
export ST1=`echo "$PARAMS" | cut -d" " -f1`
export ST2=`echo "$PARAMS" | cut -d" " -f2`

## Run Mathematica script, directing output to a file named based on the
## input parameters. The assumption is the combination of parameters is unique
## for the job. Modify for the number of parameters you have by adding $ST#.
$HOME/mexample/primepair.m $ST1 $ST2 > data$ST1-$ST2
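The JOBID extraction at the top of this script can be checked by hand; the job number 12345 below is hypothetical:

```shell
# An array job id looks like <number>[].<server>; cutting at the first "["
# leaves the bare job number, which the script uses as a directory name.
PBS_JOBID='12345[].cc-mgmt1'
JOBID=$(echo "$PBS_JOBID" | cut -d"[" -f1)
echo "$JOBID"    # prints: 12345
```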

16 Submitting and Monitoring
Submit as usual:
qsub job-inputarray.pbs
Monitor as usual:
qstat | grep $USER
Note that job arrays have a different job name format
– Instead of a job ID ending in .cc-mgmt1, it will end in [].cc-mgmt1
– Some commands will behave differently; brackets are special characters in bash
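Because `[` and `]` are glob characters in bash and `[` opens a character class in grep patterns, array job ids need quoting or escaping. A small demonstration (the job number 12345 is hypothetical):

```shell
# Array job ids contain brackets, which bash and grep treat specially.
jobid='12345[].cc-mgmt1'
# Always quote the variable so bash does not attempt glob expansion:
echo "$jobid"
# In a grep pattern, escape the "[" so it matches literally:
echo "$jobid" | grep '12345\['
```

The same escaping applies when passing an array job id to commands like qdel from an interactive shell.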

17 Post-Processing
The example PBS script will create a subdirectory with the same name as the job number
Each job will have a different output file in that directory
It is easy to consolidate the small files into one big file using the Linux cat command:
cat <jobnumber>/data* > outputs.csv
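Recreating that layout by hand shows the consolidation step end to end. The job number 12345 is hypothetical; the CSV rows match what the prime-pair script would print for the inputs "3 5" and "7 11":

```shell
# Recreate the per-job output layout the PBS script produces.
mkdir -p 12345
echo '3,5,14' > 12345/data3-5      # NextPrime[3,-1]*NextPrime[5,1] = 2*7
echo '7,11,65' > 12345/data7-11    # NextPrime[7,-1]*NextPrime[11,1] = 5*13
# Consolidate every small per-job file into one CSV:
cat 12345/data* > outputs.csv
wc -l < outputs.csv                # prints: 2
```

Because each job encoded its parameters in its filename, no two jobs write the same file, and cat can safely merge them in one pass.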

18 Recap
Create your program so it takes input from the command line
Put your unique inputs, one per line, in a text file
Submit the job – one instance of the program for each line in the input file
Consolidate your data at the end

19 Open Discussion
This is your opportunity to share your thoughts with the Illinois Campus Cluster Program team.
Next meeting: July timeframe (meet in summer?)

20 Adjourn
Thank you for providing your thoughts.
Please feel free to contact us at

