Network for Computational Nanotechnology (NCN) Purdue, Norfolk State, Northwestern, UC Berkeley, Univ. of Illinois, UTEP Advanced Portable Batch System.


Advanced Portable Batch System (PBS)
Xufeng Wang, Kaspar Haume, Gerhard Klimeck
Network for Computational Nanotechnology (NCN): Purdue, Norfolk State, Northwestern, UC Berkeley, Univ. of Illinois, UTEP
Electrical and Computer Engineering
Last reviewed May 2013

Overview of Education Materials
Introduction to computing clusters [Done. Summer, 2009]: Fundamentals of computers, clusters. Concept of massive computation via cluster resources.
Introduction to Subversion (originated from “Data preservation via SVN for NCN students”) [Done. Fall, 2008]: Data preservation. Subversion. SVN clients on Windows and Mac. Data storage system. Project accesses. Front-end machine access [by Ben Haley].
Basic Portable Batch System [Done. Summer, 2009, Review May 2013]: PBS queue system. Basic manipulations.
Advanced Portable Batch System [Done. Summer/Fall, 2009]: PBS queue system. Advanced manipulations.

From Basic Portable Batch System
Definition of Portable Batch System (PBS)
Composition of a PBS script
PBS job submission (qsub)
PBS queue related commands (qstat)
Simple PBS job manipulation (qdel, qselect, etc.)

Outline
Advanced manipulation of PBS jobs in queue
Batch jobs and job array
Job dependencies
Passing variables to jobs

Right after the morning coffee…
Boss: “Deadline is today! Run program A, B, and C now!”

Hold a job
Boss: “Oh wait, hold program C! I need to check first!”
Pausing a job in PBS is called a “hold”. The job can be running or queued.
A job being “held” basically means its execution stops. It no longer uses any CPU, and its state is preserved, which means it can resume from the same point later.
How to pause a job in PBS? qhold

Release a job
Boss: “Never mind. Go ahead with program C.”
Un-pausing a job in PBS is called a “release”.
If a previously queued job was held and then released, it returns to the “queued (Q)” state; if it was running (R), it will be in the “waiting (W)” state upon release.
How to un-pause a job in PBS? qrls
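A minimal sketch of hold and release applied to several jobs at once. It assumes Torque-style qselect, qhold, and qrls are on the PATH; the two function names are hypothetical helpers, not PBS commands.

```shell
#!/bin/bash
# Hold every queued job owned by the current user, and release held jobs later.
# Assumption: Torque-style qselect/qhold/qrls; function names are illustrative.

hold_my_queued_jobs() {
    # qselect -s Q prints the IDs of jobs in the queued state, one per line
    for job in $(qselect -u "$(id -un)" -s Q); do
        qhold "$job"
    done
}

release_my_held_jobs() {
    # qselect -s H prints the IDs of jobs in the held state
    for job in $(qselect -u "$(id -un)" -s H); do
        qrls "$job"
    done
}
```

For a single job, calling qhold or qrls directly with the job ID reported by qstat does the same thing.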

Move and modify a job
Boss: “Hmm! Job A will take about 6 hours to run!”
Moving a job to another queue and modifying its attributes can only be applied to “queued” or “held” jobs, not to others such as “running” jobs.
How to move a job to a new queue in PBS? qmove
How to modify a queued job’s walltime? qalter

Reorder a job
Boss: “Sorry! I meant job B, not A.”
In PBS, you can only exchange the queue order of two jobs. You cannot “squeeze” a job into a certain position; you have to “swap” it with another job.
You can only reorder jobs that are either “queued” or “held”.
How to reorder jobs in PBS? qorder
[Slide screenshot: the first qorder attempt ends in an error.]

Reorder a job
A PBS reorder swaps the jobs, not their queues! Queues are like seats; they do not move when two people switch seats.
The 10-hour walltime is not acceptable in the standby queue. That is what the error means. In this case, we have to modify the walltime first, and then reorder.
How to reorder jobs in PBS? qorder
[Slide screenshot: the final, successful qorder.]
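The "modify the walltime first, then reorder" sequence can be sketched as below. This assumes Torque-style qalter and qorder; the function name, the job IDs, and the new walltime are placeholders, and the walltime limit of any particular queue is site-specific.

```shell
#!/bin/bash
# Shorten one job's walltime, then swap its position with another job.
# Assumption: Torque-style qalter/qorder; alter_then_swap is a hypothetical helper.

alter_then_swap() {
    job_a="$1"      # job whose walltime must be reduced before the swap
    job_b="$2"      # job to exchange positions with
    walltime="$3"   # new walltime for job_a, written as HH:MM:SS

    # qalter changes attributes of a queued or held job
    qalter -l walltime="$walltime" "$job_a" || return 1
    # qorder exchanges the positions of the two jobs
    qorder "$job_a" "$job_b"
}
```

A call would look like `alter_then_swap <jobA> <jobB> 04:00:00`, with both job IDs taken from qstat.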

Summary
Advanced manipulation of PBS jobs in queue

Command  Usage                               Q jobs?  R jobs?  H jobs?
qhold    Hold a queued/executing PBS job     Yes      Yes      No
qrls     Release a held PBS job              No       No       Yes
qmove    Move jobs between queues            Yes      No       Yes
qalter   Alter the attributes of a PBS job   Yes      No       Yes
qorder   Reorder PBS jobs                    Yes      No       Yes

Job array
Boss: “Good! Sweep program D. With inputs 1:1:100!”
Often we need to sweep a certain parameter of a program, thus creating an “array” of similar but independent jobs.
This can be achieved by writing a shell script that generates the PBS scripts one by one, or by some other “pre-processing” method.
PBS has built-in support for such a batch of similar jobs. This concept is called a “job array” in PBS.
How to sweep jobs in PBS? #PBS -t

Job array
The key characteristic of a job array is “different parameters, but same executable”.
[Diagram: one PBS script with #PBS -t expands into the PBS job array Job[1], Job[2], Job[3], Job[4], …, which run with Input=1, Input=2, Input=3, Input=4, … respectively.]

Job array
How to form a job array: #PBS -t
#PBS -t 1-10 creates 10 instances of the job, indexed 1 to 10.
The environment variable PBS_ARRAYID holds the index of each array element.
Each job runs with the commands specified in the script, meaning that if procs had been set to 10, then each job will run with 10 cores.
In this case job[1] will run helloworld_1 and write its output to output_1.txt.
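The slide's script is only shown as a screenshot, so the following is a sketch of what such an array script might look like rather than the original. The job name, the walltime, and the fallback index of 1 (so the script can be tried outside PBS) are assumptions.

```shell
#!/bin/bash
#PBS -N helloworld_sweep
#PBS -l walltime=00:05:00
#PBS -t 1-10

# PBS sets PBS_ARRAYID to this element's index. Fall back to 1 (an assumption
# made here) so the script can also be tried outside PBS.
index="${PBS_ARRAYID:-1}"
program="./helloworld_${index}"
output="output_${index}.txt"

# PBS_O_WORKDIR is the directory qsub was run from; use "." outside PBS.
cd "${PBS_O_WORKDIR:-.}" || exit 1

if [ -x "$program" ]; then
    "$program" > "$output"
else
    # No executable found: only report what would have been run
    echo "would run $program > $output"
fi
```

Submitting this file once with qsub should queue all ten elements, each seeing its own PBS_ARRAYID.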

Job array
Submit the job array: #PBS -t
Array jobs share one job ID plus an index, like array values in MATLAB.
Use qstat with the option -t to see the individual jobs.
The job ID for the entire job array has to contain empty brackets.
To refer to one of the jobs, put its index in the brackets, for example qstat -f [1].

Job array
PBS job array extensions: #PBS -t
By using a table, or some other method of relating the index to a set of parameters, users gain great flexibility in batch job inputs.
[Diagram: one PBS script with #PBS -t expands into Job[1], Job[2], Job[3], Job[4], …, and each index selects a row of a parameter table:]

Index  Parameters
1      A=1, B=4.5
2      A=4, C=5.4
3      D=3, C=1
4      A=0, C=2

Job array
PBS job array extensions: #PBS -t
An example of relating ARRAYID to job input parameters. Here myProgram takes an input file and two arguments, all determined by the array ID.
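The slide's own example is a screenshot, so here is one possible way to relate the array ID to an input file and two arguments. The lookup function, the file names, and the table values are all made up for illustration; only myProgram and PBS_ARRAYID come from the slides.

```shell
#!/bin/bash
#PBS -t 1-4

# Look up the row for one array index: input file, first argument, second argument.
# The file names and values below are illustrative only.
params_for_index() {
    case "$1" in
        1) echo "input_1.dat 1 4.5" ;;
        2) echo "input_2.dat 4 5.4" ;;
        3) echo "input_3.dat 3 1" ;;
        4) echo "input_4.dat 0 2" ;;
        *) return 1 ;;   # unknown index
    esac
}

# Fall back to index 1 outside PBS (an assumption for trying the script locally)
index="${PBS_ARRAYID:-1}"

# Split the row into positional parameters $1 $2 $3
set -- $(params_for_index "$index")

# Printed rather than executed; drop the echo to really run myProgram
echo "myProgram $1 $2 $3"
```

Keeping the table inside the script means one file fully describes the sweep; a separate parameter file read by line number would work as well.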

Job dependency
Boss: “Post-process the results from D with E. If D fails, run F.”
PBS, as a “Batch System” manager, can arrange the execution order of a series of jobs and decide which to run based on the outcome of others. This is called “job dependency”.
The user can specify a list of jobs to run under different execution conditions. This allows the user to submit all these jobs at once and leave the rest to PBS.
How to stage one job’s execution upon the completion of another? #PBS -W

Job dependency
Specify job dependencies: #PBS -W depend
“#PBS -W depend=” is the line for specifying a dependency.
Immediately following it is the dependency condition, which in this case is “after” (after the job has begun executing).
Immediately following the condition is the job ID of the depended job. In this case, it is a job array.
For the dependency condition of failed execution, “afternotok” is the keyword.
[Slide screenshots: the job dependency specifications in submit_E.pbs and submit_F.pbs.]

Job dependency
Specify job dependencies: #PBS -W depend
All jobs with dependencies start in the “hold” state. Once a job’s dependency condition is met, its state changes to “queued” and it starts executing as soon as possible.
A job might remain in the queue forever, or be removed, if its depended job is lost, deleted, or can never satisfy the condition. You have to be careful with these leftovers.
[Slide screenshots: queue status before D finishes, where the dependent jobs are held because the condition is not yet met, and after D finishes.]

Job dependency (after)
Specify job dependencies: #PBS -W depend

Condition    This job may begin…
after        After depended job has started execution
afterok      After depended job has successfully terminated
afternotok   After depended job has terminated with errors
afterany     After depended job has terminated with or without errors

With many types of dependency conditions available, the user is able to schedule the execution of jobs upon the outcomes of others, and thus build a complicated network of jobs with deep and nested dependencies.

Job dependency: example (after)
Specify job dependencies: #PBS -W depend
The standard output of qsub is the job ID. Let’s use that.
Write a shell script (here newjob.pbs).
Make it executable and run it as a batch file.
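A sketch of the boss's request ("post-process D with E; if D fails, run F") built on the job ID that qsub prints. The file name submit_D.pbs and the function name are assumptions; submit_E.pbs and submit_F.pbs follow the earlier slide. The sketch treats D as a single job, and dependencies on job arrays may need different condition keywords depending on the PBS flavor.

```shell
#!/bin/bash
# Submit D, then E (runs only if D succeeds) and F (runs only if D fails).
# Assumption: submit_D.pbs exists; submit_chain is a hypothetical helper.

submit_chain() {
    # qsub prints the new job's ID on standard output; capture it
    id_d=$(qsub submit_D.pbs) || return 1

    # E may start only after D has terminated successfully
    qsub -W depend=afterok:"$id_d" submit_E.pbs

    # F may start only after D has terminated with errors
    qsub -W depend=afternotok:"$id_d" submit_F.pbs
}
```

Calling submit_chain from the directory holding the three scripts queues everything in one go; whichever of E and F can never satisfy its condition is left for PBS, or the user, to clean up.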

Job dependency (before)
Specify job dependencies: #PBS -W depend

Condition     Depended job may begin…
before        When this job has begun execution
beforeok      When this job has terminated successfully
beforenotok   When this job has terminated with errors
beforeany     When this job has terminated with or without errors

It is also possible to tell a job to run before another job. This can be useful if many jobs should run before a given job.
The commands are like those for after.
Example on next slide.

Job dependency: example (before)
Specify job dependencies: #PBS -W depend
The job that should run after a series of jobs must have the line #PBS -W depend=on:count, where count is the number of jobs that this job depends on. Submit it and note the job ID.
Each of the count other jobs that should run first then gets one of the before conditions listed on the previous slide, together with the job ID that the depended job returned.
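The on:count and before pairing might be scripted as below. This is a sketch under the assumption of Torque-style qsub; the function name and all script file names are placeholders, and beforeok is just one of the four before conditions.

```shell
#!/bin/bash
# Submit one final job that waits for several earlier jobs to finish.
# Assumption: Torque-style qsub; submit_fan_in is a hypothetical helper.
# Usage: submit_fan_in final.pbs first.pbs second.pbs ...

submit_fan_in() {
    final_script="$1"
    shift
    count=$#    # number of jobs the final job depends on

    # The final job is submitted first so that its ID is known
    final_id=$(qsub -W depend=on:"$count" "$final_script") || return 1

    for script in "$@"; do
        # Each earlier job satisfies one dependency of the final job
        # once it has terminated successfully
        qsub -W depend=beforeok:"$final_id" "$script"
    done
}
```

The final job goes in first because each before condition needs its ID.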

Passing Environment Variables to Job
qsub -v
With the option -v it is possible to pass variables from the command line.
As with all qsub options, this may also be done in the PBS script:
#PBS -v var1="5",var2="1",var3="data"
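On the receiving side, the job script sees the passed values as ordinary environment variables. A minimal sketch, in which the default values and the file name job.pbs are assumptions added so the script also runs outside PBS:

```shell
#!/bin/bash
# Job script side: read values passed with qsub -v.
# Possible submission (job.pbs is a placeholder name):
#   qsub -v var1=5,var2=1,var3=data job.pbs

# Use the passed values, or fall back to defaults when a variable is unset
var1="${var1:-0}"
var2="${var2:-0}"
var3="${var3:-none}"

echo "var1=$var1 var2=$var2 var3=$var3"
```

Defaults like these make it easy to test the script by hand before handing it to the queue.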

Looking for help
man
If something is not clear or does not work, I encourage you to look up the function, for example man qsub or man qstat.
Some websites with examples
The guides for the clusters
Your neighbor