PBSpro Advanced
Information Systems & Technology, Advanced Campus Services
Prepared by Chao "Bill" Xie, PhD student, Computer Science
Fall 2005



Using PBSpro Advanced, IS&T Advanced Campus Services

Syllabus
- Environment Variables
- Checkpointing

Environment variables
- Taken from the user's environment
- Created by PBS
- Created by users
- Variables created by PBS have names that start with "PBS_"
- Some names start with "PBS_O_", indicating that the variable comes from the job's originating environment

Important variables
- PBS_O_HOME: value of HOME from the submission environment
- PBS_O_HOST: host name on which the qsub command was executed
- PBS_O_PATH: value of PATH from the submission environment
- PBS_O_QUEUE: original queue name to which the job was submitted
- PBS_O_SHELL: value of SHELL from the submission environment
- PBS_O_SYSTEM: operating system name where qsub was executed
- PBS_O_WORKDIR: absolute path of the directory where qsub was executed
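A common use of these variables is to return to the submission directory at the top of a job script, since PBS typically starts a job in the user's home directory. The following is a minimal sketch, assuming bash; the function name enter_submit_dir and the fallback to the current directory outside PBS are illustrative choices, not part of PBS.

```shell
#!/bin/bash
# Change to the directory where qsub was run. Fall back to the current
# directory when the script runs outside PBS (an assumption made so the
# sketch can be tried without a scheduler).
enter_submit_dir() {
    local dir="${PBS_O_WORKDIR:-$PWD}"
    cd "$dir" || return 1
    echo "Submitted from host: ${PBS_O_HOST:-unknown}"
    echo "Working directory:   $PWD"
}

enter_submit_dir
```

Calling the function before any other work lets relative paths in the job script resolve against the submission directory.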

Important variables (cont. 1)
- PBS_DEFAULT: name of the default PBS server
- PBS_ENVIRONMENT: indicates the job type, PBS_BATCH or PBS_INTERACTIVE
- PBS_JOBID: job identifier assigned to the job or job array
- PBS_JOBNAME: job name supplied by the user
- PBS_MOMPORT: port number on which this job's MOMs will communicate
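A script can branch on the job type using PBS_ENVIRONMENT. The following is a rough sketch, assuming bash; the function name job_mode and the "outside-pbs" label for an unset variable are hypothetical.

```shell
#!/bin/bash
# Report how the script is being run, based on PBS_ENVIRONMENT.
job_mode() {
    case "${PBS_ENVIRONMENT:-}" in
        PBS_BATCH)       echo "batch" ;;
        PBS_INTERACTIVE) echo "interactive" ;;
        *)               echo "outside-pbs" ;;  # variable unset or unexpected
    esac
}

echo "Mode: $(job_mode), job id: ${PBS_JOBID:-none}, job name: ${PBS_JOBNAME:-none}"
```

This kind of check is useful when the same script is run both by hand and through qsub, for example to skip prompts in batch mode.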

Important variables (cont. 2)
- PBS_NODEFILE: name of the file containing the list of nodes assigned to the job
- PBS_NODENUM: logical node number of this node within the job's allocation
- PBS_QUEUE: name of the queue from which the job is executed
- PBS_TASKNUM: task (process) number for the job on this node
- TMPDIR: job-specific temporary directory for this job
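The node file named by PBS_NODEFILE is plain text with one host name per line, and a host may appear more than once. The sketch below, assuming bash and standard grep and sort, counts the entries and the distinct hosts; the function name summarize_nodes is hypothetical.

```shell
#!/bin/bash
# Print the number of node-file entries and the number of distinct hosts.
summarize_nodes() {
    local nodefile="$1"
    if [ ! -r "$nodefile" ]; then
        echo "node file not readable: $nodefile" >&2
        return 1
    fi
    local entries hosts
    entries=$(grep -c . "$nodefile")            # non-empty lines
    hosts=$(sort -u "$nodefile" | grep -c .)    # distinct host names
    echo "entries=$entries hosts=$hosts"
}

# Inside a job, PBS sets PBS_NODEFILE; outside a job nothing is printed.
if [ -n "${PBS_NODEFILE:-}" ]; then
    summarize_nodes "$PBS_NODEFILE"
fi
```

The entry count is often what a parallel launcher needs as its process count, while the distinct-host count is useful for per-node setup steps.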

Checkpointing
- Two methods of checkpoint/restart:
  - OS-specific method (SGI IRIX and Cray UNICOS)
  - Generic site-specific method
- Specify the checkpointing directory with one of:
  - "-C path" command-line option to pbs_mom
  - PBS_CHECKPOINT_PATH environment variable
  - "$checkpoint_path path" option in MOM's config file
  - default value

Checkpointing (cont.)
- Manually checkpointing a job
  - Use the qhold command
- Checkpointing jobs during PBS shutdown
  - Append the "-t immediate" option to the qterm statement in the PBS start/stop script
- Suspending/checkpointing multi-node jobs
  - Save the complete session state in a file
  - An open socket will cause the operation to fail

Site-specific method
- Modify the file mom_priv/config
- "Periodic" job checkpoint action (during job execution):
    $action checkpoint TIME_OUT SCRIPT_PATH ARGS [...]
- Checkpoint just before the job is to be terminated:
    $action checkpoint_abort TIME_OUT SCRIPT_PATH ARGS [...]
- Job restart action:
    $action restart TIME_OUT SCRIPT_PATH ARGS [...]

Site-specific method (cont.)
- $restart_background (true|false)
  - A boolean flag that modifies how MOM performs a restart
  - "false" (the default): MOM runs the restart operation and waits for the result
  - "true": restart operations are done by a child of MOM, which returns only when all the restarts for all the local tasks of a job are done, while the parent (main) MOM continues processing without being blocked
- $restart_transmogrify (true|false)
  - A boolean flag that controls how MOM launches the restart script/program
  - "false" (the default): MOM runs the restart script and blocks until the restart operation is complete
  - "true": MOM runs the restart script/program in such a way that the script "becomes" the task it is restarting

Specify checkpoint in job
- The "-c interval" option to qsub defines the checkpoint interval (in minutes)
- The interval argument is specified as:
  - n: no checkpointing is to be performed
  - s: checkpointing is to be performed only when the server executing the job is shut down
  - c: checkpointing is to be performed at the default minimum time for the server executing the job
  - c=minutes: checkpointing is to be performed at an interval of the given number of minutes
  - u: checkpointing is unspecified, resulting in the same behavior as "s"
- If "-c" is not specified, the checkpoint attribute is set to the value "u"
- Example: qsub -c c=10 myjob
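The accepted forms of the "-c" argument can be checked before a job is submitted. The following is a minimal sketch, assuming bash; the function names valid_checkpoint_spec and checkpoint_submit_command are hypothetical, and requiring a positive whole number after "c=" is an assumption rather than something the slides state. The command is printed instead of executed so the sketch can be tried on a machine without PBS.

```shell
#!/bin/bash
# Validate a checkpoint specification for "qsub -c" against the values
# listed on the slide: n, s, c, c=minutes, u.
valid_checkpoint_spec() {
    local spec="$1"
    case "$spec" in
        n|s|c|u) return 0 ;;
        c=*)
            # Assumption: minutes must be a positive whole number.
            [[ "${spec#c=}" =~ ^[1-9][0-9]*$ ]]
            return
            ;;
        *) return 1 ;;
    esac
}

# Print the qsub command line rather than running it.
checkpoint_submit_command() {
    local spec="$1" script="$2"
    if ! valid_checkpoint_spec "$spec"; then
        echo "invalid checkpoint spec: $spec" >&2
        return 1
    fi
    echo "qsub -c $spec $script"
}

checkpoint_submit_command c=10 myjob
```

To submit for real, the echoed command could be run directly in a shell on a host where qsub is available.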

References
- PBS Professional 7 Quick Start
- PBS Professional 7 User Guide
- PBS Professional 7 Administration Guide

Thank you! Contacts: Bill Victor