Globus Job Management



A: GRAM
B: Globus Job Commands
C: Laboratory: globusrun

A: GRAM

GRAM: What is it?
Given a job specification, GRAM can:
- Create an environment for the job
- Stage files to/from that environment
- Submit the job to a local scheduler
- Monitor the job
- Send job state change notifications
- Stream the job's stdout/err during execution

GRAM: Some Terminology
We speak loosely most of the time, but strictly speaking:
- Globus Job Management Service: starts up and monitors jobs, and stages data in and out
- GRAM: the protocol used to communicate with the job management service
We often say "GRAM" as a shorthand for either of these.

GRAM: How Does it Work?
[Architecture diagram: a GRAM client sends a request to the head node, a.k.a. the "gatekeeper," of a compute resource. The Gatekeeper authenticates and authorizes the request, then starts a Job Manager, which submits the job to the Local Resource Manager and monitors it. Results flow back to the client.]

GRAM: What is a "Local Resource Manager"?
It's usually a batch system that allows you to run jobs across a cluster of computers. Examples:
- Condor
- PBS
- LSF
- Sun Grid Engine
Most systems also allow you to access "fork":
- It's the default
- It runs jobs directly on the gatekeeper: a bad idea in general, but okay for testing

GRAM: RSL
The client describes the job with the Resource Specification Language (RSL):

  & (executable = a.out)
    (directory = /home/nobody)
    (arguments = arg1 "arg 2")

You don't usually need to write RSL directly, unless you have special needs.
http://www.globus.org/gram/rsl_spec1.html
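
A slightly fuller sketch, for illustration only: stdout, stderr, and count are standard RSL 1.0 attributes, but the paths and executable name here are made up.

  & (executable = /home/nobody/a.out)
    (directory = /home/nobody)
    (arguments = arg1 "arg 2")
    (stdout = /home/nobody/job.out)
    (stderr = /home/nobody/job.err)
    (count = 1)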

GRAM: Security
- GRAM uses GSI for security
- Submitting a job requires a full proxy
- The remote system and your job will get a limited proxy
- The job will run, because you had a full proxy when you submitted
- But your job cannot submit other jobs
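
Before submitting, you create the full proxy with grid-proxy-init and can inspect it with grid-proxy-info; the distinguished name and output below are illustrative, not verbatim.

  % grid-proxy-init
  Your identity: /O=Grid/OU=Example/CN=Your Name
  Enter GRID pass phrase for this identity:
  Creating proxy .............................. Done

  % grid-proxy-info
  (shows the proxy's subject, type, and remaining lifetime)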

Making Your Job Batch Ready
- Must be able to run in the background: no interactive input, windows, GUI, etc.
- Can still use STDIN, STDOUT, and STDERR, but files are attached to these instead of the keyboard and screen
- Organize your data files
- Must tolerate being run multiple times: a run may be interrupted partway through and restarted
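
For example, a program that normally reads the keyboard and writes the screen can have files attached instead through RSL; stdin, stdout, and stderr are standard RSL attributes, and the paths here are made up.

  & (executable = /home/nobody/filter)
    (stdin = /home/nobody/input.txt)
    (stdout = /home/nobody/output.txt)
    (stderr = /home/nobody/errors.txt)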

GRAM: Basic Usage

  globus-job-run hostX /bin/hostname

- This runs /bin/hostname on hostX
- It expects /bin/hostname to already be there

  globusrun -o -r hostX '&(executable=/bin/echo)(arguments=Hello Grid)'

- The quoted string is the RSL
- We could specify lots of things here, but we didn't

Both of these ran with the fork job manager, not an "interesting" batch system.
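
A session might look like the following; hostX and the printed hostname are placeholders.

  % globus-job-run hostX /bin/hostname
  hostX.example.edu

  % globusrun -o -r hostX '&(executable=/bin/echo)(arguments=Hello Grid)'
  Hello Grid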

GRAM: Running on a Batch System
Append the batch system's job manager name to the hostname:

  globus-job-run hostX/jobmanager-condor /bin/hostname

- You will do this for most real work
- The batch system can handle many more jobs
- Batch systems are reliable and track your jobs
- Fork is not reliable, and your job may be lost
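
The same pattern applies to other schedulers. The names below follow the usual jobmanager-* convention, but which job managers are actually configured varies from site to site.

  globus-job-run hostX/jobmanager-pbs /bin/hostname
  globus-job-run hostX/jobmanager-lsf /bin/hostname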

B: Globus Job Commands

Globus Job Commands
- globus-job-run <resource-contact> <command>: run a job and wait for its output
- globus-job-submit <resource-contact> <command>: submit a job in batch mode; prints a job contact string
- globus-job-status <job-contact>: query the job's state
- globus-job-get-output <job-contact>: retrieve the job's output
- globus-job-clean <job-contact>: clean up after the job
- globusrun: the lower-level command; takes RSL directly
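
A typical batch-style session tying these together; the job contact URL shown is illustrative, since the real one is printed by globus-job-submit.

  % globus-job-submit hostX/jobmanager-condor /bin/hostname
  https://hostX:39001/12345/1234567890/
  % globus-job-status https://hostX:39001/12345/1234567890/
  DONE
  % globus-job-get-output https://hostX:39001/12345/1234567890/
  hostX.example.edu
  % globus-job-clean https://hostX:39001/12345/1234567890/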

Lab: globusrun

Lab: globusrun
In this lab, you'll:
- Set up your environment for job submission
- Submit simple jobs with globus-job-run and globus-job-submit
- Use globusrun & RSL
- Stage data with globusrun & RSL
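
As a preview of the staging exercise: globusrun can stage files through its built-in GASS server. The -s flag and the $(GLOBUSRUN_GASS_URL) RSL substitution are standard globusrun features, though the paths below are made up.

  globusrun -s -r hostX '&(executable=$(GLOBUSRUN_GASS_URL)/home/me/myprog)(stdout=$(GLOBUSRUN_GASS_URL)/home/me/out.txt)'

This ships the local executable to the remote side and writes the job's stdout back to a local file.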

Credits
Portions of this presentation were adapted from the following sources:
- Jaime Frey, Condor Group, UW-Madison