GridShell + Condor: How to Execute 1 Million Jobs on the TeraGrid. Jeffrey P. Gardner, Edward Walker, Miron Livny, Todd Tannenbaum, and the Condor Development Team.

Presentation transcript:

GridShell + Condor: How to Execute 1 Million Jobs on the TeraGrid. Jeffrey P. Gardner (Pittsburgh Supercomputing Center), Edward Walker (Texas Advanced Computing Center), Miron Livny, Todd Tannenbaum, and the Condor Development Team (University of Wisconsin), Ryan Scranton and Andrew Connolly (University of Pittsburgh).

Scientific Motivation. Example: Astronomy. Astronomy increasingly relies on large surveys containing hundreds of millions of objects. Analyzing these datasets frequently means performing the same analysis task on more than 100,000 objects. Each object may take several hours of computing, and the amount of computing time required may vary, sometimes dramatically, from object to object.

Requirements. Schedulers at each TeraGrid site should be able to gracefully handle ~1000 single-processor jobs at a time. A metascheduler must distribute jobs across the TeraGrid and be able to handle ~100,000 jobs.

Solution: PBS/LSF? In theory, existing TeraGrid schedulers like PBS or LSF should provide the answer. In practice, this does not work: TeraGrid nodes are multiprocessor, yet only one PBS job may run per node, and TeraGrid machines frequently restrict the number of jobs a single user may run. Asking for many processors at once, on the other hand, would communicate your actual resource requirements to the scheduler.

Solution: Clever shell scripts? We could submit a single PBS job that uses many processors and have a shell script farm serial work out to them. Now we have a reasonable number of PBS jobs, and our scheduling priority would reflect our actual resource usage. This still has problems: each work unit takes a different amount of time to run, so we use our resources inefficiently (see the sketch below).
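To make that inefficiency concrete, here is a minimal sketch of such a packing script, assuming a hypothetical worklist.txt of object names and a hypothetical analyze_object executable (neither appears in the original talk):

    #!/bin/sh
    #PBS -l nodes=1:ppn=16
    #PBS -l walltime=08:00:00
    # Hypothetical packing script: split the work list into 16 static chunks
    # and process each chunk serially on one processor.
    cd $PBS_O_WORKDIR
    NPROCS=16
    NLINES=`wc -l < worklist.txt`
    split -l `expr \( $NLINES + $NPROCS - 1 \) / $NPROCS` worklist.txt chunk.

    for chunk in chunk.*; do
        # each chunk runs in the background, tying up one processor
        ( while read obj; do ./analyze_object "$obj"; done < "$chunk" ) &
    done
    # the PBS job holds all 16 processors until the SLOWEST chunk finishes:
    # that is exactly the inefficiency described above
    wait

Because the chunks are assigned statically, processors that draw fast work units sit idle while the slowest chunk runs to completion.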

Requirements, revisited. The metascheduler must distribute jobs across the TeraGrid and be able to handle ~100,000 jobs, but the TeraGrid has no metascheduler.

Metascheduler Solution: Condor-G? Condor-G will schedule an arbitrarily large number of jobs across multiple grid resources using Globus. However, 1 serial Condor-G job = 1 PBS job, so we are left with the same PBS limitations as before: TeraGrid nodes are multiprocessor yet allow only one PBS job per node, and TeraGrid machines frequently restrict the number of jobs a single user may run.
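For reference, a Condor-G submission looks roughly like an ordinary Condor submit description that targets a Globus gatekeeper. The sketch below is only illustrative: the gatekeeper path and executable are assumptions, and it uses the grid-universe syntax (earlier Condor-G releases spelled this universe = globus with a globusscheduler line):

    # Hypothetical Condor-G submit description
    universe      = grid
    grid_resource = gt2 tg-login.ncsa.teragrid.org/jobmanager-pbs
    executable    = analyze_object
    arguments     = object_$(Process)
    output        = obj_$(Process).out
    error         = obj_$(Process).err
    log           = condorg.log
    queue 100000

Every one of those 100,000 queued jobs becomes its own PBS job at the remote site, so the per-user job limits and the one-job-per-node waste described above remain.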

The Real Solution: Condor + GridShell. The real solution is to submit one large PBS job on each TeraGrid machine, then use a private scheduler to manage serial work units within that PBS job. Vocabulary: JOB (n): a thing that is submitted via Globus or PBS. WORK UNIT (n): an independent unit of work (usually serial), such as the analysis of a single astronomical object. [Diagram: inside the PBS job, a private scheduler hands serial work units to the processing elements (PEs).]

The Real Solution: Condor + GridShell (continued). In this scheme, Condor plays the role of the private scheduler and GridShell launches and manages it inside the PBS job.

Condor Overview. Condor was first designed as a CPU-cycle harvester for workstations sitting on people’s desks. It is designed to schedule large numbers of jobs across a distributed, heterogeneous, and dynamic set of computational resources.

Advantages of Condor. The Condor user experience is simple. Condor is flexible: resources can be any mix of architectures, and they need neither a common filesystem nor common user accounting. Condor is dynamic: resources can disappear and reappear. Condor is fault-tolerant: jobs are automatically migrated to new resources if existing ones become unavailable.

Condor Daemon Layout (very simplified). [Diagram: the Central Manager runs the collector, the Submission Machine runs the schedd, and the Execution Machine runs the startd.] The startd sends its system specifications (ClassAds) and system status to the Central Manager. (To simplify this example, the functions of the negotiator are combined with the collector.)

Condor Daemon Layout (continued). The user submits a Condor job to the schedd on the Submission Machine, and the schedd sends the job information to the Central Manager.

Condor Daemon Layout (continued). The Central Manager uses this information to match the schedd’s jobs to available startds.

Condor Daemon Layout (continued). The schedd then sends the job to the startd on the assigned execution machine.
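To make the daemon placement above concrete, here is a minimal condor_config sketch; the host name is illustrative, and while these slides fold the negotiator into the collector, a real pool runs both on the central manager:

    ## Central Manager: runs the collector (and, in practice, the negotiator)
    CONDOR_HOST = manager.example.edu
    DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR

    ## Submission Machine: runs the schedd that queues user jobs
    DAEMON_LIST = MASTER, SCHEDD

    ## Execution Machine: runs the startd that advertises ClassAds and runs jobs
    DAEMON_LIST = MASTER, STARTD

Each DAEMON_LIST line belongs in the local configuration of the corresponding machine; the condor_master on every host starts whatever daemons its list names.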

“Personal” Condor on a TeraGrid Platform. Condor daemons can be run as a normal user, and Condor’s “GlideIn” capability can launch condor_startd daemons on the nodes allocated to an LSF or PBS job.

“Personal” Condor on a TeraGrid Platform (Condor runs with normal user permissions). [Diagram: the login node acts as Central Manager and Submission Machine, running the collector and schedd; a GlideIn PBS job on the compute nodes runs a startd on each execution PE.]
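The slides do not show the GlideIn mechanics themselves, so the following is only a hand-rolled sketch of what the PBS job in the picture might do; every path, host name, and config value here is hypothetical, and in practice the condor_glidein tool (or GridShell, as shown next) automates these steps:

    #!/bin/sh
    #PBS -l nodes=1:ppn=16
    #PBS -l walltime=01:00:00
    # Hypothetical glidein-style PBS job: start a user-level Condor startd on
    # the allocated node and point it back at the collector on the login node.

    export CONDOR_CONFIG=$HOME/glidein/condor_config
    # That config file would contain something like:
    #   CONDOR_HOST = tg-login1.psc.edu     # login node running collector/schedd
    #   DAEMON_LIST = MASTER, STARTD        # execute-side daemons only
    #   LOCAL_DIR   = /tmp/$USER.condor     # scratch space on the compute node

    # Run the master in the foreground for the life of the PBS job; it starts
    # the startd, which advertises the node's processors to the collector.
    exec $HOME/glidein/sbin/condor_master -f

When the PBS wallclock limit expires, the batch system kills the master and the node simply drops out of the personal pool.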

GridShell Overview. GridShell allows users to interact with distributed grid computing resources from a simple shell-like interface. It extends TCSH version 6.12 to incorporate grid-enabled features: parallel inter-script message passing and synchronization, output redirection to remote files, and parametric sweeps.

GridShell Examples. Redirecting the standard output of a command to a remote file location using GridFTP:

    a.out > gsiftp://tg-login.ncsa.teragrid.org/data

Message passing between 2 parallel tasks:

    if ( $_GRID_TASKID == 0 ) then
        echo "hello" > task_1
    else
        set msg = `cat < task_0`
    endif

Executing 256 instances of a job:

    a.out on 256 procs

Merging GridShell with Condor. Use GridShell to launch Condor GlideIn jobs at multiple grid sites. All the GlideIn jobs report back to a central collector, which converts the entire TeraGrid into your own personal Condor pool!

Merging GridShell with Condor. [Diagram: three sites, PSC (Alpha), NCSA (IA64), and TACC (IA32), each with a login node.] The user starts a GridShell session at PSC, which runs a GridShell event monitor on the PSC login node.

Merging GridShell with Condor (continued). The GridShell session starts event monitors on the remote login nodes at NCSA and TACC via Globus.

Merging GridShell with Condor (continued). The local event monitor starts the Condor daemons (collector and schedd) on the PSC login node.

Merging GridShell with Condor (continued). All event monitors submit batch jobs at their sites (a PBS-RMS job at PSC, a PBS job at NCSA, and an LSF job at TACC). These jobs start a master gtcsh-exec, which launches GridShell gtcsh-exec processes on all allocated processors.

Merging GridShell with Condor (continued). The gtcsh-exec on each processor starts a Condor startd, and a heartbeat is maintained between all gtcsh-exec processes.

Merging GridShell with Condor (continued). The Condor schedd then distributes Condor jobs to the startds on the compute nodes.

Demo: GridShell on the TeraGrid. We will launch and run within a GridShell session on 3 TeraGrid sites: PSC (Alpha), NCSA (IA64), and TACC (IA32). We will use Condor to schedule work units of a scientific application: “synfast”, which calculates Monte Carlo realizations of the Cosmic Microwave Background. We submit 200 independent synfast work units, each of which calculates 1 Monte Carlo realization.

Start GridShell Session

1. Write a simple GridShell configuration script:

    # vo.conf:
    # A GridShell config script
    tg-login.ncsa.teragrid.org
    tg-login.tacc.teragrid.org

2. Start the GridShell session. This submits PBS jobs at each site and starts local Condor daemons:

    % vo-login -n 1:16 -H ~/vo.conf -G -T -W 60
    Spawning on tg-login.ncsa.teragrid.org
    Spawning on tg-login.tacc.teragrid.org
    waiting for VO participants to callback...
    ###########Done.

   Options: -n 1:16 starts 1 PBS job per TeraGrid machine, each with 16 processors; -W 60 gives each PBS job a wallclock limit of 60 minutes; -H vo.conf names the configuration file.

Start GridShell Session (continued)

3. Check the status of the PBS jobs at all sites:

    (grid)% agent_jobs
    GATEWAY: iam763.psc.edu iam763 (PENDING)
    GATEWAY: lonestar.tacc.utexas.edu (PENDING)
    GATEWAY: tg-login1.ncsa.teragrid.org tg-master.ncsa.teragrid.org (PENDING)

    (grid)% agent_jobs
    GATEWAY: iam763.psc.edu iam763 (RUNNING)
    GATEWAY: lonestar.tacc.utexas.edu (RUNNING)
    GATEWAY: tg-login1.ncsa.teragrid.org tg-master.ncsa.teragrid.org (RUNNING)

Submit Condor Job

4. We can now interact with the Condor daemons:

    (grid)% condor_status
    Name  OpSys  Arch   State      Activity  LoadAv  Mem  ActvtyTime
          LINUX  INTEL  Unclaimed  Idle                   :04:04
          LINUX  INTEL  Unclaimed  Idle                   :04:04
          ...
          IA64   INTEL  Unclaimed  Idle                   :04:04
          IA64   INTEL  Unclaimed  Idle                   :04:04
          ...
          OSF1   ALPHA  Unclaimed  Idle                   :00:03
          OSF1   ALPHA  Unclaimed  Idle                   :00:08
          ...
                 Machines  Owner  Claimed  Unclaimed  Matched  Preempting
    INTEL/LINUX
    Total

Submit Condor Job (continued)

5. Write a simple Condor job description file:

    # SC2004demo.cmd: Condor Job Description
    Universe   = Vanilla
    Executable = SC2004demo.sh
    # Arguments for SC2004demo.sh
    Arguments  = $(Process)
    # stderr and stdout
    Error      = SC2004.$(Process).err
    Output     = SC2004.$(Process).out
    # Log file for all Condor jobs
    Log        = SC2004.log
    # Queue up 200 Condor jobs
    Queue 200

Submit Condor Job (continued)

6. Write the SC2004demo.sh script (environment variables such as SC2004_SCRATCH can be defined in .cshrc, and GridShell is used to transfer the output files):

    #! /bin/sh
    # Do simulation
    cd $SC2004_SCRATCH
    $SC2004_EXEC_DIR/synfast <<EOF
    synfast arguments
    EOF

    # Copy output to central repository
    cat > /tmp/transfer.csh <<EOF
    #!$GRIDSHELL_LOCATION/gtcsh grid -io on
    cat datafile > gsiftp://repository.org/scratch/experiment/data
    EOF
    chmod 755 /tmp/transfer.csh
    /tmp/transfer.csh

Submit Condor Job (continued)

7. Submit the Condor job:

    (grid)% condor_submit SC2004demo.cmd
    submitting jobs.........
    logging submit events.........
    200 jobs submitted to cluster 1

8. Examine the Condor queue:

    (grid)% condor_q
    -- Submitter: iam763 : : iam763
    ID    OWNER     SUBMITTED  RUN_TIME  ST  PRI  SIZE  CMD
    1.0   gardnerj  11/1 21:   :00:00    R              SC2004demo.cmd
    1.1   gardnerj  11/1 21:   :00:00    R              SC2004demo.cmd
    ...
          gardnerj  11/1 21:   :00:00    I              SC2004demo.cmd

GridShell in a Nutshell. We have used GridShell to turn the TeraGrid into our own personal Condor pool. We can submit Condor jobs, and Condor will schedule them across multiple TeraGrid sites; the sites do not need to share an architecture or a queuing system. GridShell also allows us to use TeraGrid protocols to transfer our input and output data.

GridShell in a Nutshell. LESSON: Using GridShell coupled with Condor, one can easily harness the power of the TeraGrid to process large numbers of independent work units, and all of this fits within the existing TeraGrid software stack.