The Condor JobRouter.

a.k.a. “schedd on the side”
Dan, Condor Week 2008

Status

- It’s in the current development series: Condor 7.1.0, Unix (Windows soonish)
- Used heavily by the CMS physics experiment for simulation on the Open Science Grid (millions of jobs routed)

What is “job routing”?

Original (vanilla) job:

    Universe = "vanilla"
    Executable = "sim"
    Arguments = "seed=345"
    Output = "stdout.345"
    Error = "stderr.345"
    ShouldTransferFiles = True
    WhenToTransferOutput = "ON_EXIT"

Routed (grid) job:

    Universe = "grid"
    GridType = "gt2"
    GridResource = \
      "cmsgrid01.hep.wisc.edu/jobmanager-condor"
    Executable = "sim"
    Arguments = "seed=345"
    Output = "stdout"
    Error = "stderr"
    ShouldTransferFiles = True
    WhenToTransferOutput = "ON_EXIT"

[Diagram: original job -> JobRouter (Routing Table: Site 1 … Site 2) -> routed job, with an arrow labeled “final status”]
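The rewrite shown on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not the JobRouter's implementation: job ads are modeled as plain dicts, and `route_job` and the `site` dict are hypothetical names. The attribute values are the ones from the slide.

```python
def route_job(job, route):
    """Return a routed copy of a vanilla job ad (ads are plain dicts here)."""
    routed = dict(job)                              # the original ad is left untouched
    routed["Universe"] = "grid"                     # change an existing attribute
    routed["GridType"] = route["GridType"]          # insert new attributes
    routed["GridResource"] = route["GridResource"]
    # Mirrors the slide: the routed copy writes to plain "stdout"/"stderr".
    routed["Output"] = "stdout"
    routed["Error"] = "stderr"
    return routed


original = {
    "Universe": "vanilla",
    "Executable": "sim",
    "Arguments": "seed=345",
    "Output": "stdout.345",
    "Error": "stderr.345",
    "ShouldTransferFiles": True,
    "WhenToTransferOutput": "ON_EXIT",
}

# Hypothetical routing-table entry; the resource string is the one on the slide.
site = {
    "Name": "Site 1",
    "GridType": "gt2",
    "GridResource": "cmsgrid01.hep.wisc.edu/jobmanager-condor",
}

routed = route_job(original, site)
```

The original ad stays in the queue unchanged while the routed copy runs, which matches the slide's picture of the final status flowing back to the original job.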

Routing is just site-level matchmaking

- With feedback from job queue
  - number of jobs currently routed to site X
  - number of idle jobs routed to site X
  - rate of recent success/failure at site X
- And with power to modify job ad
  - change attribute values (e.g. Universe)
  - insert new attributes (e.g. GridResource)
  - add a “portal” grid proxy if desired
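A rough sketch of how queue feedback could steer site selection follows. The `Route` class and `pick_route` function are hypothetical, and the policy (skip sites that are full or failing, then prefer the one with the fewest idle jobs) is an assumption made for illustration. In particular, treating the failure threshold as a fraction of recent jobs is a simplification made here; the manual defines the real semantics of the route attributes.

```python
from dataclasses import dataclass


@dataclass
class Route:
    name: str
    max_idle_jobs: int              # loosely modeled on MaxIdleJobs
    failure_rate_threshold: float   # simplified: fraction of recent jobs that failed
    idle_jobs: int = 0              # feedback: idle jobs currently routed to the site
    recent_successes: int = 0       # feedback: recent outcomes at the site
    recent_failures: int = 0

    def failure_rate(self):
        total = self.recent_successes + self.recent_failures
        return self.recent_failures / total if total else 0.0

    def accepts_more(self):
        """A site takes more jobs only if it is neither full nor failing."""
        return (self.idle_jobs < self.max_idle_jobs
                and self.failure_rate() <= self.failure_rate_threshold)


def pick_route(routes):
    """Pick the eligible site with the fewest idle jobs, or None if all are closed."""
    candidates = [r for r in routes if r.accepts_more()]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.idle_jobs)


routes = [
    Route("Site 1", max_idle_jobs=10, failure_rate_threshold=0.5, idle_jobs=10),
    Route("Site 2", max_idle_jobs=10, failure_rate_threshold=0.5, idle_jobs=3),
    Route("Site 3", max_idle_jobs=10, failure_rate_threshold=0.01,
          recent_successes=1, recent_failures=1),
]
chosen = pick_route(routes)
```

Here Site 1 is skipped because its idle queue is already at the limit, Site 3 because half of its recent jobs failed, so the job goes to Site 2.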

Configuring the Routing Table

- JOB_ROUTER_ENTRIES: list site ClassAds in configuration file
- JOB_ROUTER_ENTRIES_FILE: read site ClassAds periodically from a file
- JOB_ROUTER_ENTRIES_CMD: read periodically from a script
  - example: query a collector such as Open Science Grid Resource Selection Service
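As an illustration of the JOB_ROUTER_ENTRIES_CMD option, a script along the following lines could print one site ClassAd per line to standard output. This is a sketch under assumptions: the site list is hard-coded where a real script might query a collector, the `example.org` gatekeeper names are placeholders, and `format_value` and `format_route` are hypothetical helpers that handle only strings and numbers.

```python
#!/usr/bin/env python3
import sys

# Placeholder site data; a real script might build this from a collector query.
SITES = [
    {"Name": "Grid Site 1",
     "GridResource": "gt2 gatekeeper1.example.org/jobmanager-condor",
     "MaxIdleJobs": 10},
    {"Name": "Grid Site 2",
     "GridResource": "gt2 gatekeeper2.example.org/jobmanager-condor",
     "MaxIdleJobs": 5},
]


def format_value(value):
    """Render a Python value as a ClassAd literal (strings and numbers only)."""
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    return str(value)


def format_route(site):
    """Render one site dict as a bracketed ClassAd on a single line."""
    body = " ".join(f"{key} = {format_value(val)};" for key, val in site.items())
    return f"[ {body} ]"


def main(out=sys.stdout):
    for site in SITES:
        out.write(format_route(site) + "\n")


if __name__ == "__main__":
    main()
```

Generating the table from a script keeps the route list in step with whatever source the script reads, at the cost of one more moving part to monitor.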

Syntax

Read the 7.1 manual. It’s in the chapter on Grid Computing.

    [
      Name = "Grid Site 1";
      GridResource = "gt2 gatekeeper…";
      MaxIdleJobs = 10;
      FailureRateThreshold = 0.01;
    ]

What Types of Input Jobs?

- Vanilla Universe
- Self Contained (everything needed is in file transfer list)
- High Throughput (many more jobs than CPUs)

What Target Grid Types?

- Globus, Condor-C work well
- others untested, but should be fine

Why only target the grid universe?

- no reason at all
- 7.1.1 now allows any destination universe

Grid Gotchas

Globus gt2:

- no exit status from job (reported as 0)
- must explicitly list desired output files

JobRouter vs. Glidein

Glidein (Condor overlays the grid):

- job never waits in remote queue
- job runs in its normal universe
- private networks doable, but add to complexity
- need something to submit glideins on demand

JobRouter:

- some jobs wait in remote queue (MaxIdleJobs)
- job must be compatible with target grid semantics
- simple to set up, fully automatic to run