The Bologna Batch System: Flexible Policy with Condor

Peter Couvares
Associate Researcher, Condor Team
Computer Sciences Department
University of Wisconsin-Madison

The Bologna Batch System
› Custom batch scheduling system for local users at INFN in Bologna, Italy.
   "Istituto Nazionale di Fisica Nucleare"
   Dr. Paolo Mazzanti initiated the idea.
› Implemented on a small subset of machines within the larger nationwide INFN Condor pool:
   INFN Condor pool: ~300 CPUs
   INFN-Bologna Condor pool: ~100 CPUs
   Bologna Batch System: ~50 CPUs

The Bologna Batch System
› Paolo wanted two things:
   Resources to remain in the larger INFN Condor pool
   Local jobs prioritized and managed differently than elsewhere at INFN
› In short: local control over the policies of local resources, without abandoning the commitment to a shared resource pool.

Where We Started
› Basic Condor policy
   Opportunistic resources
    Jobs only run when machines are otherwise idle.
    Jobs can be preempted for machine owners or for higher-priority users.
   Fair-share across the INFN pool
    The highest-priority user in the pool gets first crack at a given resource.
    The more you use, the worse your priority becomes.
› Some problems:
   Long-running vanilla jobs (with no checkpointing) were frequently preempted before running to completion.
   Users dislike waiting for a resource when they only want to run a short job.
   High-priority users from other INFN sites run on local resources while lower-priority local users wait.
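The opportunistic behavior above comes from the standard startd policy expressions. A minimal sketch of such a policy follows; the 15-minute keyboard threshold and 0.3 load threshold are illustrative defaults, not the actual INFN settings:

```
## Sketch of a classic opportunistic startd policy (illustrative values).
NonCondorLoadAvg = (LoadAvg - CondorLoadAvg)
MachineBusy      = (KeyboardIdle < 15 * 60) || ($(NonCondorLoadAvg) > 0.3)

# Start a job only when the machine is otherwise idle...
START   = ($(MachineBusy)) == False
# ...and suspend or preempt it as soon as the owner comes back.
SUSPEND = $(MachineBusy)
PREEMPT = $(MachineBusy)
```

Note that user fair-share is enforced pool-wide by the negotiator's priority accounting, not by any per-machine expression.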

BBS Policy Requirements
› Prioritize local work
   Share resources, but run outside jobs as backfill.
› Treat local servers as "dedicated" resources for local jobs, but as "opportunistic" resources for other jobs.
   Run outside Condor jobs only if the server is idle.
   Run local batch jobs regardless of other system load or console activity.
   Preempt outside Condor jobs to allow local batch jobs to run, but don't preempt local jobs for outside work.
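In Condor's configuration language, this dedicated-for-local, opportunistic-for-outsiders split can be sketched roughly as follows. The BBS_JobType attribute name is our own illustrative invention (the real BBS document defines its own), and the idle thresholds are placeholders:

```
## A job submitted through the BBS wrappers carries a custom ClassAd
## attribute; anything without it is an "outside" job.
IsLocalJob  = (TARGET.BBS_JobType =!= UNDEFINED)
MachineIdle = (KeyboardIdle > 15 * 60) && ((LoadAvg - CondorLoadAvg) < 0.3)

# Local jobs always start; outside jobs only on an otherwise-idle machine.
START = $(IsLocalJob) || $(MachineIdle)
# Prefer local jobs, so outside jobs get preempted in their favor...
RANK  = $(IsLocalJob)
# ...but evict only outside jobs, never local ones, when the machine is busy.
PREEMPT = (($(IsLocalJob)) == False) && (($(MachineIdle)) == False)
```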

BBS Policy Requirements
› Ensure resource availability for both short and long-running jobs
   Prioritize short batch jobs so that they are never kept waiting by long batch jobs.
   Prevent long batch jobs from being preempted or starved by short jobs.
› Never waste resources
   No idle CPUs when jobs are waiting to run!
   No preemption of vanilla jobs! (Preemption is fine if you can checkpoint, but here we can't…)

A Contradiction!
› No way to guarantee resource availability for short or long jobs without "reserving" some CPUs for each…
› …but no way to avoid idle CPUs without allowing them to start any kind of job:
   If CPUs reserved for short jobs are used for long jobs, they become unavailable to run short jobs.
   If CPUs reserved for short jobs are not used for long jobs, they're wasted when there are no short jobs to run.
› What to do, what to do…

A Solution!
› Allow resources to be temporarily overcommitted
   We treat one CPU as two…
   On a two-CPU machine, define four Condor VMs (virtual machines): two for short jobs and two for long jobs.
› Allow jobs to be suspended rather than preempted
   Think of it as "checkpointing to swap"…
› …or allow jobs to be temporarily "de-prioritized"
   If memory is adequate, let "suspended" long jobs continue running at a poor OS priority, stealing cycles whenever the "active" short jobs are busy doing I/O.
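A configuration sketch of the overcommit-and-suspend trick, using the old-style "virtual machine" knobs. The pairing of short VM vm1 with long VM vm3, and the BBS_JobType attribute, are simplifications of ours rather than the exact BBS configuration:

```
## On a two-CPU machine, advertise four VMs: vm1/vm2 for short jobs,
## vm3/vm4 for long jobs.
NUM_CPUS             = 4      # lie to Condor: treat each real CPU as two
NUM_VIRTUAL_MACHINES = 4

# Short VMs accept only short jobs; long VMs accept only long jobs.
START = ((VirtualMachineID <= 2 && TARGET.BBS_JobType =?= "short") || \
         (VirtualMachineID >= 3 && TARGET.BBS_JobType =?= "long"))

# Publish each VM's state so its siblings can see it (as vm1_State, etc.).
STARTD_VM_EXPRS = State, Activity

# Suspend (don't kill) a long job while the short VM sharing its CPU is
# claimed; resume it as soon as that VM frees up.
SUSPEND  = (VirtualMachineID == 3) && (vm1_State =?= "Claimed")
CONTINUE = (VirtualMachineID == 3) && (vm1_State =!= "Claimed")
PREEMPT  = False
```

The "de-prioritize" variant drops the SUSPEND expression and instead runs long jobs at a worse OS priority, e.g. via JOB_RENICE_INCREMENT on the long-job VMs.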

Everybody wins!
› Short jobs start right away on dedicated "short" VMs.
› Long jobs aren't preempted by short jobs, but rather suspend temporarily or run at a lower priority.
› Outside jobs run only when no Bologna jobs are waiting.
› All CPUs are available to all types of jobs.
   No idle CPUs when jobs are waiting.

Okay, how?
› The flip side of flexibility is complexity!
› It's pretty cool that Condor lets you combine dedicated and opportunistic scheduling in one system, but it takes a bit of work to set it all up…
   It took the world experts in writing Condor policy expressions (that's us) many weeks of concerted effort to get this all working smoothly.
› Luckily for y'all, we've already done the hard part, and now you can copy it.

Copy it from where?
› The Bologna Batch System document
› A detailed walk-through of the specific policies and the necessary Condor configuration to make each one work.
› Line-by-line examples of how we implemented each.
› A work in progress.
   Just this February, we added material.
   Open to contributions: what clever things have you done?
› What's in it? Let's take a look…

Advanced Policies in the BBS Document
› How to mark certain jobs as being of one type or another.
   (And how to verify jobs are what they say they are.)
› How to mark certain machines as being of one type or another.
› How to configure multiple Condor VMs on a single CPU.
› How to implement different policies for different VMs on the same machine.
   New: how to make one VM aware of what's running on another VM.
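For instance, marking machines and making VMs mutually aware both come down to a couple of real configuration knobs. The BBS_Machine attribute name here is our illustrative choice, not necessarily the one the BBS document uses:

```
# Advertise a custom attribute so jobs can require a Bologna Batch Server.
BBS_Machine  = True
STARTD_EXPRS = $(STARTD_EXPRS), BBS_Machine

# Publish each VM's State and Activity into its sibling VMs' ClassAds,
# where they appear as vm1_State, vm2_Activity, and so on.
STARTD_VM_EXPRS = State, Activity
```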

Simple for Users
› Although the policy is complicated, the interface for users is kept simple:
   Users call bbs_submit_long or bbs_submit_short, just as they would condor_submit…
    Short jobs start quickly, but those that run for >1 hour are killed.
    Long jobs will run to completion…
   The bbs_submit_* scripts automatically add the appropriate ClassAd attributes to the job to take advantage of the long- or short-running VMs on the Bologna Batch Servers.
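The effect of the wrappers is roughly this: starting from an ordinary submit file, bbs_submit_short might append a couple of lines before handing the file to condor_submit. The attribute and requirement names below are our illustrative guesses, not the actual BBS ones:

```
# An ordinary vanilla-universe submit file...
universe    = vanilla
executable  = analyze
output      = analyze.out
error       = analyze.err
log         = analyze.log

# ...plus the lines a bbs_submit_short wrapper would append:
+BBS_JobType = "short"
requirements = (BBS_Machine =?= True)

queue
```

On the machine side, the one-hour limit on short jobs can then be enforced with a startd expression along the lines of PREEMPT = (VirtualMachineID <= 2) && (Activity == "Busy") && ((CurrentTime - EnteredCurrentActivity) > 3600).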

What About COD?
› Condor's "Computing On Demand"
› Designed to solve a similar problem:
   Short-lived interactive computations that need to run immediately, without preempting longer batch jobs.
› Some differences:
   COD is focused on truly interactive work, not batch computation.
   BBS is focused on local control of a shared resource, and on balancing the needs of short and long-running batch jobs.
   COD tasks are not represented as jobs in a queue, accounted for in resource usage, etc.; they "just run". COD is more of an instant "Condor rsh".
   BBS preserves the normal Condor job-management structure: jobs are submitted as always, accounted for in the same way, etc.

Any Questions?
› Ask me now…
› Feel free to email me at
› Check the Bologna Batch System document at
› Thanks!