Nicholas Coleman Computer Sciences Department University of Wisconsin-Madison Distributed Policy Management.

Nicholas Coleman Computer Sciences Department University of Wisconsin-Madison Distributed Policy Management and Comprehension with Classified Advertisements

A Distributed Policy Scenario
› A user submits a job to Condor
› The user has designed a policy defining requested services
› Machines in the Condor pool have policies restricting the use of services
› The user's job won't run. Why?
  - Is the user's policy too restrictive?
  - Was the job rejected by machine policies?

Policy Management
› Resource allocation challenges
  - Resource heterogeneity
  - Policy heterogeneity
› How to allocate resources?
  - Conventional centralized allocation not sufficient
  - Solution: Matchmaking with Classified Advertisements (ClassAds)

[Figure: Matchmaking. Job ClassAds of the form [ MyType = "Job"; Rank = ...; Requirements = ... ] are matched against a pool of machine ClassAds of the form [ MyType = "Machine"; Rank = ...; Requirements = ... ].]

Classified Advertisements
› Represent entities (e.g. jobs, machines) and their policies
› A ClassAd is a set of named expressions called attributes
› Types of attributes:
  - Characteristics of an entity (Arch, OpSys, Memory)
  - Constraints for requested resource (Requirements)
  - Preferences for requested resource (Rank)

Typical ClassAds

  [
    Type = "Machine";
    KeybrdIdle = '00:23:12';
    Memory = 256M;
    LoadAvg = ;
    Kflops = 21893;
    Arch = "INTEL";
    OpSys = "LINUX";
    Name = "foo.cs.wisc.edu";
    Rank = (DayTime() >= '9:00') && ((DayTime() <= '17:00') ? 1/other.ImageSize : 0);
    Requirements = (other.Type == "Job") && (other.Owner != "riffraff")
                   && (LoadAvg < 0.3) && (KeybrdIdle > '00:15');
  ]

  [
    Type = "Job";
    Owner = "ncoleman";
    Cmd = "run_sim";
    Memory = 31m;
    Rank = KFlops/1E3 + other.Memory/32;
    Requirements = (other.Type == "Machine") && (other.Arch == "INTEL")
                   && (other.Opsys == "LINUX") && (other.Memory >= 128);
  ]
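The two ads above can be used to sketch how a match is decided. The following Python is only an illustration, not Condor's matchmaker. It assumes that a match requires both Requirements expressions to hold and that the job's Rank orders the compatible machines. ClassAds are modelled as plain dictionaries, and the three-valued (UNDEFINED) logic of real ClassAds is left out. The machines `bar.example.org` and `baz.example.org`, the `LoadAvg` value, and reading the job's `KFlops` as `other.Kflops` are assumptions made for the example.

```python
from typing import Any, Dict, List

Ad = Dict[str, Any]


def requirements_ok(ad: Ad, other: Ad) -> bool:
    """Evaluate ad's Requirements against other; an absent expression counts as True."""
    req = ad.get("Requirements")
    return True if req is None else bool(req(ad, other))


def rank(ad: Ad, other: Ad) -> float:
    """Evaluate ad's Rank against other; an absent expression counts as 0."""
    rank_fn = ad.get("Rank")
    return 0.0 if rank_fn is None else float(rank_fn(ad, other))


def match(job: Ad, machines: List[Ad]) -> List[Ad]:
    """Return machines whose ad and the job's ad accept each other, best job Rank first."""
    compatible = [m for m in machines
                  if requirements_ok(job, m) and requirements_ok(m, job)]
    return sorted(compatible, key=lambda m: rank(job, m), reverse=True)


def machine_requirements(me: Ad, other: Ad) -> bool:
    # Mirrors the machine ad: accept jobs not owned by "riffraff" while the
    # machine is lightly loaded and the keyboard has been idle for 15 minutes.
    return (other.get("Type") == "Job" and other.get("Owner") != "riffraff"
            and me["LoadAvg"] < 0.3 and me["KeybrdIdle"] > 15 * 60)


def make_machine(name: str, memory: int, kflops: int) -> Ad:
    return {
        "Type": "Machine", "Name": name, "Arch": "INTEL", "OpSys": "LINUX",
        "Memory": memory, "Kflops": kflops,
        "LoadAvg": 0.1,              # hypothetical: the slide leaves this value blank
        "KeybrdIdle": 23 * 60 + 12,  # '00:23:12' expressed in seconds
        "Requirements": machine_requirements,
    }


job: Ad = {
    "Type": "Job", "Owner": "ncoleman", "Cmd": "run_sim", "Memory": 31,
    "Requirements": lambda me, other: (
        other.get("Type") == "Machine" and other.get("Arch") == "INTEL"
        and other.get("OpSys") == "LINUX" and other.get("Memory", 0) >= 128),
    # Assumption: KFlops in the job's Rank refers to the machine's Kflops.
    "Rank": lambda me, other: other.get("Kflops", 0) / 1e3 + other.get("Memory", 0) / 32,
}

pool = [
    make_machine("baz.example.org", 512, 10000),  # hypothetical machine
    make_machine("bar.example.org", 64, 30000),   # hypothetical, too little memory
    make_machine("foo.cs.wisc.edu", 256, 21893),  # the machine from the slide
]

matched_names = [m["Name"] for m in match(job, pool)]
print(matched_names)
```

In this sketch `bar.example.org` is dropped because it fails the job's memory constraint, and the remaining machines are ordered by the job's Rank.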

Policy Comprehension
› Why won't my job run?
  - My policy is too restrictive
  - My job is rejected by machines in the pool
› Looking for answers
  - Use Condor tools (condor_q, condor_status)
  - Stare at the job ClassAd to find out what's wrong

Condor Tools
› condor_q -analyze:
    Of 105 resource offers,
      105 do not satisfy the request's constraints
      64 resource offer constraints are not satisfied by this request
› User wants more details:
  - Which parts of the job requirements expression are problematic?
  - Is the job ClassAd missing any attributes?

Two Cases to Examine
1. No machines meet the job's requirements
2. The job does not meet any machine's requirements
One or both of these issues may be preventing the job from running, but they are not interdependent. We can analyze each one separately.

Example 1
JOB:
  [ Requirements = (Arch == "SPARK") && (OpSys == "SOLARIS2.7") ]
Result:
  (Arch == "SPARK"): did not match - suggestion: REMOVE
  (Opsys == "SOLARIS2.7"): matched: 2 - suggestion: KEEP

Example 2
JOB:
  [ Requirements = (Arch == "ALPHA") && (OpSys == "WINNT") && (Memory >= 64) ]
Result:
  (Arch == "ALPHA"): matched: 1 - suggestion: REMOVE
  (OpSys == "WINNT"): matched: 2 - suggestion: KEEP
  (Memory >= 64): matched: 4 - suggestion: KEEP
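A report like the one in Example 2 can be approximated with a short Python sketch. This is not the prototype's algorithm, which the slides do not describe. It assumes one plausible rule: count the machines matching each clause on its own and, when the full conjunction matches nothing, suggest dropping the clause whose removal leaves the most matching machines. The five-machine pool is hypothetical and was chosen only so that the per-clause counts agree with the slide.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Machine = Dict[str, Any]
Clause = Tuple[str, Callable[[Machine], bool]]


def count_matches(clauses: List[Clause], pool: List[Machine]) -> int:
    """Number of machines that satisfy every clause in the list."""
    return sum(all(test(m) for _, test in clauses) for m in pool)


def suggest_removal(clauses: List[Clause], pool: List[Machine]) -> Optional[str]:
    """Return the clause whose removal leaves the most matches, or None.

    None is returned when the full conjunction already matches a machine,
    or when no single removal produces a match.
    """
    if count_matches(clauses, pool) > 0:
        return None
    best_label, best_count = None, 0
    for i, (label, _) in enumerate(clauses):
        remaining = clauses[:i] + clauses[i + 1:]
        matched = count_matches(remaining, pool)
        if matched > best_count:
            best_label, best_count = label, matched
    return best_label


# Hypothetical pool: 1 ALPHA machine, 2 WINNT machines, 4 with Memory >= 64.
pool: List[Machine] = [
    {"Arch": "ALPHA", "OpSys": "LINUX", "Memory": 128},
    {"Arch": "INTEL", "OpSys": "WINNT", "Memory": 64},
    {"Arch": "INTEL", "OpSys": "WINNT", "Memory": 128},
    {"Arch": "INTEL", "OpSys": "LINUX", "Memory": 256},
    {"Arch": "INTEL", "OpSys": "LINUX", "Memory": 32},
]

clauses: List[Clause] = [
    ('Arch == "ALPHA"', lambda m: m.get("Arch") == "ALPHA"),
    ('OpSys == "WINNT"', lambda m: m.get("OpSys") == "WINNT"),
    ("Memory >= 64", lambda m: m.get("Memory", 0) >= 64),
]

per_clause = {label: count_matches([(label, test)], pool) for label, test in clauses}
for label, matched in per_clause.items():
    print(f"({label}): matched: {matched}")
print("suggestion: REMOVE", suggest_removal(clauses, pool))
```

Every clause matches some machine individually, yet no machine satisfies all three. With this pool, dropping the Arch clause leaves two candidates, more than any other single removal, which is consistent with the REMOVE suggestion shown in the example.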

Example 3
JOB:
  [ Owner = "jsmith"; ImageSize = ; Requirements = ... ]
MACHINES:
  [ Requirements = (ImageSize <= 50176) && (MemoryReq < 49) ]
  [ Requirements = (ImageSize <= ) && (MemoryReq < 98) ]
  [ Requirements = (ImageSize <= 50176) && (MemoryReq < 49) ]
  [ Requirements = (ImageSize <= 50176) && (MemoryReq < 49) ]
  [ Requirements = (ImageSize <= ) && (MemoryReq < 98) ]

  ImageSize: 1,2,3,4,5
  MemoryReq: 3,

Example 3
JOB:
  [ Owner = "jsmith"; ImageSize = ; Requirements = ... ]
Results of Test
The following attributes are missing from the job classad:
  MemoryReq
The following attributes should be added or modified:
  ImageSize: - suggestion: use a value less than or equal to
  MemoryReq: - suggestion: use a value less than 49
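The machine-side analysis in Example 3 can be sketched in Python as well. This is an illustration under assumptions, not the prototype's method. Each machine's Requirements is reduced to upper bounds on job attributes, the attributes the job ad lacks are reported, and the tightest bound per attribute is proposed because a value under it satisfies every listed machine. The second ImageSize bound and the job's ImageSize are blank on the slide, so the values used here are hypothetical stand-ins.

```python
from typing import Any, Dict, List, Tuple

Bound = Tuple[str, float]            # (operator, limit); operator is "<" or "<="
MachineReqs = Dict[str, Bound]       # job attribute -> upper bound the machine demands

HYPOTHETICAL_LARGE_IMAGE_LIMIT = 100352  # stand-in for the bound left blank on the slide
HYPOTHETICAL_JOB_IMAGE_SIZE = 60000      # stand-in for the job's blank ImageSize


def missing_attributes(job: Dict[str, Any], machines: List[MachineReqs]) -> List[str]:
    """Attributes referenced by some machine's Requirements but absent from the job ad."""
    referenced = {attr for reqs in machines for attr in reqs}
    return sorted(referenced - set(job))


def tightest_bounds(machines: List[MachineReqs]) -> Dict[str, Bound]:
    """For each attribute, the strictest upper bound demanded by any machine."""
    tightest: Dict[str, Bound] = {}
    for reqs in machines:
        for attr, (op, limit) in reqs.items():
            current = tightest.get(attr)
            # A lower limit is stricter; at equal limits "<" is stricter than "<=".
            if current is None or (limit, op == "<=") < (current[1], current[0] == "<="):
                tightest[attr] = (op, limit)
    return tightest


machines: List[MachineReqs] = [
    {"ImageSize": ("<=", 50176), "MemoryReq": ("<", 49)},
    {"ImageSize": ("<=", HYPOTHETICAL_LARGE_IMAGE_LIMIT), "MemoryReq": ("<", 98)},
]
job = {"Owner": "jsmith", "ImageSize": HYPOTHETICAL_JOB_IMAGE_SIZE}

print("missing from the job classad:", missing_attributes(job, machines))
for attr, (op, limit) in sorted(tightest_bounds(machines).items()):
    wording = "less than" if op == "<" else "less than or equal to"
    print(f"{attr}: - suggestion: use a value {wording} {limit}")
```

Choosing the tightest bound is one design option. An analysis aimed at matching at least one machine, rather than all of them, would report the loosest bound instead.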

Current Work
› ClassAd analysis prototype implemented in Java
  - Job requirements analysis
  - Machine requirements analysis
  - Current version supports a simple menu-driven interface
› Working on integrating with Condor tools
  - condor_q -analyze
  - condor_status

Future Work
› Applications to other uses of ClassAds in Condor
› Analysis of a successful match
› Graphical interface
› Analysis of gang matching
› ClassAds as an authorization language

Conclusions
› Automated, fine-grained policy expression analysis is useful and feasible
› Different issues arise with job requirements analysis and machine requirements analysis
› The ClassAd language is ideal for these purposes