how Shibboleth can work with job schedulers to create grids to support everyone Exposing Computational Resources Across Administrative Domains H. David.


2 How Shibboleth can work with job schedulers to create grids to support everyone. Exposing Computational Resources Across Administrative Domains. H. David Lambert, Stephen Moore, Arnie Miles, Chad La Joie, Brent Putman, Jess Cannata

3 The Paradox of Grid Computing. Large amounts of computing power go untapped, yet researchers typically cannot find computing power. Resource owners must set policies for the use of their equipment. Users must find and leverage resources that fit their needs.

4 Secure grid-like installations are not growing beyond small groups of known players. But why? The only method currently available for ensuring the security of a resource involves personal interaction between resource owners and resource consumers. Enabling a user or resource to access a resource requires manually adding the user to a local map file. Various methods of grouping users and resources to share certificates have sprung up.

5 On the other hand, grids that encourage resource owners to connect their machines to a central portal that only allows specific efforts to run have exploded: S.E.T.I., United Devices Grid.org, IBM's World Community Grid. What does this mean? Historically, getting massive quantities of resources on the grid has been a challenge. However, in situations where the potential resource owners are relieved of heavy administrative burdens, resource owners flock to the grid. When massive numbers of resources are made available to researchers, real work gets accomplished.

6 How are jobs executed? Modern job scheduling software includes: Condor, Sun Grid Engine (N1), PBS (Pro and Open), Platform LSF.

7 Job scheduling software is unsurpassed in environments where there is only one administrative domain: Beowulf clusters, high-performance n-way devices. Unfortunately, as soon as you begin to cross any sort of administrative line (intra-campus grids, inter-campus grids), these products become less robust. Attempts to leverage existing grid tools to handle this have resulted in compromises: groups of users sharing one certificate, user management issues, accounting issues.

8 In general, job scheduling software accepts a job description file that describes the work to be done. The job file is free-form text containing name-value pairs. We can therefore add anything we want to these files, as long as we teach the execution machines to understand it.
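The name-value structure described on this slide can be illustrated with a minimal Python sketch. This is not Condor's own parser: the function name `parse_job_file` and the extra `Department` attribute are hypothetical, and the handling of bare keywords such as `Queue` is a simplification.

```python
def parse_job_file(text):
    """Parse 'Name = value' lines into a dict (illustrative only).

    Blank lines and lines beginning with # are skipped. A line without
    '=' (for example the bare keyword Queue) is stored with value None.
    """
    pairs = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if "=" not in line:
            pairs[line] = None
            continue
        name, value = line.split("=", 1)
        pairs[name.strip()] = value.strip()
    return pairs


example = """\
# Lines beginning with # are comments
Universe = vanilla
Executable = /home/arnie/condor/my_job.condor
Department = Chemistry
Queue
"""

job = parse_job_file(example)
print(job["Universe"])
print(job["Department"])  # hypothetical extra attribute, carried like any other pair
```

Because the file is free-form, the hypothetical `Department` line is carried through exactly like the standard entries; the execution side only needs to know what to do with it.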

9 Example Submission file (Condor)

    # Example condor_submit input file
    # (Lines beginning with # are comments)
    Universe = vanilla
    Executable = /home/arnie/condor/my_job.condor
    Input = my_job.stdin
    Output = my_job.stdout
    Error = my_job.stderr
    Arguments = -arg1 -arg2
    InitialDir = /home/arnie/condor/run_1
    Queue

10 Condor in the Beowulf, Supercomputer, or campus Grid world.

    Universe = vanilla
    Executable = /home/arnie/condor/my_job.condor
    Input = my_job.stdin
    Output = my_job.stdout
    Error = my_job.stderr
    Arguments = -arg1 -arg2
    InitialDir = /home/arnie/condor/run_1
    Queue

The user has an account on the cluster or HP device, and all nodes are in a closely controlled administrative domain.

11 Condor Grid with Flocking. [Diagram: a Submit Machine running a Schedd; its Central Manager (CONDOR_HOST) with Collector and Negotiator; the Pool-Foo Central Manager and the Pool-Bar Central Manager, each with Collector and Negotiator.] “Flocks” are introduced to each other by hostname or IP address.

12 Job Scheduling with Conventional “Grid” Products: Globus and Condor-G. The user submits a job via a Globus-enabled version of Condor. Any number of resources “on the grid” accept jobs through a Globus Gatekeeper, which hands them to Globus Job Managers for distribution to the resources. Each resource must physically map a Globus X.509 certificate to a local user account.
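The per-resource mapping step described on this slide can be sketched in Python. The sketch assumes a map file in which each line holds a quoted certificate subject followed by a local account name, which is the general layout of a Globus grid-mapfile; the subjects, account names, and function names below are hypothetical, and real map files allow more variations than this handles.

```python
import shlex


def load_map(text):
    """Read '"<certificate subject>" <local account>' lines into a dict."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = shlex.split(line)  # honours the quotes around the subject
        if len(parts) != 2:
            continue  # simplification: ignore anything not in the assumed layout
        subject, account = parts
        mapping[subject] = account
    return mapping


def local_account(mapping, subject):
    """Return the local account for a subject, or None if it was never added."""
    return mapping.get(subject)


# Hypothetical entries, each added by hand by the resource owner.
map_text = '''
"/O=Example Grid/OU=Site A/CN=Bob Smith" bob
"/O=Example Grid/OU=Site A/CN=Alice Jones" ajones
'''

accounts = load_map(map_text)
print(local_account(accounts, "/O=Example Grid/OU=Site A/CN=Bob Smith"))
print(local_account(accounts, "/O=Example Grid/OU=Site C/CN=New User"))
```

The second lookup returns `None`: a user who has not been added by hand on that particular resource cannot run there, which is the scaling problem the following slides address.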

13 Summary of Limitations from Previous Examples: User and Resource Management Problems. How does the owner of a grid resource grant access to large numbers of individuals? How does the owner of a grid resource know when a user granted access by membership in an organization leaves that organization? How does a user easily get added to a resource? How does a user find available resources?

14 SAML-based solutions give a resource secure access to attributes about a user, which makes them a powerful partner to existing batch job schedulers. While Condor was already able to leverage user attributes from a local LDAP store, this project demonstrates for the first time that Condor can consume user attributes from a remote store.

15 What we are doing now with Shibboleth, LDAP, and Condor. [Diagram: numbered steps 1 to 10 connecting Bob @ Georgetown University (User at Site 'A'), the Federation WAYF, the Georgetown IdP, the UW Shib/Condor Portal, an LDAP DB, the UW Condor Schedd with its Job ClassAd, and the Resource at Site 'B' (Condor Schedd, Condor Startd, Resource ClassAd, Running Job).] User at Site 'A' is aware of a Resource at Site 'B', and the Owner of Resource 'B' has granted access to Site 'A'. We leverage the free-text job submission files to add attributes from SAML to our jobs.
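One way to picture adding SAML attributes to a free-text submission file is the Python sketch below. It is not the authors' portal code: the function name and the attribute names `ShibAffiliation` and `ShibHomeOrg` are hypothetical, and it assumes Condor's submit-file convention that a line of the form `+Name = "value"` places a custom attribute in the job ClassAd.

```python
def add_attributes(submit_text, attributes):
    """Insert '+Name = "value"' lines ahead of the first Queue statement.

    attributes maps hypothetical attribute names to string values, as a
    portal might receive them from an identity provider.
    """
    extra = []
    for name, value in sorted(attributes.items()):
        if '"' in value:
            # Keep the sketch simple: refuse values that would need escaping.
            raise ValueError("quotes are not supported in attribute values")
        extra.append('+{} = "{}"'.format(name, value))

    lines = submit_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().lower().split()[:1] == ["queue"]:
            # Attributes must appear before Queue to apply to the queued job.
            return "\n".join(lines[:i] + extra + lines[i:]) + "\n"
    return "\n".join(lines + extra) + "\n"


submit = """\
Universe = vanilla
Executable = /home/arnie/condor/my_job.condor
Queue
"""

saml_attributes = {
    "ShibAffiliation": "member@example.edu",  # hypothetical value
    "ShibHomeOrg": "example.edu",             # hypothetical value
}

augmented = add_attributes(submit, saml_attributes)
print(augmented)
```

The resource side can then refer to these attributes when deciding whether to run the job, without ever needing a local account entry for the individual user.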

16 New Work: Phase II. SAML-based grid work engine with intelligent resource management. [Diagram: Company “A” and University “B”; a User with a Job File, a Job Submission Client, and an Identity Provider; a Resource Discovery Network made up of Resource Discovery Network Nodes; Resource Schedulers, each with a Running Job.] Now, Resource Owners can grant access to users based upon their attributes instead of their identities. Management of users is again the responsibility of the local administration, as it should be. When Resource Owners can easily set policies without worrying about user management and group memberships, they will become willing to attach their resources to this new computational Grid.
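Granting access by attribute rather than by identity can be sketched in a few lines of Python. The policy format, attribute names, and values below are all hypothetical; the slides do not specify how a Resource Owner would express such a policy.

```python
def is_allowed(policy, user_attributes):
    """Return True when every attribute named in the policy has an accepted value.

    policy maps an attribute name to the set of values the Resource Owner
    accepts. No individual user names appear anywhere in it.
    """
    return all(
        user_attributes.get(name) in accepted
        for name, accepted in policy.items()
    )


# Hypothetical policy set once by the owner of a resource at University "B".
policy = {
    "home_org": {"company-a.example"},
    "affiliation": {"staff", "researcher"},
}

# Hypothetical attribute sets asserted by a user's Identity Provider.
current_employee = {"home_org": "company-a.example", "affiliation": "researcher"}
former_employee = {"home_org": "company-a.example", "affiliation": "alum"}

print(is_allowed(policy, current_employee))
print(is_allowed(policy, former_employee))
```

When a user leaves the organization, the home institution changes the attributes it asserts and the resource's policy stays untouched, which is the division of responsibility this slide argues for.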

17 Intelligent Resource Management. Users have their own policy decisions to make: processor type, operating system type, executable location, data location, memory requirements, etc. In a perfect world, Users will have multiple Resources to choose from. These Resources will have different configurations that can match the User policy requirements. These varied Resources will also have an ever-changing availability! An Intelligent Resource Management System will allow users to launch jobs from their portal and trust that the work will be sent to the Resource that not only correctly matches the User's job policy, but also has the least load on it. This will be done without the User being aware of where the work will be executed. This solution will be scheduler agnostic.
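The selection rule described on this slide (match the User's job policy, then prefer the least-loaded Resource) can be sketched in Python. The field names, the resource records, and the load figures are hypothetical; the slides do not define a data model for the Intelligent Resource Management System.

```python
def choose_resource(job_policy, resources):
    """Pick the least-loaded resource that satisfies the job policy.

    job_policy may require an exact 'arch' and 'os' and a minimum
    'min_memory_mb'. Returns the chosen resource dict, or None when
    nothing matches.
    """
    def matches(resource):
        return (
            resource["arch"] == job_policy["arch"]
            and resource["os"] == job_policy["os"]
            and resource["memory_mb"] >= job_policy["min_memory_mb"]
        )

    candidates = [r for r in resources if matches(r)]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r["load"])


# Hypothetical snapshot of advertised resources; availability changes over time.
resources = [
    {"name": "pool-foo", "arch": "x86", "os": "linux", "memory_mb": 2048, "load": 0.80},
    {"name": "pool-bar", "arch": "x86", "os": "linux", "memory_mb": 4096, "load": 0.25},
    {"name": "pool-baz", "arch": "sparc", "os": "solaris", "memory_mb": 8192, "load": 0.05},
]

job_policy = {"arch": "x86", "os": "linux", "min_memory_mb": 1024}

chosen = choose_resource(job_policy, resources)
print(chosen["name"])
```

The User supplies only the policy; which pool ends up running the job is decided by the selection step, so the User does not need to know where the work executes.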

18 Example of Intelligent Agent. [Diagram: Company “A” and University “B”; an Identity Provider, a Job Submission Client, and a User with a Job File; a Resource Discovery Network made up of Resource Discovery Network Nodes; a Scheduler with Running Jobs.]

19 Acknowledgments. Georgetown University: Charlie Leonhardt, Steve Moore, Arnie Miles, Chad La Joie, Brent Putman, Jess Cannata. University of Wisconsin: Miron Livny, Todd Tannenbaum, Ian Alderman. Internet2: Ken Klingenstein, Mike McGill.

