CSF4, SGE and Gfarm Integration Zhaohui Ding Jilin University.

Presentation transcript:

CSF4, SGE and Gfarm Integration Zhaohui Ding Jilin University

2 Agenda
- CSF4
- CSF4 integration with SGE (done at SDSC)
- CSF4 integration with Gfarm (done at SDSC)
- Other new CSF4 functionality (done at SDSC)

3 CSF4
- What is CSF
- CSF4 functionalities and services
- CSF4 architecture

4 CSF4 – What is CSF
- Full name: Community Scheduler Framework.
- CSF is a meta-scheduler that works at the grid level, contributed by Platform (a Canadian software company, maker of LSF).
- The first version of CSF, CSF3, was built on GT3-OGSI.
- CSF4 is the GT4-WSRF compliant version of CSF and provides the same functionality as CSF3.
- CSF is an open source project (the CVS mainline code is csf4).
- The CSF4 development team is from Jilin University, PRC.

5 CSF4 – What is CSF (cont.)
- CSF4 is a contribution to GT4.

6 What is CSF (cont.)
Meta-scheduler vs. local resource manager
- In a grid computing environment, users commonly need to query, negotiate access to, and manage resources that sit in different administrative domains. The meta-scheduler is designed to enforce such grid-wide policies.
- Within each local administrative domain, resource management software (RMs) such as LSF, PBS, and Sun Grid Engine is typically responsible for load balancing and resource sharing.

7 What is CSF (cont.)
Figure: a typical deployment of the meta-scheduler and RMs.

8 CSF4 – Functionalities
Functionalities provided by CSF4:
- Submit, control, and monitor jobs at the grid level.
- Create and manage advance reservations at the grid level (currently supported for LSF only).
- Send job and advance reservation operations to local resource managers.
- A plug-in scheduler interface, so that site-specific and user-specific scheduling policies can be implemented regardless of the underlying resource manager.
- Create queues of jobs, each with separately definable scheduling policies.

9 CSF4 – Services (cont.)
CSF4 starts the following services in the GT4 container.
Services available to grid users:
- JobService: csf-job-create, csf-job-start, csf-job-submit, csf-job-status, csf-job-stop, csf-job-resume, …
- ReservationService: csf-rsv-create, csf-rsv-status, csf-rsv-cancel, csf-job-submit, …
- QueuingService: csf-queue-create, csf-queue-conf, csf-queue-data

10 CSF4 – Services (cont.)
Services for internal use only:
- ResourceManagerFactoryService: used by the Job Service to start a job via an RM adapter (configuration is needed for LSF adapters).
- ResourceManagerLsfService: used by the Job Service to start a job in LSF via the LSF RM adapter.
- ResourceManagerGramService (new service): used by the Job Service to start a job through a gatekeeper via the Java CoG Kit.
These services are not meant to be called by users directly, and no user client is provided for them.

11 CSF4 – Scheduling plugins & scheduling policies
- Scheduling policies are defined per queue.
- Each policy is implemented inside a scheduling plugin module.
- A queue can load multiple plugin modules.
- The FCFS plugin is mandatory for all queues and is always loaded by CSF.
- The throttle plugin is an optional plugin provided by CSF that limits the number of jobs launched in one scheduling cycle.
- Users can write their own plugins to implement customized scheduling policies.
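The plugin model above can be sketched as a chain of policy objects applied to a queue's pending jobs. This is only an illustrative model, not CSF4's actual plugin interface: the names `Job`, `FcfsPlugin`, `ThrottlePlugin`, `Queue`, and `run_cycle` are all hypothetical, and the real plugins are loaded by CSF4 inside the GT4 container.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Job:
    job_id: str
    submit_order: int  # lower value means the job was submitted earlier


class FcfsPlugin:
    """Hypothetical FCFS policy: order jobs first come, first served."""

    def schedule(self, candidates: List[Job]) -> List[Job]:
        return sorted(candidates, key=lambda job: job.submit_order)


class ThrottlePlugin:
    """Hypothetical throttle policy: cap the jobs launched in one cycle."""

    def __init__(self, max_jobs_per_cycle: int) -> None:
        self.max_jobs_per_cycle = max_jobs_per_cycle

    def schedule(self, candidates: List[Job]) -> List[Job]:
        return candidates[: self.max_jobs_per_cycle]


class Queue:
    """A queue that always loads FCFS first, then any optional plugins."""

    def __init__(self, extra_plugins: Sequence = ()) -> None:
        # FCFS is mandatory in the slides, so it is not configurable here.
        self.plugins = [FcfsPlugin(), *extra_plugins]

    def run_cycle(self, pending: List[Job]) -> List[Job]:
        # Each plugin narrows or reorders the output of the previous one.
        selected = list(pending)
        for plugin in self.plugins:
            selected = plugin.schedule(selected)
        return selected


pending_jobs = [Job("c", 3), Job("a", 1), Job("b", 2)]
throttled_queue = Queue([ThrottlePlugin(max_jobs_per_cycle=2)])
launched = throttled_queue.run_cycle(pending_jobs)
print([job.job_id for job in launched])
```

A user-defined policy would, under these assumptions, be one more object with a `schedule` method appended to `extra_plugins`.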

12 CSF4 – Architecture

13 CSF4 – Architecture (cont.)
Terms:
- RM GRAM Adapter: services that submit jobs to a resource manager via the GRAM protocol (LSF/PBS/Condor/SGE).
- Specific RM Adapter: the meta-scheduler sends job and reservation requests to the RM. Every resource manager needs its own RM adapter; currently only the LSF adapter is available.
- RM LSF Adapter: can talk to remote LSF clusters and supports more functionality, such as advance reservation and complex job control.

14 CSF4 Integration with SGE (SGE 6)
- Before Globus Toolkit 4 was released, its planning documents said GT4 would not offer a scheduler adapter for SGE, so CSF4 did not support SGE either.
- A third-party adapter developed by London e-Science Centre, Gridwise Technologies and MCNC has since been released. The adapter supports only SGE 6.0.
- With its extensible architecture, CSF4 can support SGE easily.

15 CSF4 Integration with SGE (SGE 5.3)
- SGE 5.3 can currently integrate only with GT2.x.
- Most clusters at SDSC use GT2 and SGE 5.3, and GT2 is still popular, so GT2 gatekeeper support in CSF4 is significant.
- How to support GT2? The Java CoG Kit (Java Commodity Grid Kit) provides convenient access to grid middleware through the Java framework, and it supports GT2.

16 CSF4 Integration with SGE (SGE 5.3, cont.)
- ResourceManagerGramService (new service): used by the Job Service to start a job through a gatekeeper via the Java CoG Kit.
- Configure the gatekeeper in resourcemanager-config.xml, for example:
  <cluster> gatekeeper32 GRAM rocks-32.sdsc.edu/jobmanager-fork <version>2.4</version></cluster>

17 CSF4 Integration with Gfarm
Gfarm security:
- Shared secret key
- GSI authentication
- User certificate
- Delegate
- Proxy certificate

18 CSF4 Integration with Gfarm (cont.)
Four terms are introduced, starting with two kinds of proxy:
- Full proxy: generally, a proxy created by grid-proxy-init, or a proxy created from such a proxy by the full delegation mechanism.
- Limited proxy: a proxy created from a full proxy when it is delegated with the limited delegation mechanism. The first limited delegation creates a level 1 limited proxy. Any subsequent delegation (limited or full) of a level N limited proxy creates a level N+1 limited proxy.

19 CSF4 Integration with Gfarm (cont.)
Two kinds of delegation:
- Full delegation: full delegation of a full proxy results in a full proxy on the remote side. Full delegation of a level N limited proxy results in a level N+1 limited proxy.
- Limited delegation: limited delegation of a full proxy results in a level 1 limited proxy. Limited delegation of a level N limited proxy results in a level N+1 limited proxy.
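The delegation rules above reduce to a small function on proxy levels. The sketch below is a toy model for reasoning about delegation chains, not Globus code: the `Proxy` class, the convention that level 0 means a full proxy, and the `delegate` function are assumptions made for illustration.

```python
from dataclasses import dataclass

FULL_LEVEL = 0  # assumption of this sketch: level 0 stands for a full proxy


@dataclass(frozen=True)
class Proxy:
    level: int = FULL_LEVEL

    @property
    def is_full(self) -> bool:
        return self.level == FULL_LEVEL


def delegate(proxy: Proxy, limited: bool) -> Proxy:
    """Return the proxy that appears on the remote side after delegation."""
    if proxy.is_full:
        # Full delegation keeps a full proxy; limited delegation yields level 1.
        return Proxy(1) if limited else Proxy(FULL_LEVEL)
    # Any delegation of a level N limited proxy yields level N+1.
    return Proxy(proxy.level + 1)


user_proxy = Proxy()  # e.g. the result of grid-proxy-init
after_two_limited_hops = delegate(delegate(user_proxy, limited=True), limited=True)
after_two_full_hops = delegate(delegate(user_proxy, limited=False), limited=False)
print(after_two_limited_hops.level, after_two_full_hops.level)
```

The model makes one property easy to see: once a proxy becomes limited, no later delegation can turn it back into a full proxy.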

20 CSF4 Integration with Gfarm (cont.)
Three server authentication flags:
- Default (middle): a full proxy or a level 1 limited proxy is accepted for authentication (e.g. Gfarm 1.1).
- GSS_C_GLOBUS_LIMITED_PROXY_FLAG (strict): only a full proxy is accepted for authentication. This mode should be used by applications that do job start-up (e.g. the gatekeeper and WS-GRAM).
- GSS_C_GLOBUS_LIMITED_PROXY_MANY_FLAG (loose): any full proxy or limited proxy of any level is accepted (e.g. GridFTP, Gfarm 1.2).
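The three acceptance modes can be written as a single predicate over the proxy level. This is a sketch of the rules exactly as the slide states them, not of the GSS-API itself: the `ServerMode` enum and `accepts` function are hypothetical, and level 0 is used here to stand for a full proxy.

```python
from enum import Enum


class ServerMode(Enum):
    # Labels mirror the slide; these are not bindings to the real GSS flags.
    DEFAULT = "default"
    LIMITED_PROXY_FLAG = "GSS_C_GLOBUS_LIMITED_PROXY_FLAG"
    LIMITED_PROXY_MANY_FLAG = "GSS_C_GLOBUS_LIMITED_PROXY_MANY_FLAG"


def accepts(mode: ServerMode, proxy_level: int) -> bool:
    """Return True if a server in `mode` accepts a proxy of the given level.

    proxy_level 0 means a full proxy; N >= 1 means a level N limited proxy.
    """
    if proxy_level < 0:
        raise ValueError("proxy level must be non-negative")
    if mode is ServerMode.LIMITED_PROXY_FLAG:
        return proxy_level == 0   # strict: full proxy only
    if mode is ServerMode.DEFAULT:
        return proxy_level <= 1   # middle: full or level 1 limited
    return True                   # loose: any level


for level in (0, 1, 2):
    row = {mode.name: accepts(mode, level) for mode in ServerMode}
    print(level, row)
```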

21 CSF4 Integration with Gfarm (cont.)
Two factors:
- The WS-GRAM and gatekeeper clients delegate the user certificate with limited delegation.
- gfsd (the Gfarm 1.1 file node daemon) accepts only a full proxy or a level 1 limited proxy.
Two conclusions:
- CSF4 must delegate to the local scheduler with full delegation.
- CSF4 cannot reuse the WS-GRAM and gatekeeper client libraries.

22 CSF4 Integration with Gfarm (cont.)
How to support full delegation for WS-GRAM:
- Delegation Service: a new component of GT4 that provides an interface for delegating credentials to a hosting environment. This lets a single delegated credential be shared across multiple service invocations in that hosting environment.
- jobCredentialEndpoint: a new schema element supported in the job description. It is an EndpointReference that points to the delegated credential resource (i.e. a DelegationService resource).

23 CSF4 Integration with Gfarm (cont.)
How to make WS-GRAM run the job with full delegation:
- CSF4 first accesses the DelegationFactoryService of the hosting environment to which the job will be submitted and obtains a DelegationService EndpointReference. (With the EndpointReference, the user can retrieve a resource property that contains a full proxy.)
- CSF4 then adds the EndpointReference to the job description as jobCredentialEndpoint, and the job is delegated to the GRAM server with full delegation.
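The two steps above can be traced with in-memory stand-ins. Nothing below calls Globus: `DelegationFactoryStub`, the made-up EndpointReference string format, and the dictionary-based job description are assumptions chosen only to show the order of operations (delegate first, then reference the credential from the job description).

```python
from typing import Dict, Tuple


class DelegationFactoryStub:
    """Hypothetical stand-in for a hosting environment's delegation factory."""

    def __init__(self, host: str) -> None:
        self.host = host
        self._credentials: Dict[str, str] = {}

    def delegate(self, full_proxy: str) -> str:
        # Store the credential and hand back a reference to it. The string
        # format of this reference is invented for the sketch.
        epr = f"{self.host}/DelegationService/{len(self._credentials) + 1}"
        self._credentials[epr] = full_proxy
        return epr

    def resolve(self, epr: str) -> str:
        return self._credentials[epr]


def build_job_description(executable: str, credential_epr: str) -> Dict[str, str]:
    # jobCredentialEndpoint is the element name used in the slides.
    return {"executable": executable, "jobCredentialEndpoint": credential_epr}


def submit_with_full_delegation(
    factory: DelegationFactoryStub, executable: str, full_proxy: str
) -> Tuple[Dict[str, str], str]:
    # Step 1: delegate the credential and obtain the EndpointReference.
    epr = factory.delegate(full_proxy)
    # Step 2: reference the delegated credential from the job description.
    job = build_job_description(executable, epr)
    # The execution side resolves the reference to the credential it runs with.
    return job, factory.resolve(job["jobCredentialEndpoint"])


factory = DelegationFactoryStub("rocks-110")
job, credential = submit_with_full_delegation(factory, "/bin/hostname", "FULL-PROXY")
print(job["jobCredentialEndpoint"], credential)
```

Because the credential lives in the hosting environment and the job only carries a reference, the same delegated credential could be reused by several job descriptions.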

24 CSF4 Integration with Gfarm (cont.)

25 CSF4 Integration with Gfarm (cont.)
How to support full delegation for the gatekeeper:
- The Java CoG Kit supports full delegation.
- The class org.globus.gram.GramJob represents a simple GRAM job.
- This class supports full delegation.

26 CSF4 New Functionality
- One meta-scheduler works with multiple sites via GRAM.

27 CSF4 New Functionality (cont.)
- A new interface to query all available clusters.
- A new command-line tool, csf-job-RmInfo, prints information about all available clusters.

28 CSF4 – Demo environment
- Set up GT4/CSF4 at rocks-110 (frontend)
- Set up GT2 at the rocks-32 cluster
- Set up SGE 6.0u4 and the SGE adapter at the rocks-110 cluster
- Set up SGE 5.3 and the SGE adapter at the rocks-32 cluster
- Set up Gfarm 1.2 at the rocks-32 and rocks-110 clusters

29 CSF4 – Demo
Configuration for the CSF resource manager (resourcemanager-config.xml):
<cluster> gatekeeper32 GRAM rocks-32.sdsc.edu/jobmanager-fork <version>2.4</version></cluster>
<cluster> sge32 GRAM rocks-32.sdsc.edu/jobmanager-sge <version>2.4</version></cluster>
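As a rough illustration of how such cluster entries could be read, the sketch below parses a small XML sample into dictionaries. Only the `cluster` and `version` element names appear in the transcript; the `clusters` root and the `name`, `type`, and `host` elements are assumptions added so the sample is well-formed, and the real resourcemanager-config.xml schema may differ.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical sample: element names other than <cluster> and <version>
# are assumed, since the transcript lost the original tags.
SAMPLE_CONFIG = """
<clusters>
  <cluster>
    <name>gatekeeper32</name>
    <type>GRAM</type>
    <host>rocks-32.sdsc.edu/jobmanager-fork</host>
    <version>2.4</version>
  </cluster>
  <cluster>
    <name>sge32</name>
    <type>GRAM</type>
    <host>rocks-32.sdsc.edu/jobmanager-sge</host>
    <version>2.4</version>
  </cluster>
</clusters>
"""


def load_clusters(xml_text: str) -> List[Dict[str, str]]:
    """Return one dict per <cluster>, mapping child tag to its text."""
    root = ET.fromstring(xml_text.strip())
    return [
        {child.tag: (child.text or "").strip() for child in cluster}
        for cluster in root.iter("cluster")
    ]


clusters = load_clusters(SAMPLE_CONFIG)
for entry in clusters:
    print(entry)
```

Both demo entries point at the same host and differ only in the job manager suffix (`jobmanager-fork` versus `jobmanager-sge`), which is what selects the fork or SGE back end behind the GT2 gatekeeper.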

30 CSF4 – Demo (Job Execution)
- Demo 1: Query the available clusters
- Demo 2: Run a job in the local SGE 6.0
- Demo 3: Run a job that needs full delegation through the remote GT2 gatekeeper, using different delegation modes
- Demo 4: Run a Gfarm job in the remote SGE 5.3

31 CSF4 – Related work
Condor-G
- Condor-G integrates the Condor and Globus projects.
- It uses the GRAM protocol and supports only Globus Toolkit 2.x.

32 CSF4 – Related work Moab Grid Scheduler (SILVER)

33 CSF4 – Our Future Plans
- A new policy plug-in to support workflow
  - We are considering integrating CSF with INFORMNET
- A new policy plug-in to schedule Gfarm jobs in CSF4
  - Data-aware plug-in
  - Make grid-level and cluster-level data-aware scheduling work together efficiently

34 Seeking Collaboration Opportunities
Q&A
Thanks