Presentation is loading. Please wait.

Presentation is loading. Please wait.

Grid Workload Management (WP 1) Massimo Sgaravatto INFN Padova.

Similar presentations


Presentation on theme: "Grid Workload Management (WP 1) Massimo Sgaravatto INFN Padova."— Presentation transcript:

1 Grid Workload Management (WP 1) Massimo Sgaravatto INFN Padova

2

3 Functionalities foreseen for the 1 st release First release of job description language (JDL) used when the job is submitted, to specify the job characteristics Application Input data set id Output data location Resources (required and preferable) …

4 Functionalities foreseen for the 1 st release First version of resource broker, that chooses the computing resources where to submit jobs Published resource access lists are checked as a first step in the resource matchmaking The “accessible” resource are then matched with the job requests according to Availability of the requested input data set Availability of the appropriate application sandbox Availability of the requested amount of scratch space Resource characteristics and status We assume that all the above info is published in an Information Space

5 Functionalities foreseen for the 1 st release First version of job submission service First version of bookkeeping service For information related to jobs First version of logging For significant events occurred in the workload management system First version of user interface Command line Mainly for job management (job submission, job status monitoring, job removal)

6 Approach Fast prototyping As first prototype, let’s try to “put together” existing tools and technologies that could provide some useful services

7 1 st prototype Job description language: Condor ClassAds Resource Access Lists: Globus grid-mapfiles Information Space: Globus GIS Broker: Use of Condor matchmaking library Match between the info published in the Information Space and the ClassAds specified in the JDL Job submission service: Condor-G Submission of jobs to Globus resources (farms managed by local resource management systems)

8 Workload management system (1 st prototype) Globus GRAM CONDOR Globus GRAM LSF Globus GRAM PBS Site1 Site2Site3 Job submission service Condor-G Broker Grid Information Service (GIS) Submit jobs (using Class-Ads) Resource Discovery Information on characteristics and status of local resources Local Resource Management Systems Globus GRAM as uniform interface to different local resource management systems Condor-G able to provide a reliable/crashproof job submission service Master chooses in which Globus resources the jobs must be submitted Farms Other info

9 On-going activities Evaluating the existing components and “putting together” the various building blocks Evaluation of Globus Collaboration with WP 1 of INFN-GRID project (Evaluation of the Globus toolkit) http://www.infn.it/globus http://www.infn.it/globus Evaluation of Globus services INFNGRID distribution Toolkit to make Grid software (in particular Globus at the moment) deployment easier and more automatic Possibility to “implement” specific INFN customizations Certificates signed by INFN CA “Hierarchical” architecture of GIS

10 Globus GRAM evaluation Evaluation of GRAM functionalities, in particular to evaluate GRAM as uniform interface to different underlying resource management system (LSF, Condor, PBS) Necessary to “address” some “major”problems (i.e. scalability and reliability) Evaluation of Globus RSL as uniform language to describe resources More flexibility is required “Cooperation” between GRAM and GIS The information on characteristics and status of local resources (farms) and on jobs don’t meet our needs Proposal for a first possible modification of the default schema under discussion

11 Evaluation of Condor-G The current implementation is a prototype Various problems Problems with scalability in the submitting machine Problems with logging …

12 Tests Tests with a real CMS MC production Real applications (Pythia) Real production environments Jobs submitted from Padova using Condor-G and executed in Bologna (Condor) and Pisa (LSF) Many memory leaks found in the Globus jobmanager Fixes (provided by Francesco Prelz) submitted to Globus team

13 Layout for CMS production Globus GRAM CONDOR Globus GRAM LSF globusrun Bologna Pisa condor_submit (Globus Universe) Condor-G Submit jobs Local Resource Management Systems Production manager (Ivano Lippi – Padova) Farms Padova


Download ppt "Grid Workload Management (WP 1) Massimo Sgaravatto INFN Padova."

Similar presentations


Ads by Google