Resource access in the EGEE project Massimo Sgaravatto INFN Padova

1 Resource access in the EGEE project
INFN-GRID Workshop
Massimo Sgaravatto, INFN Padova
EGEE is a project funded by the European Union under contract INFSO-RI

2 Problem
- Provide an interface for requesting and using remote computing resources
  - Allow remote job submission and job control
- A computing resource is typically a cluster of PCs managed by a Local Resource Management System (LSF, PBS, …)
- A single, standard interface is needed

3 State of the art a short while ago
- Globus GRAM: de facto standard for resource access
- Various job submission systems built on top of it
  - E.g. Condor-G
- Used by most Grid projects worldwide
  - E.g. LCG
- Allowed interoperability among different Grid systems

4 State of the art now
- Is Globus GRAM still the reference resource access system?
  - Various problems in design and implementation (highlighted after years of experience)
    - Performance problems
    - No mechanisms available to throttle the load on the head node caused by processing jobs
  - Globus support model far from perfect
  - Still waiting for Globus GRAM-2
- Many other systems are now also used for resource access
  - Condor-C based computing element
    - Deployed now in the EGEE prototype testbed
  - AliEn CE
  - ARC Computing Resource (NorduGrid)
  - Unicore
- No LCG-2 (Globus based) CEs in the EGEE prototype testbed
  - WMS right now supports submission to LCG-2 based CEs and Condor-C based CEs

5 Resource access in EGEE
- We are also supposed to address the resource access problem in the context of the EGEE project
  - We: JRA1 IT-CZ cluster, Padova group
- Goals
  - Provide a simple Computing Element that demonstrably allows efficient remote job submission and job control
    - To be used by the “Broker” (Workload Manager) or by a generic client (e.g. an end user)
  - Possibly address open problems not addressed by other systems
  - Stick to emerging standards
    - Service-oriented architecture
  - Facilitate the integration of other important software components already implemented or being implemented

6 CE functionality
- Job management
  - Run jobs
    - Including the staging of the required files
    - Support different job types: simple batch jobs, interactive jobs, MPI jobs, checkpointable jobs, DAG jobs (?)
  - Get information to assess how “good” a specific CE is
    - How many resources (WNs) match the job requirements?
    - What is the estimated time before the job starts executing?

7 CE functionality
- Job management (cont'd)
  - Cancel jobs
  - Suspend and resume jobs
  - Send signals to jobs
  - Get job status
  - Be notified of specific job statuses
- Provision of information about the CE itself
  - CE characteristics
  - CE status

8 Push and pull
- CE architecture designed to support both the push and the pull model
  - Push model: the job is pushed to the CE by the Broker (Workload Manager, WM)
  - Pull model: the CE asks the Broker for jobs, according to local policies specified by the local admin (e.g. when the CE local queue is empty or nearly empty)
- These two models are somewhat mirrored in the resource information flow
  - In order to 'pull' a job, a resource must choose where to 'push' information about itself
  - Which Broker(s) must be notified?
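The difference between the two models can be illustrated with a minimal in-memory sketch. Everything here is hypothetical and not taken from the EGEE software: the `Broker` and `ComputingElement` classes, their method names, and the `low_watermark` policy (pull only when the local queue is at or below that length) are assumptions made for illustration.

```python
from collections import deque


class Broker:
    """Hypothetical Workload Manager holding jobs not yet dispatched."""

    def __init__(self):
        self.pending = deque()

    def push(self, job, ce):
        # Push model: the broker picks the CE and hands the job over.
        ce.accept(job)

    def request_job(self, ce_name):
        # Pull model: a CE asks for work; None means nothing is pending.
        return self.pending.popleft() if self.pending else None


class ComputingElement:
    """Hypothetical CE with a local queue and a simple pull policy."""

    def __init__(self, name, low_watermark=0):
        self.name = name
        self.queue = deque()
        # Assumed local policy: pull only when the queue is this short.
        self.low_watermark = low_watermark

    def accept(self, job):
        self.queue.append(job)

    def maybe_pull(self, broker):
        # Ask the broker for a job only when the local policy allows it.
        if len(self.queue) > self.low_watermark:
            return False
        job = broker.request_job(self.name)
        if job is None:
            return False
        self.accept(job)
        return True


broker = Broker()
ce = ComputingElement("ce-pull")
broker.push("job-A", ce)                    # push: the broker decides
broker.pending.append("job-B")
pulled_while_busy = ce.maybe_pull(broker)   # queue not empty: no pull
ce.queue.popleft()                          # job-A leaves the local queue
pulled_when_empty = ce.maybe_pull(broker)   # queue empty: the CE pulls job-B
```

In the pull case the CE has to know which broker to contact, which is the "which Broker(s) must be notified" question raised on the slide.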

9 Support for heterogeneous CEs
- Goal: support CEs encompassing heterogeneous (in hardware, software, enforced policies) Worker Nodes
  - The underlying resource management system must be instructed so that the job gets dispatched to a WN matching the specified requirements
- Requiring that all WNs “belonging” to the same CE be homogeneous (EDG/LCG model) was considered a huge pain by sysadmins
  - Need to set up many different homogeneous CEs, and therefore many different batch queues
- Problem not fully addressed by any existing resource access system
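Dispatching within a heterogeneous CE comes down to filtering Worker Nodes by the job requirements. The sketch below is only an illustration: the `WorkerNode` attributes (`arch`, `memory_mb`, `free`) and the two helper functions are hypothetical, and a real CE would translate the requirements into directives for the underlying batch system rather than select nodes itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerNode:
    # Hypothetical attributes; a real CE would publish many more.
    name: str
    arch: str
    memory_mb: int
    free: bool


def matching_nodes(nodes, arch, min_memory_mb):
    """Return the worker nodes that satisfy the job requirements."""
    return [n for n in nodes if n.arch == arch and n.memory_mb >= min_memory_mb]


def pick_node(nodes, arch, min_memory_mb):
    """Pick a free matching node, or None if the job has to wait."""
    for node in matching_nodes(nodes, arch, min_memory_mb):
        if node.free:
            return node
    return None


nodes = [
    WorkerNode("wn01", "i686", 512, free=True),
    WorkerNode("wn02", "x86_64", 2048, free=False),
    WorkerNode("wn03", "x86_64", 4096, free=True),
    WorkerNode("wn04", "x86_64", 1024, free=True),
]

candidates = matching_nodes(nodes, arch="x86_64", min_memory_mb=2048)
chosen = pick_node(nodes, arch="x86_64", min_memory_mb=2048)
```

The length of `candidates` also answers the assessment question "how many WNs match the job requirements" for a single CE, without needing one homogeneous CE per hardware type.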

10 Integration of other software components
- Having “control” of the Computing Element can facilitate the integration of other software components to be deployed in the CE
  - Endless discussions with other partners are no longer necessary
- Relevant examples:
  - Grid accounting
    - DGAS sensors for resource metering
  - Policy framework
    - Integration of G-PBox
      - For setting site policies
      - For policy evaluation given a submission request
  - Resource monitoring
    - GridICE
  - Resource reservation and co-allocation (?)

11 CE Architecture
Diagram elements:
- Client, issuing JobSubmit, JobAssess, JobKill, JobSuspend, JobResume, JobGetStatus
- CE: Web service accepting job management requests
- CE Mon
- Local resource management system: LSF, PBS, ?
- Worker Nodes
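The job management operations named in the diagram can be pictured as methods on a single service object. This is a rough in-memory sketch, not the CE's actual WSDL interface: the Python method names mirror the operation names, but the signatures, the status strings (`RUNNING`, `SUSPENDED`, `KILLED`) and the allowed transitions are all assumptions. JobAssess is left out because the slides do not say what it returns.

```python
import itertools


class ComputingElementService:
    """In-memory stand-in for the CE job management interface."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._status = {}  # job id -> assumed status string

    def job_submit(self, description):
        # A real CE would hand the job to LSF/PBS; here it "runs" at once.
        job_id = f"job-{next(self._ids)}"
        self._status[job_id] = "RUNNING"
        return job_id

    def job_suspend(self, job_id):
        self._transition(job_id, "RUNNING", "SUSPENDED")

    def job_resume(self, job_id):
        self._transition(job_id, "SUSPENDED", "RUNNING")

    def job_kill(self, job_id):
        # Killing is allowed from any known state in this sketch.
        if job_id not in self._status:
            raise KeyError(job_id)
        self._status[job_id] = "KILLED"

    def job_get_status(self, job_id):
        return self._status[job_id]

    def _transition(self, job_id, expected, new):
        # Reject operations that do not make sense in the current state.
        current = self._status[job_id]
        if current != expected:
            raise ValueError(f"{job_id} is {current}, expected {expected}")
        self._status[job_id] = new


svc = ComputingElementService()
job_id = svc.job_submit("/bin/hostname")
svc.job_suspend(job_id)
status_after_suspend = svc.job_get_status(job_id)
svc.job_resume(job_id)
svc.job_kill(job_id)
final_status = svc.job_get_status(job_id)
```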

12 CE Architecture
Diagram elements:
- Client (labels: Notifications, Job requests)
- CE
- CE Mon, sending:
  - Asynchronous notifications about job/CE events
  - Job requests (for a CE working in pull mode)
- Local resource management system: LSF, PBS, ?
- Worker Nodes
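The notification flow can be sketched as a small publish/subscribe registry. The `Monitor` class, its methods and the topic names below are hypothetical, and delivery here is a synchronous in-process callback; the real CE Mon is a Web service that notifies remote clients asynchronously.

```python
import itertools


class Monitor:
    """Hypothetical stand-in for CE Mon's subscription handling."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._subs = {}  # subscription id -> (topic, callback)

    def subscribe(self, topic, callback):
        sub_id = next(self._ids)
        self._subs[sub_id] = (topic, callback)
        return sub_id

    def unsubscribe(self, sub_id):
        # Removing an unknown subscription is silently ignored.
        self._subs.pop(sub_id, None)

    def publish(self, topic, event):
        # Deliver the event to every subscriber of this topic.
        delivered = 0
        for sub_topic, callback in list(self._subs.values()):
            if sub_topic == topic:
                callback(event)
                delivered += 1
        return delivered


received = []
mon = Monitor()
sub = mon.subscribe("job-status", received.append)
first = mon.publish("job-status", {"job": "job-1", "status": "RUNNING"})
other = mon.publish("ce-status", {"free_slots": 3})   # no subscriber
mon.unsubscribe(sub)
after = mon.publish("job-status", {"job": "job-1", "status": "DONE"})
```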

13 CEMon – InformationSuperMarket
- Repository of resource information available to the matchmaker
- Updated via notifications and/or active polling of the sources
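The two update paths (notifications and active polling) can be combined in one repository, as in the sketch below. It is an assumption-laden illustration rather than the ISM implementation: the class layout, the `max_age` staleness rule, the explicit `now` timestamps and the `free_slots` attribute are all invented for the example.

```python
class InformationSuperMarket:
    """Hypothetical repository of CE information for the matchmaker."""

    def __init__(self, max_age):
        self.max_age = max_age   # assumed staleness threshold
        self._entries = {}       # CE name -> (info, timestamp)

    def on_notification(self, ce_name, info, now):
        # Update path 1: the CE (via CEMon) notifies us of new information.
        self._entries[ce_name] = (info, now)

    def poll_stale(self, sources, now):
        # Update path 2: actively poll sources whose entry is missing or old.
        refreshed = []
        for ce_name, fetch in sources.items():
            entry = self._entries.get(ce_name)
            if entry is None or now - entry[1] > self.max_age:
                self._entries[ce_name] = (fetch(), now)
                refreshed.append(ce_name)
        return refreshed

    def lookup(self, ce_name):
        return self._entries[ce_name][0]


ism = InformationSuperMarket(max_age=60)
ism.on_notification("ce1", {"free_slots": 4}, now=0)

sources = {
    "ce1": lambda: {"free_slots": 2},
    "ce2": lambda: {"free_slots": 8},
}
refreshed_early = ism.poll_stale(sources, now=30)   # only ce2 is missing
refreshed_late = ism.poll_stale(sources, now=100)   # both entries are stale
```

Passing `now` explicitly keeps the sketch deterministic; a real service would read a clock instead.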

14 Status
- CEMon
  - First implementation available
  - Provides information about the CE according to the Glue schema
    - For Globus-based (LCG-2) CEs
    - For Condor-C CEs
  - Clients can subscribe to get notifications, and then “manage” their subscriptions
  - Web service interface
    - “Plain” Web services
    - We had to drop the proposed approach of relying on WS-BaseNotification (seen as an emerging standard)
      - Lack of support in “official” software releases (e.g. Axis)
  - Deployed in the gLite prototype testbed
  - Integration with the ISM (main customer) ongoing
- CE
  - Interfaces and internal architecture defined
  - WSDL available
  - Implementation being started

15 Standardization - Interoperability
- Many different, incompatible resource access systems
  - In this scenario interoperability between different Grids would not be guaranteed!
  - This is an issue
- No serious efforts to address this issue within GGF
- Try to define some standard interfaces, following the process used with SRM or with the Glue schema?
- Issue raised with Globus (I. Foster) and Condor (M. Livny)
  - They agreed to come up with a proposal on how to proceed

16 Conclusions
- Globus GRAM is no longer the only player in the market as far as resource access is concerned
  - Many other resource access mechanisms implemented or being implemented
- We are also supposed to tackle the problem
  - Many problems experienced with Globus GRAM will likely be addressed
- Standardization and interoperability will be the real issues!

