Resource access in the EGEE project
Massimo Sgaravatto, INFN Padova


INFN-GRID Workshop
www.eu-egee.org
EGEE is a project funded by the European Union under contract INFSO-RI-508833

Problem
- Provide an interface for requesting and using remote computing resources
  - Allow remote job submission and job control
- A computing resource is typically a cluster of PCs managed by a Local Resource Management System (LSF, PBS, …)
→ Single, standard interface needed

State of the art a short while ago
- Globus GRAM: de facto standard for resource access
  - Various job submission systems built on top of it
    - E.g. Condor-G
  - Used by most Grid projects worldwide
    - E.g. LCG
  - Allowed interoperability among different Grid systems

State of the art now
- Is Globus GRAM still the reference resource access system?
  - Various problems in design and implementation (highlighted after years of experience)
    - Performance problems
    - No mechanisms available to throttle the load on the head node caused by processing jobs
    - …
  - Globus support model far from perfect
  - Still waiting for Globus GRAM-2
- Many other systems are now also used for resource access
  - Condor-C based computing element
    - Deployed now in the EGEE prototype testbed
  - AliEn CE
  - ARC Computing Resource (NorduGrid)
  - Unicore
- No LCG-2 (Globus-based) CEs in the EGEE prototype testbed
  - WMS currently supports submission to LCG-2 based CEs and Condor-C based CEs

Resource access in EGEE
- We are also expected to address the resource access problem in the context of the EGEE project
  - We → JRA1 IT-CZ cluster → Padova group
- Goals
  - Provide a simple Computing Element that demonstrably allows efficient remote job submission and job control
    - To be used by the “Broker” (Workload Manager) or by a generic client (e.g. an end user)
  - Possibly address open problems not addressed by other systems
  - Stick to emerging standards
    - Service-oriented architecture
  - Facilitate the integration of other important software components already implemented or being implemented

CE functionality
- Job management
  - Run jobs
    - Including the staging of the required files
    - Support different job types
      - Simple batch jobs
      - Interactive jobs
      - MPI jobs
      - Checkpointable jobs
      - DAG jobs?
  - Get information to assess how “good” a specific CE is
    - How many resources (WNs) match the job requirements?
    - What is the estimated time before the job starts executing?
    - …

CE functionality (cont.)
- Job management (cont.)
  - Cancel jobs
  - Suspend and resume jobs
  - Send signals to jobs
  - Get job status
  - Be notified on specific job status
- Provision of information about the CE itself
  - CE characteristics
  - CE status
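The job management operations listed above can be pictured as transitions on a small job state machine. The following is only an illustrative sketch, not the actual CE interface: the class `ToyCE`, the state names, and the transition table are all hypothetical, chosen to mirror the operations on the slides (submit, cancel, suspend, resume, get status, be notified on a specific status).

```python
from enum import Enum


class JobState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUSPENDED = "suspended"
    DONE = "done"
    CANCELLED = "cancelled"


# Hypothetical transition table: operation -> {state it applies to: resulting state}
TRANSITIONS = {
    "start": {JobState.PENDING: JobState.RUNNING},
    "suspend": {JobState.RUNNING: JobState.SUSPENDED},
    "resume": {JobState.SUSPENDED: JobState.RUNNING},
    "finish": {JobState.RUNNING: JobState.DONE},
    "cancel": {
        JobState.PENDING: JobState.CANCELLED,
        JobState.RUNNING: JobState.CANCELLED,
        JobState.SUSPENDED: JobState.CANCELLED,
    },
}


class ToyCE:
    """In-memory stand-in for a Computing Element (illustration only)."""

    def __init__(self):
        self._jobs = {}        # job id -> JobState
        self._listeners = []   # (state of interest, callback)
        self._next_id = 1

    def submit(self, description):
        """Register a job and return its identifier; nothing is executed."""
        job_id = f"job-{self._next_id}"
        self._next_id += 1
        self._jobs[job_id] = JobState.PENDING
        return job_id

    def get_status(self, job_id):
        return self._jobs[job_id]

    def on_status(self, state, callback):
        """Ask to be notified when any job reaches the given state."""
        self._listeners.append((state, callback))

    def apply(self, job_id, operation):
        """Apply an operation, rejecting it if the current state forbids it."""
        current = self._jobs[job_id]
        allowed = TRANSITIONS[operation]
        if current not in allowed:
            raise ValueError(f"cannot {operation} a job in state {current.value}")
        new_state = allowed[current]
        self._jobs[job_id] = new_state
        for state, callback in self._listeners:
            if state is new_state:
                callback(job_id, new_state)
        return new_state
```

A client would call `submit`, then drive or observe the job through `apply` and `get_status`; a callback registered with `on_status` plays the role of the "be notified on specific job status" item.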

Push and pull
- CE architecture designed to support both the push and the pull model
  - Push model: the job is pushed to the CE by the Broker (Workload Manager, WM)
  - Pull model: the CE asks the Broker for jobs, according to local policies specified by the local admin (e.g. when the CE local queue is empty or nearly empty)
- These two models are somewhat mirrored in the resource information flow
  - In order to 'pull' a job, a resource must choose where to 'push' information about itself
    - Which Broker(s) must be notified?
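To make the difference between the two models concrete, here is a minimal sketch under stated assumptions: `Broker`, `ComputingElement`, the `low_watermark` policy, and the `batch` size are hypothetical names and parameters, not part of the EGEE software. It shows a Broker pushing a job to a CE, and a CE that first advertises itself to the Broker and then pulls jobs when its local queue is nearly empty.

```python
from collections import deque


class Broker:
    """Toy Workload Manager holding jobs that are not yet assigned."""

    def __init__(self, jobs):
        self.waiting = deque(jobs)
        self.known_ces = {}  # CE name -> number of free slots it advertised

    def advertise(self, ce_name, free_slots):
        # To pull jobs, a CE first pushes information about itself.
        self.known_ces[ce_name] = free_slots

    def request_jobs(self, ce_name, max_jobs):
        """Hand out up to max_jobs jobs, but only to a CE that advertised itself."""
        if ce_name not in self.known_ces:
            return []
        granted = []
        while self.waiting and len(granted) < max_jobs:
            granted.append(self.waiting.popleft())
        return granted

    def push(self, ce):
        """Push model: the Broker chooses the CE and sends it the next job."""
        if self.waiting:
            ce.accept(self.waiting.popleft())


class ComputingElement:
    def __init__(self, name, low_watermark=1, batch=2):
        self.name = name
        self.queue = deque()
        self.low_watermark = low_watermark  # local policy set by the site admin
        self.batch = batch

    def accept(self, job):
        self.queue.append(job)

    def pull_if_idle(self, broker):
        """Pull model: ask for work when the local queue is empty or nearly empty."""
        if len(self.queue) > self.low_watermark:
            return 0
        broker.advertise(self.name, self.batch)
        jobs = broker.request_jobs(self.name, self.batch)
        for job in jobs:
            self.accept(job)
        return len(jobs)
```

In this sketch the question "which Broker(s) must be notified?" reduces to which `Broker` objects the CE calls `advertise` on; a real deployment would need a policy for that choice.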

Support for heterogeneous CEs
- Goal: support CEs encompassing heterogeneous (in hardware, software, enforced policies) Worker Nodes
  - The underlying resource management system must be instructed so that the job gets dispatched to a WN matching the specified requirements
- Requiring that all WNs “belonging” to the same CE be homogeneous (EDG/LCG model) was considered a huge pain by sysadmins
  - Need to set up many different homogeneous CEs → need to set up many different batch queues
- Problem not fully addressed by any existing resource access system
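Dispatching to a matching Worker Node inside a heterogeneous CE amounts to filtering the nodes by the job's requirements. The sketch below is an assumption-laden illustration: the attribute names (`os`, `memory_mb`), the node names, and the predicate-based requirement format are invented here and do not correspond to the Glue schema or to any JDL syntax.

```python
def matching_nodes(worker_nodes, requirements):
    """Return the names of the worker nodes that satisfy every requirement.

    worker_nodes: dict mapping node name -> dict of attributes
    requirements: dict mapping attribute name -> predicate on its value
    A node lacking a required attribute does not match.
    """
    return [
        name
        for name, attrs in worker_nodes.items()
        if all(key in attrs and check(attrs[key])
               for key, check in requirements.items())
    ]


def assess(worker_nodes, requirements):
    """How many worker nodes of this CE match the job requirements?"""
    return len(matching_nodes(worker_nodes, requirements))


# Hypothetical CE with three non-identical worker nodes
worker_nodes = {
    "wn01": {"os": "SL3", "memory_mb": 1024},
    "wn02": {"os": "SL3", "memory_mb": 4096},
    "wn03": {"os": "Debian", "memory_mb": 4096},
}

# Hypothetical job requirements: a given OS and at least 2 GB of memory
requirements = {
    "os": lambda value: value == "SL3",
    "memory_mb": lambda value: value >= 2048,
}

eligible = matching_nodes(worker_nodes, requirements)
```

With a single CE fronting all three nodes, the site needs no separate homogeneous CE or batch queue per node type; the count returned by `assess` is also one possible answer to "how many resources (WNs) match the job requirements?".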

Integration of other software components
- Having “control” of the Computing Element can facilitate the integration of other software components to be deployed in the CE
  - Endless discussions with other partners are no longer necessary
- Relevant examples:
  - Grid accounting
    - DGAS sensors for resource metering
  - Policy framework
    - Integration of G-PBox
      - For setting site policies
      - For policy evaluation given a submission request
  - Resource monitoring
    - GridICE
  - Resource reservation and co-allocation (?)

CE Architecture
[Diagram. Labels: Client; job management requests JobSubmit, JobAssess, JobKill, JobSuspend, JobResume, JobGetStatus; CE (Web service accepting job management requests); Mon; LSF, PBS, ?; Worker Nodes]

CE Architecture
[Diagram. Labels: Client; CE; Mon; notifications (asynchronous notifications about job/CE events); job requests (for a CE working in pull mode); LSF, PBS, ?; Worker Nodes]

CEMon – InformationSuperMarket
- Repository of resource information available to the matchmaker
- Updated via notifications and/or active polling of sources
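A repository that is refreshed both by incoming notifications and by active polling can be sketched as follows. This is only an illustration of the idea, not the ISM implementation: the class below, its method names, the `max_age` staleness rule, and the `free_cpus` attribute are all assumptions made for the example.

```python
class InformationSuperMarket:
    """Toy repository of resource information, queried by a matchmaker."""

    def __init__(self):
        self._records = {}  # CE name -> (timestamp, info dict)

    def notify(self, ce_name, info, timestamp):
        """Store information pushed by a CE unless a newer record is held."""
        current = self._records.get(ce_name)
        if current is None or timestamp >= current[0]:
            self._records[ce_name] = (timestamp, dict(info))

    def poll(self, sources, now, max_age):
        """Query sources whose record is missing or older than max_age.

        sources: dict mapping CE name -> zero-argument callable returning info
        Returns the names of the CEs that were refreshed.
        """
        refreshed = []
        for ce_name, fetch in sources.items():
            current = self._records.get(ce_name)
            if current is None or now - current[0] > max_age:
                self.notify(ce_name, fetch(), now)
                refreshed.append(ce_name)
        return refreshed

    def lookup(self, predicate):
        """Return the sorted names of the CEs whose information satisfies predicate."""
        return sorted(
            name for name, (_, info) in self._records.items() if predicate(info)
        )
```

Notifications keep the repository fresh for CEs that publish events, while polling covers sources that stay silent; the timestamp check keeps a late, outdated notification from overwriting newer data.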

Status
- CEMon: first implementation available
  - Provides information about the CE according to the Glue schema
    - For Globus-based (LCG-2) CEs
    - For Condor-C CEs
  - Clients can subscribe to get notifications, and then “manage” subscriptions
  - Web service interface
    - “Plain” Web services
    - We had to drop the proposed approach of relying on WS-BaseNotification (seen as an emerging standard)
      - Lack of support from “official” software releases (e.g. Axis)
  - Deployed in the gLite prototype testbed
  - Integration with the ISM (main customer) ongoing
- CE
  - Interfaces and internal architecture defined
    - WSDL available
  - Implementation being started

Standardization - Interoperability
- Many different incompatible resource access systems
  - In this scenario interoperability between different Grids would not be guaranteed!
  - This is an issue
- No serious efforts to address this issue within GGF
- Try to define some standard interfaces, following the process used with SRM or with the Glue schema?
- Issue raised with Globus (I. Foster) and Condor (M. Livny)
  - They agreed to come up with a proposal on how to proceed

Conclusions
- Globus GRAM is no longer the only player in the resource access market
  - Many other resource access mechanisms implemented or being implemented
- We are also expected to tackle the problem
  - Many problems experienced with Globus GRAM will likely be addressed
- Standardization and interoperability will be the real issues!