DataGrid is a project funded by the European Commission under contract IST-2000-25182. 3rd EU Review – 19-20/02/2004. WP1 activity, achievements and plans.


DataGrid is a project funded by the European Commission under contract IST-2000-25182. 3rd EU Review – 19-20/02/2004.
WP1 activity, achievements and plans
Where we came from, what we did and learned, how we can put this to good use.
Francesco Prelz

Slide 2: Talk Outline
- Objectives of WP1, and how they matched the structure and timeline of the DataGrid project.
- The WP1 workload management solution, and what WP1 delivered over time.
- Lessons learned.
- Plans for the ‘present’ and the future.
- Questions. (10’)

Slide 3: WP1 objectives
Or, where we started from...
- Task we were given in the TA: “To define and implement a suitable architecture for distributed scheduling and resource management on a GRID environment”.
  - Real life scientific applications meet ‘computer science’ grid projects, with mutual benefit.
  - And with the guarantee of a fast-paced three-year schedule.
    - First version of software to be delivered at project month 9!
- In other words:
  - We found little room for a traditional requirements/architecture/implementation/deployment cycle.
  - But this was good to get early user involvement and feedback.

Slide 4: The beginnings of WP1
- A number of products and services that already existed when the project started were evaluated, and many were adopted.

Slide 5: What did WP1 develop?
- To provide “an integrated workload management system, with distributed scheduling”, we needed to add:
  - A well-defined User Interface, API and GUI, as lightweight as reasonably achievable.
  - A modular match-making/"brokering" service. Its policies make use of resource authorisation information and data requirements.
  - Scaffolding around CondorG, the “reliable job submission” component.
  - A Logging and Book-keeping service to build a uniform view of jobs out of information coming from many system components.
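
As an illustration of the match-making idea on this slide (policies that combine resource authorisation information with data requirements), here is a minimal sketch. It is not the WP1 Resource Broker code: the `Resource` and `Job` classes, their field names and the ranking rule are all hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Resource:
    """Hypothetical description of a computing resource."""
    name: str
    authorised_vos: Set[str]   # user communities allowed to run here (assumed field)
    local_datasets: Set[str]   # datasets close to this resource (assumed field)
    free_slots: int


@dataclass
class Job:
    """Hypothetical job request."""
    owner_vo: str
    input_datasets: Set[str] = field(default_factory=set)


def match(job: Job, resources: List[Resource]) -> Optional[Resource]:
    """Pick a resource the job may use, preferring the one nearest its data."""
    # Authorisation filter: drop resources the owner may not use or that are full.
    candidates = [
        r for r in resources
        if job.owner_vo in r.authorised_vos and r.free_slots > 0
    ]
    if not candidates:
        return None
    # Data-aware ranking: most input datasets already local, then most free slots.
    return max(
        candidates,
        key=lambda r: (len(job.input_datasets & r.local_datasets), r.free_slots),
    )
```

In this sketch authorisation acts as a hard filter and data locality as a soft preference; a real broker could weigh the two differently.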

Slide 6: The WP1 development cycle
- The first WP1 prototype release was delivered at PM 10.
- It was functional, with the novel ability to deal with data requirements, but soon met a very steep scalability barrier.
- Early feedback and troubleshooting in the field were vital in locating the trouble spots (will talk briefly about this process).
- The experience was merged into the design for Release 2.
- Batch, interactive, parallel and checkpointable jobs are supported.
- Release 2 survived tests to the scale of workload requests a day. The services on an individual RB node are routinely tested with storms of 2000 requests from 20 concurrent users.
- The cycle is continuing...
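
The "storm" test mentioned on this slide (2000 requests from 20 concurrent users) can be pictured with a small driver like the one below. This is only a sketch under stated assumptions: `submit` is a hypothetical callable standing in for a real job submission, and the even split of 100 requests per user is assumed rather than taken from the slide.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_storm(
    submit: Callable[[int, int], bool],
    users: int = 20,
    total_requests: int = 2000,
) -> int:
    """Fire total_requests submissions from `users` concurrent threads.

    Returns the number of submissions that `submit` reported as accepted.
    """
    per_user = total_requests // users  # assumed: requests are split evenly

    def user_session(user_id: int) -> int:
        # Each simulated user submits its share of requests back to back.
        return sum(1 for i in range(per_user) if submit(user_id, i))

    with ThreadPoolExecutor(max_workers=users) as pool:
        return sum(pool.map(user_session, range(users)))
```

Comparing the returned count with `total_requests` gives a first, coarse indication of whether the service under test dropped or refused any request.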

Slide 7: Expectations and reality
- Problems we expected:
  - Service scalability problems, with increasing user and job load.
  - We imagined we would simply hit the resource roof on one node, and address this with service configuration.
- Problems where we spent most time:
  - Controlling the direct and indirect interplay of the various integrated components.
  - Addressing stability issues and bottlenecks in a non-linear system.
  - Predicting (or failing to predict) where the next bottleneck will appear in the job processing network.

Slide 8: Lessons we learned
- Many “principles” at the level of implementation:
  - Always apply reliable, properly backtraceable communication among services.
  - For every component there should be another one ‘taking care’ of it.
  - Monolithic, long-lived processes should be avoided when there are many, complex layers of external dependencies.
  - Etc, etc…: these could be captured for Release 2!
- The effort of integrating 30+ external packages tends to be underestimated.
- A test strategy should appear at the same time as the code.
- To safely process thousands of requests, the ‘high watermark’ of the system should be in the tens of thousands…
- Many thanks to our colleagues in the LCG Certification Group.
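
The principle that every component should have another one ‘taking care’ of it can be sketched as a small watchdog. This is an illustrative sketch, not WP1 code: the `Supervised` and `Watchdog` names, the liveness probe and the restart action are all hypothetical.

```python
from typing import Callable, Dict, List


class Supervised:
    """Hypothetical handle on a component: a liveness probe plus a restart action."""

    def __init__(self, is_alive: Callable[[], bool], restart: Callable[[], None]):
        self.is_alive = is_alive
        self.restart = restart


class Watchdog:
    """Checks registered components and restarts the ones that are down."""

    def __init__(self) -> None:
        self._components: Dict[str, Supervised] = {}

    def register(self, name: str, component: Supervised) -> None:
        self._components[name] = component

    def check_once(self) -> List[str]:
        """Run one supervision pass; return the names that were restarted."""
        restarted = []
        for name, component in self._components.items():
            if not component.is_alive():
                component.restart()
                restarted.append(name)
        return restarted
```

A real deployment would call `check_once` periodically and would itself need supervising, which is the point of the principle.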

Slide 9: What else did WP1 develop?
- Grid accounting infrastructure, based on an economic model (DGAS).
- Infrastructure and user API for automatically partitionable jobs.
- Support for job dependencies (via DAGMan), demonstrated at this review.
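
Job dependencies of the kind mentioned on this slide form a directed acyclic graph: a job may start only after all of its parents have finished. The sketch below shows that ordering rule in plain Python. It does not use DAGMan or its input format; the function name and the dictionary layout are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List, Set


def submission_order(parents: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which every job comes after all of its parents.

    `parents` maps each job name to the set of jobs it depends on.
    Raises ValueError if the dependencies contain a cycle.
    """
    # Work on a copy so the caller's data is left untouched.
    remaining = {job: set(deps) for job, deps in parents.items()}
    # Jobs that appear only as parents have no dependencies of their own.
    for deps in parents.values():
        for dep in deps:
            remaining.setdefault(dep, set())

    ready = deque(sorted(job for job, deps in remaining.items() if not deps))
    order: List[str] = []
    while ready:
        job = ready.popleft()
        order.append(job)
        # Release every job that was waiting only on the one just scheduled.
        for other in sorted(remaining):
            deps = remaining[other]
            if job in deps:
                deps.remove(job)
                if not deps:
                    ready.append(other)

    if len(order) != len(remaining):
        raise ValueError("dependency cycle detected")
    return order
```

For example, with `{"B": {"A"}, "C": {"A"}, "D": {"B", "C"}}` job A is scheduled first and D only after both B and C.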

Slide 10: A much more complex picture...

Slide 11: The bright future...
- There are still a few issues to be addressed to harden the system:
  - Collection/distribution of resource information. How do we get to know that a resource is there?
  - Resource access. How do we know that a resource is available now?
  - Handling of data movement/"placement". How do we move data at the right time?
  - Streamlining of error reporting. Why is this still going wrong?
  - System scalability. How can I submit these 1 million jobs and keep them together?
- There is also a commitment by basically all partners and staff of WP1 to continue in the context of future projects.
- LCG, CrossGrid, DataTAG, GridPP and grid.it are now using and developing on the WP1 code. We keep supporting them.

Slide 12: Thanks everyone!
- We thank the European Commission for having given us the opportunity for development in this field.
- We acknowledge the support of our funding agencies: INFN, CESNET and PPARC.
- We value very much the cultural exchange and partnership with Datamat Spa, our industrial partner.
- We wish to thank all the colleagues in DataGrid for the professional growth coming from our many interchanges.
Thanks to you all.

Slide 13: A little bibliography
- Home page for the Grid Workload Management workpackage of the DataGrid project.
- Home page for the Condor project.
- Home page for the Globus project.
- EU DataGrid WP1 (G. Avellino et al.), "The EU DataGrid Workload Management System: towards the second major release", talk at the 2003 Computing in High Energy and Nuclear Physics conference (CHEP03), La Jolla, CA, USA, March 2003.
- EU DataGrid WP1 (G. Avellino et al.), "The first deployment of workload management services on the EU DataGrid Testbed: feedback on design and implementation", talk at the 2003 Computing in High Energy and Nuclear Physics conference (CHEP03), La Jolla, CA, USA, March 2003.