
1 gLite, the next generation middleware for Grid computing. Oxana Smirnova (Lund/CERN). Nordic Grid Neighborhood Meeting, Linköping, October 20, 2004. Uses material from E. Laure and F. Hemmer.

2 gLite. What is gLite (quoted from http://www.glite.org):
 "the next generation middleware for grid computing"
 "collaborative efforts of more than 80 people in 10 different academic and industrial research centers"
 "part of the EGEE project (http://www.eu-egee.org)"
 "bleeding-edge, best-of-breed framework for building grid applications tapping into the power of distributed computing and storage resources across the Internet"
[Diagram: EGEE activity areas.] Nordic contributors: HIP, PDC, UiB.

3 Architecture guiding principles
Lightweight services
 Easily and quickly deployable
 Use existing services where possible as a basis for re-engineering
 "Lightweight" does not mean fewer services or non-intrusiveness – it means modularity
Interoperability
 Allow for multiple implementations
Performance/scalability and resilience/fault tolerance
 Large-scale deployment and continuous usage
Portability
 Being built on Scientific Linux and Windows
Co-existence with deployed infrastructure
 Reduce requirements on participating sites
 Flexible service deployment
 Multiple services running on the same physical machine (if possible)
 Co-existence with LCG-2 and OSG (US) is essential for the EGEE Grid service
Service-oriented approach
60+ external dependencies …

4 Service-oriented approach. Adopt the Open Grid Services Architecture, with components that are:
 Loosely coupled (by messages)
 Accessible across the network; modular and self-contained; clean modes of failure
 Able to change implementation without changing interfaces
 Able to be developed in anticipation of new use cases
Follow WSRF standardization:
 No mature WSRF implementations exist to date, so start with plain WS (see the client-side sketch below)
 WSRF compliance is not an immediate goal, but the WSRF evolution is followed
 WS-I compliance is important
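The slide shows no client code; as a hypothetical illustration of what "plain WS" means in practice, a SOAP/WSDL client in Python might look like the following. The endpoint URL and the submitJob operation are invented for this sketch, and zeep is just a generic third-party SOAP library, not part of gLite.

```python
# Hypothetical sketch: calling a plain-WS (SOAP/WSDL) grid service.
# The WSDL URL and the submitJob operation are invented for illustration.
from zeep import Client  # generic third-party SOAP client library

client = Client("https://wms.example.org/glite?wsdl")  # hypothetical endpoint
# Loose coupling: the caller depends only on the WSDL contract, so the
# service implementation can change without breaking this client.
result = client.service.submitJob('Executable = "/bin/hostname";')
print(result)
```

Because components are self-contained and message-coupled, a later move to a WSRF-compliant implementation would change the contract the client loads, not the calling pattern.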

5 gLite vs LCG-2. [Diagram: LCG-1 and LCG-2 are Globus 2 based; gLite-1 and gLite-2 are Web services based.]
 Intended to replace LCG-2
 Starts with existing components
 Aims to address LCG-2 shortcomings and advanced needs from applications (in particular, feedback from the data challenges)
 Prototyping: short development cycles for fast user feedback
 Initial web-services based prototypes being tested with representatives from the application groups

6 Approach. Exploit experience and components from existing projects:
 AliEn, VDT, EDG, LCG, and others
Design team works out architecture and design:
 Architecture: https://edms.cern.ch/document/476451
 Design: https://edms.cern.ch/document/487871/
 Feedback and guidance from EGEE PTF, EGEE NA4, LCG GAG, LCG Operations, LCG ARDA
Components are initially deployed on a prototype infrastructure:
 Small scale (CERN & Univ. Wisconsin)
 Get user feedback on service semantics and interfaces
After internal integration and testing, components are to be deployed on the pre-production service.
[Diagram: gLite drawing on EDG, VDT, LCG, AliEn, …]

7 Subsystems/components. [Table mapping LCG-2 components to gLite services.] User Interface, AliEn, Computing Element, Worker Node, Workload Management System, Package Management, Job Provenance, Logging and Bookkeeping, Data Management, Information & Monitoring, Job Monitoring, Accounting, Site Proxy, Security, Fabric management.

8 Workload Management System

9 Computing Element
 Works in push or pull mode (a minimal sketch of the distinction follows below)
 Site policy enforcement
 Exploits the new Globus Gatekeeper and Condor-C (close interaction with the Globus and Condor teams)
Legend: CEA – Computing Element Acceptance; JC – Job Controller; MON – Monitoring; LRMS – Local Resource Management System.
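To make the push/pull distinction concrete, here is a minimal sketch in Python. All names are hypothetical illustrations; the real CE builds on the Globus Gatekeeper and Condor-C, not on this code.

```python
# Minimal sketch of a Computing Element that works in push or pull mode.
# All names here are hypothetical illustrations, not the real gLite API.

class ComputingElement:
    def __init__(self, lrms_submit):
        self.lrms_submit = lrms_submit  # hand-off to the local batch system (LRMS)

    def push(self, job):
        """Push mode: the workload manager sends a job to the CE."""
        if self._accept(job):
            self.lrms_submit(job)

    def pull(self, task_queue):
        """Pull mode: the CE asks a task queue for work matching its resources."""
        job = task_queue.fetch_matching(self._capabilities())
        if job is not None and self._accept(job):
            self.lrms_submit(job)

    def _accept(self, job):
        # Site policy enforcement (authorization, quotas) would live here,
        # corresponding to the CEA box in the slide's legend.
        return True

    def _capabilities(self):
        return {"os": "ScientificLinux", "cpus": 2}
```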

10 Data Management
 Scheduled data transfers (treated like jobs)
 Reliable file transfer (see the retry sketch below)
 Site self-consistency
 SRM-based storage
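The slide does not detail "reliable file transfer"; a common pattern it implies is a transfer treated as a retried, supervised job. The following sketch uses invented names, with a plain local copy standing in for the actual transfer mechanism (GridFTP in the prototype).

```python
# Hypothetical sketch: treating a data transfer like a job, with retries.
import time

def reliable_transfer(copy_fn, source, dest, max_retries=3, backoff=5.0):
    """Retry a transfer until it succeeds or the retry budget is spent."""
    for attempt in range(1, max_retries + 1):
        try:
            copy_fn(source, dest)  # a GridFTP copy in the real system
            return True
        except OSError as exc:
            print(f"attempt {attempt} failed: {exc}")
            time.sleep(backoff * attempt)  # simple linear backoff
    return False

# Usage, with a local copy standing in for GridFTP:
# import shutil
# reliable_transfer(shutil.copyfile, "/tmp/in.dat", "/tmp/out.dat")
```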

11 Storage Element Interfaces
SRM interface
 Management and control
 SRM 1.1 (with possible evolution)
POSIX-like File I/O
 File access
 Open, read, write
 Not real POSIX (like rfio); a sketch of the idea follows below
[Diagram: the user reaches the SE via Control (SRM interface) and File I/O (POSIX API); access protocols rfio, dcap, chirp and aio front Castor, dCache, NeST and plain disk back-ends.]
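To illustrate "POSIX-like but not real POSIX", here is a hypothetical wrapper in the spirit of gLite-I/O or rfio. Every name is invented; the point is only that the client sees open/read/write on a logical name while resolution to a physical replica happens underneath.

```python
# Hypothetical sketch of a posix-like file I/O client (open/read/write).
# All names are invented; the real gLite-I/O client library differs.

class GridFile:
    """Looks like a file object, but resolves a logical file name first."""

    def __init__(self, lfn, resolver, opener=open, mode="rb"):
        # resolver maps a logical file name to a local/physical path;
        # in gLite this step would involve the catalogs and SRM.
        self._fh = opener(resolver(lfn), mode)

    def read(self, size=-1):
        return self._fh.read(size)

    def write(self, data):
        return self._fh.write(data)

    def close(self):
        self._fh.close()

# Usage, with the resolver stubbed out for the sketch:
# f = GridFile("lfn:/grid/demo/data.bin", resolver=lambda lfn: "/tmp/data.bin")
```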

12 Catalogs
File Catalog
 Filesystem-like view on logical file names (LFNs)
 Keeps track of sites where data is stored
 Conflict resolution
Replica Catalog
 Keeps replica information at a site
Metadata Catalog
 Attributes of files on the logical level
 Boundary between generic middleware and application layer
[Diagram: the File Catalog maps LFN → GUID → site ID; the Replica Catalogs at Site A and Site B map GUID → SURL. Resolution is sketched below.]
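The two-level lookup in the diagram can be written down directly. This sketch uses invented data: the File Catalog maps an LFN to a GUID plus the sites holding replicas, and each site's Replica Catalog maps the GUID to its SURLs.

```python
# Hypothetical sketch of the two-level catalog lookup from the slide.
# All LFNs, GUIDs, sites and SURLs are invented example data.
file_catalog = {
    "lfn:/grid/demo/run42.root": ("guid-1234", ["SiteA", "SiteB"]),
}
replica_catalogs = {
    "SiteA": {"guid-1234": ["srm://se.sitea.org/store/run42.root"]},
    "SiteB": {"guid-1234": ["srm://se.siteb.org/data/run42.root"]},
}

def resolve(lfn):
    """Return every SURL for a logical file name."""
    guid, sites = file_catalog[lfn]
    surls = []
    for site in sites:
        surls.extend(replica_catalogs[site].get(guid, []))
    return surls

print(resolve("lfn:/grid/demo/run42.root"))  # both replicas' SURLs
```

Keeping per-site replica information in the site's own catalog is what makes the boundary clean: generic middleware resolves locations, while file attributes stay in the application-facing Metadata Catalog.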

13 Information and Monitoring
R-GMA for:
 Information system and system monitoring
 Application monitoring
No major changes in architecture
 But re-engineer and harden the system
Co-existence and interoperability with other systems is a goal
 E.g. MonALISA
[Diagram: D0 application monitoring — each job wrapper runs a Memory Primary Producer (MPP) feeding a Database Secondary Producer (DbSP). The producer/consumer idea is sketched below.]
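R-GMA presents monitoring data relationally: producers publish rows into a virtual table, consumers query it. The in-memory stand-in below only illustrates that model; the real R-GMA API is different.

```python
# Hypothetical sketch of the R-GMA producer/consumer model: producers
# publish rows into a virtual table, consumers run SQL-like queries.
rows = []  # the "virtual table", e.g. job monitoring records

def primary_producer_insert(job_id, status):
    """A primary producer (like the MPP in the slide) publishes a tuple."""
    rows.append({"job_id": job_id, "status": status})

def consumer_query(status):
    """Roughly: SELECT job_id FROM jobs WHERE status = ?"""
    return [r["job_id"] for r in rows if r["status"] == status]

primary_producer_insert("job-001", "RUNNING")
primary_producer_insert("job-002", "DONE")
print(consumer_query("RUNNING"))  # -> ['job-001']
```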

14 Security
[Diagram: pseudonymity on "the Grid". (1) Joe obtains Grid (X.509) credentials; (2) an optional Pseudonymity Service maps "Joe → Zyx"; (3) Joe's privileges are issued to Zyx; (4) Joe acts as "User=Zyx, Issuer=Pseudo CA". Building blocks: Credential Storage (myProxy), Attribute Authority (VOMS), Pseudonymity Service (tbd), GSI, LCAS/LCMAPS.]
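The slide names VOMS and myProxy as building blocks. As a small usage sketch: voms-proxy-init is the standard VOMS client command for obtaining a proxy with VO attributes; the VO name "myvo" below is a placeholder, and exact flags may vary by version.

```python
# Sketch: obtaining a VOMS-extended proxy certificate, driven from Python.
# voms-proxy-init is the standard VOMS client command; "myvo" is a placeholder.
import subprocess

subprocess.run(["voms-proxy-init", "--voms", "myvo"], check=True)
# The resulting proxy carries the user's VO attributes, which site-side
# components such as LCAS/LCMAPS can use for authorization decisions.
```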

15 GAS & Package Manager
Grid Access Service (GAS)
 Discovers and manages services on behalf of the user (see the sketch below)
 File and metadata catalogs already integrated
Package Manager
 Provides application software at the execution site
 Based upon existing solutions
 Details being worked out together with the experiments and operations
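The slide gives no interface for the GAS; as a loose illustration of "discovers services on behalf of the user", a single entry point might hand out concrete endpoints like this. All names and endpoints are invented.

```python
# Hypothetical sketch of the Grid Access Service idea: one user-facing
# entry point that locates concrete service endpoints. All names and
# endpoints below are invented for illustration.

class GridAccessService:
    def __init__(self):
        self._registry = {
            "file-catalog": "https://example.org/catalog",
            "metadata-catalog": "https://example.org/metadata",
        }

    def locate(self, service_type):
        """Return an endpoint for the requested service type."""
        return self._registry[service_type]

gas = GridAccessService()
print(gas.locate("file-catalog"))
```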

16 Current Prototype
WMS: AliEn TaskQueue, EDG WMS, EDG L&B (CNAF)
CE (CERN, Wisconsin): Globus Gatekeeper, Condor-C, PBS/LSF, "pull component" (AliEn CE)
WN: 23 at CERN + 1 at Wisconsin
SE (CERN, Wisconsin): external SRM implementations (dCache, Castor), gLite-I/O
Catalogs (CERN): AliEn FileCatalog, RLS (EDG), gLite Replica Catalog
Data Scheduling (CERN): File Transfer Service (Stork)
Data Transfer (CERN, Wisconsin): GridFTP
Metadata Catalog (CERN): simple interface defined
Information & Monitoring (CERN, Wisconsin): R-GMA
Security: VOMS (CERN), myProxy, grid-mapfile and GSI security
User Interface (CERN & Wisconsin): AliEn shell, CLIs and APIs, GAS
Package Manager: prototype based on the AliEn PM

17 Summary and plans
 Most Grid systems (including LCG-2) are oriented towards batch-job production; gLite addresses distributed analysis
 Most likely the two will co-exist, at least for a while
 A prototype exists, and new services are being added: dynamic accounts, gLite CEMon, Globus RLS, File Placement Service, Data Scheduler, fine-grained authorization, accounting, …
 A pre-production testbed is being set up: more sites, tested/stable services
 First release due end of March 2005: functionality freeze at Christmas; intense integration and testing from January to March 2005
 2nd release candidate: November 2005 (May: revised architecture document; June: revised design document)

