Middleware Development and Deployment Status


1 Middleware Development and Deployment Status
Tony Doyle 9 November 2004 PPE & PPT Lunchtime Talk

2 Contents
What are the Challenges?
What is the scale?
How does the Grid work?
What is the status of (EGEE) middleware development?
What is the deployment status?
What is GridPP doing as part of the International effort?
What was GridPP1?
Is GridPP a Grid?
What is planned for GridPP2?
What lies ahead?
Summary: Why? What? How? When?

3 Science generates data and might require a Grid?
Earth Observation; Bioinformatics; Astronomy; Digital Curation; Healthcare; ?; Collaborative Engineering

4 What are the challenges?
Must:
share data between thousands of scientists with multiple interests
link major (Tier-0 [Tier-1]) and minor (Tier-1 [Tier-2]) computer centres
ensure all data accessible anywhere, anytime
grow rapidly, yet remain reliable for more than a decade
cope with different management policies of different centres
ensure data security
be up and running routinely by 2007

5 What are the challenges? Data Management, Security and Sharing
1. Software process 2. Software efficiency 3. Deployment planning 4. Link centres 5. Share data 6. Manage data 7. Install software 8. Analyse data 9. Accounting 10. Policies

6 Tier-1 Scale
Step-1: financial planning
Step-2: compare to (e.g. Tier-1) experiment requirements
Step-3: conclude that more than one centre is needed
Step-4: a Grid?
Ian Foster / Carl Kesselman: "A computational Grid is a hardware and software infrastructure that provides dependable, consistent, pervasive and inexpensive access to high-end computational capabilities."
Currently network performance doubles every year (or so) for unit cost.
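The remark that network performance doubles roughly every year for unit cost can be turned into a quick projection. The sketch below is illustrative only: the function name and the alternative 1.5-year doubling time are assumptions, not figures from the talk. Under exact annual doubling, the three years from 2004 to 2007 give a factor of 8.

```python
def growth_factor(years, doubling_time_years=1.0):
    """Performance multiplier per unit cost after `years`,
    assuming one doubling every `doubling_time_years` (an assumption)."""
    return 2.0 ** (years / doubling_time_years)


if __name__ == "__main__":
    # 2004 -> 2007 with the slide's "doubles every year (or so)"
    print(growth_factor(3))
    # A slower, hypothetical doubling time of 18 months, for comparison
    print(growth_factor(3, doubling_time_years=1.5))
```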

7 What is the Grid? Hour Glass
I. Experiment Layer, e.g. Portals
II. Application Middleware, e.g. Metadata
III. Grid Middleware, e.g. Information Services
IV. Facilities and Fabrics, e.g. Storage Services

8 How do I start? http://www.gridpp.ac.uk/start/
Getting started as a Grid user
Quick start guide for LCG2: GridPP guide to starting as a user of the Large Hadron Collider Computing Grid.
Getting an e-science certificate: In order to use the Grid you need a Grid certificate. This page introduces the UK e-Science Certification Authority, which issues certificates to users. You can get a certificate from here.
Using the LHC Computing Grid (LCG): CERN's guide on the steps you need to take in order to become a user of the LCG. This includes contact details for support.
LCG user scenario: This describes in a practical way the steps a user has to follow to send and run jobs on LCG and to retrieve and process the output successfully. Currently being improved.

9 Job Submission (behind the scenes)
[Diagram: job submission flow. Components: UI (JDL, grid-proxy-init); Resource Broker; Replica Catalogue; Information Service; Job Submission Service; Logging & Book-keeping; Storage Element; Compute Element. Labelled flows: Input “sandbox”; DataSets info; SE & CE info; Publish; Author. & Authen.; Expanded JDL; Input “sandbox” + Broker Info; Globus RSL; Job Submit Event; Job Query; Job Status; Output “sandbox”.]
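The flow on this slide starts at the UI with a JDL (Job Description Language) file and an input “sandbox”. As a rough illustration, the Python sketch below assembles a minimal JDL text. The helper name `make_jdl` and the file names are hypothetical; the attribute names (Executable, Arguments, StdOutput, StdError, InputSandbox, OutputSandbox) follow the commonly documented EDG/LCG-2 JDL style, and the exact set accepted by a given Resource Broker is an assumption here.

```python
def make_jdl(executable, arguments="", input_sandbox=(),
             stdout="std.out", stderr="std.err"):
    """Build a minimal JDL description as text (illustrative helper only)."""

    def quoted_list(items):
        # JDL lists are written as {"a", "b"}
        return "{" + ", ".join('"%s"' % item for item in items) + "}"

    lines = ['Executable = "%s";' % executable]
    if arguments:
        lines.append('Arguments = "%s";' % arguments)
    lines.append('StdOutput = "%s";' % stdout)
    lines.append('StdError = "%s";' % stderr)
    if input_sandbox:
        # Files shipped with the job: the input "sandbox" on the slide
        lines.append("InputSandbox = %s;" % quoted_list(input_sandbox))
    # Files returned to the user: the output "sandbox" on the slide
    lines.append("OutputSandbox = %s;" % quoted_list([stdout, stderr]))
    return "\n".join(lines)


if __name__ == "__main__":
    print(make_jdl("/bin/hostname"))
```

In the flow shown, a description like this is handed to the Resource Broker after grid-proxy-init has created a proxy credential; the broker expands it using Information Service and Replica Catalogue data before passing the job on.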

10 Enabling Grids for E-sciencE
Deliver a 24/7 Grid service to European science:
build a consistent, robust and secure Grid network that will attract additional computing resources.
continuously improve and maintain the middleware in order to deliver a reliable service to users.
attract new users from industry as well as science and ensure they receive the high standard of training and support they need.
100 million euros over 4 years, funded by EU
>400 software engineers + service support
70 European partners

11 Prototype Middleware Status & Plans (I)
Workload Management: AliEn TaskQueue; EDG WMS (plus new TaskQueue and Information Supermarket); EDG L&B
Computing Element: Globus Gatekeeper + LCAS/LCMAPS; Dynamic accounts (from Globus); CondorC; Interfaces to LSF/PBS (blahp); “Pull components”; AliEn CE; gLite CEmon (being configured)
Blue: deployed on development testbed. Red: proposed.
LHCC Comprehensive Review – November

12 Prototype Middleware Status & Plans (II)
Storage Element: Existing SRM implementations (dCache, Castor, … FNAL & LCG DPM); gLite-I/O (re-factored AliEn-I/O)
Catalogs: AliEn FileCatalog – global catalog; gLite Replica Catalog – local catalog; Catalog update (messaging); FiReMan Interface; RLS (globus)
Data Scheduling: File Transfer Service (Stork+GridFTP); File Placement Service; Data Scheduler
Metadata Catalog: Simple interface defined (AliEn+BioMed)
Information & Monitoring: R-GMA web service version; multi-VO support

13 Prototype Middleware Status & Plans (III)
Security: VOMS as Attribute Authority and VO mgmt; myProxy as proxy store; GSI security and VOMS attributes as enforcement; fine-grained authorization (e.g. ACLs); globus to provide a set-uid service on CE
Accounting: EDG DGAS (not used yet)
User Interface: AliEn shell; CLIs and APIs; GAS; Catalogs; Integrate remaining services
Package manager: Prototype based on AliEn backend; evolve to final architecture agreed with ARDA team

14 PPE & PPT Lunchtime Talk
[Organisation chart. Labels: CB; PMB; Deployment Board; User Board; Tier1/Tier2, Testbeds, Rollout; Service specification & provision; Requirements; Application Development; User feedback; Metadata; Storage; Workload; Network; Security; Info. Mon.; ARDA; Expmts; EGEE; LCG]

15 Middleware Development
Network Monitoring; Configuration Management; Grid Data Management; Storage Interfaces; Information Services; Security

16 Application Development
ATLAS; LHCb; CMS; BaBar (SLAC); SAMGrid (FermiLab); QCDGrid; PhenoGrid

17 GridPP Deployment Status
GridPP deployment is part of LCG (currently the largest Grid in the world)
The future Grid in the UK is dependent upon LCG releases
Three Grids on Global scale in HEP (similar functionality):
                sites     CPUs
LCG (GridPP)    90 (15)   8700 (1500)
Grid3 [USA]
NorduGrid

18 LCG Overview
By 2007: 100,000 CPUs
More than 100 institutes worldwide
building on complex middleware being developed in advanced Grid technology projects, both in Europe (gLite) and in the USA (VDT)
prototype went live in September 2003 in 12 countries
Extensively tested by the LHC experiments during this summer

19 Deployment Status (26/10/04)
Incremental releases: significant improvements in reliability, performance and scalability
within the limits of the current architecture
scalability is much better than expected a year ago
Many more nodes and processors than anticipated
installation problems of last year overcome
many small sites have contributed to MC productions
Full-scale testing as part of this year’s data challenges
GridPP “The Grid becomes a reality” – widely reported: British Embassy (USA); Technology Sites; British Embassy (Russia)

20 Data Challenges
Ongoing.. Grid and non-Grid Production
Grid now significant
ALICE - 35 CPU Years: Phase 1 done; Phase 2 ongoing
LCG
CMS - 75 M events and 150 TB: first of this year’s Grid data challenges
Entering Grid Production Phase..

21 Data Challenge
7.7 M GEANT4 events and 22 TB
UK ~20% of LCG
Ongoing.. (3) Grid Production
~150 CPU years so far
Largest total computing requirement
Small fraction of what ATLAS need..
Entering Grid Production Phase..

22 LHCb Data Challenge
424 CPU years (4,000 kSI2k months), 186M events
UK’s input significant (>1/4 total)
LCG(UK) resource: Tier % Tier-2 sites: London 3.9%; South 2.3%; North 1.4%
DIRAC: Imperial 2.0%; L'pool 3.1%; Oxford 0.1%; ScotGrid 5.1%
Entering Grid Production Phase..
[Plot: produced events, 186 M in total. Annotations: DIRAC alone; LCG in action, /day; LCG paused; Phase 1 Completed; /day; restarted]

23 Paradigm Shift Transition to Grid…
424 CPU · Years
May: 89%:11%, 11% of DC’04
Jun: 80%:20%, 25% of DC’04
Jul: 77%:23%, 22% of DC’04
Aug: 27%:73%, 42% of DC’04
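The monthly figures can be combined into an overall share for the data challenge. The sketch below assumes that each ratio reads DIRAC-alone : LCG (the slide does not label the two sides, so this is an assumption based on the "transition to Grid" title) and that "% of DC’04" is each month's weight in the 424 CPU-year total. Under that reading the weighted LCG share comes out at roughly 42%.

```python
# (month, assumed DIRAC-alone %, assumed LCG %, month's % of DC'04)
MONTHS = [
    ("May", 89, 11, 11),
    ("Jun", 80, 20, 25),
    ("Jul", 77, 23, 22),
    ("Aug", 27, 73, 42),
]

TOTAL_CPU_YEARS = 424  # from the slide


def weighted_lcg_share(months):
    """Average the per-month LCG fraction, weighted by each month's
    share of the whole data challenge."""
    total_weight = sum(weight for _, _, _, weight in months)
    weighted = sum(lcg * weight for _, _, lcg, weight in months)
    return weighted / (100.0 * total_weight)


if __name__ == "__main__":
    share = weighted_lcg_share(MONTHS)
    print("LCG share of DC'04: %.1f%%" % (100 * share))
    print("CPU years on LCG: %.0f" % (share * TOTAL_CPU_YEARS))
```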

24 More Applications
ZEUS uses LCG
needs the Grid to respond to increasing demand for MC production
5 million Geant events on Grid since August 2004
QCDGrid
For UKQCD
Currently a 4-site data grid
Key technologies used: Globus Toolkit 2.4; European DataGrid; eXist XML database
managing a few hundred gigabytes of data

25 Issues
First large-scale Grid production problems being addressed… at all levels
“LCG-2 MIDDLEWARE PROBLEMS AND REQUIREMENTS FOR LHC EXPERIMENT DATA CHALLENGES”

26 Is GridPP a Grid?
Coordinates resources that are not subject to centralized control
… using standard, open, general-purpose protocols and interfaces
… to deliver nontrivial qualities of service
YES. This is why development and maintenance of LCG is important.
VDT (Globus/Condor-G) + EDG/EGEE (gLite) ~meet this requirement.
LHC experiments data challenges over the summer of 2004.

27 What was GridPP1?
A team that built a working prototype grid of significant scale:
> 1,500 (7,300) CPUs
> 500 (6,500) TB of storage
> 1000 (6,000) simultaneous jobs
A complex project where 82% of the 190 tasks for the first three years were completed
A Success: “The achievement of something desired, planned, or attempted”

28 Aims for GridPP2? From Prototype to Production
[Diagram: progression from prototype to production. Years: 2001; 2004; 2007. Stages: Separate Experiments, Resources, Multiple Accounts; Prototype Grids; 'One' Production Grid. Projects and experiments: BaBar; BaBarGrid; CDF; D0; SAMGrid; ATLAS; LHCb; GANGA; ALICE; CMS; EDG; EGEE; ARDA; LCG. Centres: CERN Computer Centre; CERN Prototype Tier-0 Centre; CERN Tier-0 Centre; RAL Computer Centre; UK Prototype Tier-1/A Centre; UK Tier-1/A Centre; 19 UK Institutes; 4 UK Prototype Tier-2 Centres; 4 UK Tier-2 Centres]

29 Planning: GridPP2 ProjectMap
Structures agreed and in place (except LCG phase-2)

30 What lies ahead? Some mountain climbing..
Annual data storage: 12-14 PetaBytes per year
CD stack with 1 year LHC data (~ 20 km); Concorde (15 km); We are here (1 km)
100 Million SPECint2000; 100,000 PCs (3 GHz Pentium 4)
Importance of step-by-step planning… Pre-plan your trip, carry an ice axe and crampons and arrange for a guide…
In production terms, we’ve made base camp
Quantitatively, we’re ~9% of the way there in terms of CPU (9,000 ex 100,000) and disk (3 ex 12-14*3 years)…
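The scale figures on this slide can be checked with a few lines of arithmetic. The sketch below is a back-of-envelope check, not the author's calculation: the CD capacity (700 MB) and thickness (1.2 mm) are assumptions, as is reading the disk figure "3" in petabytes. With those values the CD stack comes out at roughly 20 to 24 km, the CPU fraction at 9%, and the disk fraction slightly lower at about 7 to 8%.

```python
PB = 1e15                     # bytes per petabyte (decimal)
CD_CAPACITY_BYTES = 700e6     # assumption: 700 MB per CD
CD_THICKNESS_M = 1.2e-3       # assumption: 1.2 mm per CD


def fraction_done(current, target):
    """Fraction of the target already reached."""
    return current / target


def cd_stack_height_km(petabytes):
    """Height of a stack of CDs holding `petabytes` of data."""
    n_cds = petabytes * PB / CD_CAPACITY_BYTES
    return n_cds * CD_THICKNESS_M / 1000.0


if __name__ == "__main__":
    # CPU: 9,000 of 100,000 (from the slide)
    print("CPU fraction:", fraction_done(9000, 100000))
    # Disk: 3 (assumed PB) of 12-14 PB/year over 3 years
    for per_year in (12, 14):
        print("Disk fraction:", fraction_done(3, per_year * 3))
    # One year of LHC data on CDs
    for per_year in (12, 14):
        print("CD stack (km):", cd_stack_height_km(per_year))
```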

31 1. Why? 2. What? 3. How? 4. When?
From Particle Physics perspective the Grid is:
1. needed to utilise large-scale computing resources efficiently and securely
2. a) a working prototype running today on large testbed(s)…
   b) about seamless discovery of computing resources
   c) using evolving standards for interoperation
   d) the basis for computing in the 21st Century
   e) not (yet) as transparent or robust as end-users need
3. see the GridPP getting started pages (two-day EGEE training courses available)
4. a) Now, at prototype level, for simple(r) applications (e.g. experiment Monte Carlo production)
   b) September 2007 for more complex applications (e.g. data analysis) – ready for LHC

