DataGrid is a project funded by the European Commission under contract IST-2000-25182
GridPP-2 Middleware, 4th-5th Mar 2004: Information and Monitoring


1 DataGrid is a project funded by the European Commission under contract IST-2000-25182
GridPP-2 Middleware, 4th-5th Mar 2004
Information and Monitoring WP3
Based on talk at EU review – but with a UK slant.
Steve Fisher, RAL, s.m.fisher@rl.ac.uk

2 WP3 - n° 2 Outline
- Objectives of WP3
- Achievements
- Lessons Learned
- Future Work
- Exploitation Plans

3 WP3 - n° 3 Objectives of WP3 (TA)
- To provide a system or systems able to meet all the information and monitoring needs within a Grid, from resource discovery to application monitoring

4 WP3 - n° 4 Objectives (D3.2) – part 1
- Base on the Grid Monitoring Architecture (GMA) from the GGF
  - Very simple model
- For Relational Grid Monitoring Architecture (R-GMA): hide Registry mechanism from the user
  - Producer registers on behalf of user
  - Mediator (in Consumer) transparently selects the correct Producer(s) to answer a query
- Use relational model (R of R-GMA)
  - Facilitate expression of queries over all the published information
- Users just think in terms of Producers and Consumers
[Diagram: Producer, Registry/Schema, Consumer]
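The Producer/Registry/Consumer pattern on this slide can be illustrated with a minimal in-memory sketch. It is not the R-GMA API: the class names, `make_producer`, the `cpu_load` table and the host names are all hypothetical, and a Python predicate stands in for a relational query. It only shows the idea that the producer is registered on the user's behalf and that a mediator step inside the consumer finds the relevant producers.

```python
from dataclasses import dataclass, field


@dataclass
class Producer:
    """Publishes tuples (rows as dicts) for one table. Hypothetical class."""
    name: str
    table: str
    rows: list = field(default_factory=list)

    def insert(self, row):
        self.rows.append(row)


class Registry:
    """Records which producers publish which table."""

    def __init__(self):
        self._by_table = {}

    def register(self, producer):
        self._by_table.setdefault(producer.table, []).append(producer)

    def lookup(self, table):
        return list(self._by_table.get(table, []))


class Consumer:
    """Answers a query; the user never talks to the registry directly."""

    def __init__(self, registry):
        self._registry = registry

    def query(self, table, predicate=lambda row: True):
        # Mediator step: ask the registry who publishes this table,
        # then merge the matching rows from every such producer.
        results = []
        for producer in self._registry.lookup(table):
            results.extend(r for r in producer.rows if predicate(r))
        return results


def make_producer(registry, name, table):
    """Create a producer and register it on the user's behalf."""
    producer = Producer(name, table)
    registry.register(producer)
    return producer


registry = Registry()
site_a = make_producer(registry, "site_a", "cpu_load")
site_b = make_producer(registry, "site_b", "cpu_load")
site_a.insert({"host": "a.example", "load": 0.4})
site_b.insert({"host": "b.example", "load": 1.7})

busy = Consumer(registry).query("cpu_load", lambda r: r["load"] > 1.0)
print(busy)
```

The user code touches only `make_producer`, `insert` and `query`, which mirrors the "users just think in terms of Producers and Consumers" point.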

5 WP3 - n° 5 Objectives (D3.2) – part 2
- Following the GMA, the system should offer one-off and streaming queries
- Ensure that all records carry a timestamp, so that all information can be used for monitoring
- Highly scalable
- No single point of failure
- Dynamic schema mechanism to make it easy for applications to publish information
- Fine-grained authorisation mechanism
- Ability to deal with very high rates of data to monitor the performance of parallel jobs
- Interoperation with other information systems, e.g. MDS
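The difference between a one-off query and a streaming query over timestamped records can be sketched as follows. This is an illustration only, using Python's standard `sqlite3` module rather than R-GMA; the `job_status` table, the `publish` helper and the job names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Every record carries a timestamp column (ts), as the slide requires.
conn.execute("CREATE TABLE job_status (job_id TEXT, state TEXT, ts REAL)")

subscribers = []  # callbacks standing in for streaming queries


def publish(job_id, state, ts):
    """Store a timestamped tuple and push it to streaming subscribers."""
    row = (job_id, state, ts)
    conn.execute("INSERT INTO job_status VALUES (?, ?, ?)", row)
    for callback in subscribers:
        callback(row)


def latest_states():
    """One-off query: the most recent tuple for each job."""
    return conn.execute(
        "SELECT j.job_id, j.state, j.ts FROM job_status j "
        "JOIN (SELECT job_id, MAX(ts) AS ts FROM job_status GROUP BY job_id) m "
        "ON j.job_id = m.job_id AND j.ts = m.ts ORDER BY j.job_id"
    ).fetchall()


streamed = []


def on_failure(row):
    """Streaming query: keep only tuples whose state is FAILED."""
    if row[1] == "FAILED":
        streamed.append(row)


subscribers.append(on_failure)

publish("job1", "RUNNING", 100.0)
publish("job2", "RUNNING", 101.0)
publish("job1", "DONE", 105.0)
publish("job2", "FAILED", 106.0)

print(latest_states())  # answered once, from stored tuples
print(streamed)         # delivered as matching tuples were published
```

Because each tuple has a timestamp, the same table serves both as a "current state" information source and as a monitoring history.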

6 WP3 - n° 6 Achievements
- Since December the R-GMA code has become very much more stable, as recorded in D3.6
- We achieved what we set out to do by the end of the project
- Unfortunately much of the experience of the Application Work Packages has been with the earlier versions

7 WP3 - n° 7 Achievements
- We have successfully challenged the conventional wisdom on information and monitoring services on the Grid and produced a system that the user community is keen to use
- The main product is R-GMA, which does treat the whole area of information and monitoring as a single coordinated system
- Tools have been developed to allow R-GMA to interoperate with other systems:
  - GIN/GOUT – for compatibility with MDS
  - Nagios integration for displays and alerts
  - Ranglia (ganglia integration) to allow R-GMA access to ganglia
- We have a new version of GRM, which is integrated with the GridLab Mercury monitor for performance monitoring of parallel applications. This combines the flexibility of R-GMA with the performance of the Mercury monitor
SZTAKI, INFN

8 WP3 - n° 8 Users
- WP1, 2, 4 and 5 for MDS-like use
- WP8: CMS for BOSS – for job monitoring
- WP8: D0 – similar job monitoring
- WP7 – Network monitoring
- LCG – Trying it for accounting information

9 WP3 - n° 9 BOSS with R-GMA
- Simulation MC jobs
  - Each job creates a stream producer publishing a number of tuples depending on the job phase
- An Archiver collects tuples
  - Archived tuples are checked
- Results
  - Early 2003 fell over at around 10 jobs!
  - October 2003 managed about 400
    - Reported at IEEE NSS Conference, Oregon, USA, 21-24 October 2003
  - Now 2 MC simulations, each running 4000 jobs
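The test pattern described on this slide (many jobs each publishing tuples, one archiver collecting them, then a check that nothing was lost) can be sketched in a few lines. This is a stand-in built on Python threads and a `queue.Queue`, not BOSS or R-GMA code; the phase names, the job count of 50 and the function names are assumptions for illustration.

```python
import queue
import threading

PHASES = ["submitted", "running", "finished"]  # hypothetical job phases


def run_job(job_id, stream):
    """Stand-in for one MC job: publish one tuple per phase."""
    for seq, phase in enumerate(PHASES):
        stream.put((job_id, seq, phase))


def archive(stream, n_expected):
    """Archiver: collect tuples from all jobs into one list."""
    archived = []
    for _ in range(n_expected):
        archived.append(stream.get(timeout=5))
    return archived


def missing_tuples(archived, n_jobs):
    """Check: which (job_id, seq) pairs never reached the archive?"""
    seen = {(job_id, seq) for job_id, seq, _ in archived}
    expected = {(j, s) for j in range(n_jobs) for s in range(len(PHASES))}
    return sorted(expected - seen)


n_jobs = 50
stream = queue.Queue()
threads = [threading.Thread(target=run_job, args=(j, stream)) for j in range(n_jobs)]
for t in threads:
    t.start()
archived = archive(stream, n_jobs * len(PHASES))
for t in threads:
    t.join()

print(len(archived), missing_tuples(archived, n_jobs))
```

Scaling `n_jobs` up until `missing_tuples` is non-empty (or the archiver times out) is the kind of stress measurement the results on the slide summarise.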

10 WP3 - n° 10 Application testbed – performance
Table columns: URI | Service Status OK (%) | data age < 300 seconds (%) | timed-out queries (%) | response time < 10 seconds (%)
Rows (values in the order they appear in the transcript):
- http://gw22.hep.ph.ic.ac.uk:8080/R-GMA/LatestProducerServlet: 1000 98
- http://tbn08.nikhef.nl:8080/R-GMA/LatestProducerServlet: 1000 98
- ldap://tbn08.nikhef.nl:2169/Mds-vo-name=local%2Co=grid: 99097-
- http://heplnx30.pp.rl.ac.uk/R-GMA/StreamProducerServletSummary: -2-97
- ldap://gw22.hep.ph.ic.ac.uk:2169/Mds-vo-name=local%2Co=grid: 98 (response time), 0 (timed-out queries), - (data age), 100 (Service Status OK)
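The four percentages in this table could be derived from a log of periodic probes roughly as sketched below. The slide does not say how the testbed monitoring computed them, so the `Probe` record, the choice of denominators (all probes, except data age, which uses only probes that returned a timestamp) and the sample values are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Probe:
    """One monitoring probe of a service. Hypothetical record layout."""
    ok: bool                     # service answered correctly
    response_s: Optional[float]  # None if the query timed out
    data_age_s: Optional[float]  # None if the service gives no timestamp


def percent(count, total):
    """Rounded percentage, or None when there is nothing to count."""
    return round(100.0 * count / total) if total else None


def summarise(probes):
    n = len(probes)
    aged = [p for p in probes if p.data_age_s is not None]
    return {
        "status_ok_pct": percent(sum(p.ok for p in probes), n),
        "data_age_lt_300s_pct": percent(
            sum(p.data_age_s < 300 for p in aged), len(aged)),
        "timed_out_pct": percent(
            sum(p.response_s is None for p in probes), n),
        "response_lt_10s_pct": percent(
            sum(p.response_s is not None and p.response_s < 10 for p in probes), n),
    }


# Made-up sample probes, not measurements from the testbed.
sample = [
    Probe(True, 1.2, 30.0),
    Probe(True, 12.0, 400.0),
    Probe(False, None, None),
    Probe(True, 0.8, 10.0),
]
summary = summarise(sample)
print(summary)
```

A service that publishes no timestamp yields `None` for the data-age column, which corresponds to the "-" entries in the table.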

11 WP3 - n° 11 Lessons learned
- Release working code early
- Distributed Software System testing is hard
  - We learned the tremendous value of a private WP3 testbed. While some problems only appear with real users, we were able to detect most problems on our own testbed
  - We have recently added code to stress R-GMA on the WP3 testbed. This shows up problems which previously only showed up on the application testbed
- Automate as much as possible
  - We have made use of an open source product, CruiseControl, to rebuild and test software whenever people check in
    - Most of the time, most people run most of the tests
    - CruiseControl always runs all of them

12 WP3 - n° 12 More Lessons - New
- Need for visibility at CERN
- Need customers
  - To develop applications
    - And support us
  - Must support customers
    - Quite a lot of effort
  - Need to understand their needs
    - Probably failed to make best “use” of Stephen Burke
    - Note that XP says to have a customer rep as part of the team

13 WP3 - n° 13 Future
- Functionality
  - Improve Virtual Organisation (VO) support so that each VO only sees its own information
  - Need multiple physical registries for performance and for resilience
  - Implement fine-grained authorisation
  - The mediator should support a broader range of queries
- Packaging
  - Web services will be the base grid technology for the next few years, so it is essential that all WP3 software be migrated to Web services
    - Prototypes already written
  - The portability of the system will be improved
    - Need to make it easy to install “anywhere on any platform”

14 WP3 - n° 14 Exploitation – R-GMA in GGF
- The inclusion of GMA concepts in OGSA will be very beneficial to OGSA and to the widespread acceptance of R-GMA
- Have submitted documents to the OGSA working group of the GGF to explain how GMA fits into OGSA
  - We bring an implementation: R-GMA
  - Participated in phone meetings with the OGSA-WG discussing these documents
  - Attended face-to-face meeting of the OGSA-WG in San Diego last week
New – “Information Dissemination” BOF at GGF10
We will get involved – and get OGSA link

15 WP3 - n° 15 Exploitation – R-GMA in EGEE
- R-GMA will be hardened within EGEE
  - To provide highly robust, production quality, web services
- University-based research groups should take the ideas forward
  - Seeking collaborations to make this happen
  - Once the direction is established, those working on EGEE can produce the necessary production quality code
- Continued involvement of the user community will bring new requirements – and a further improved product

16 WP3 - n° 16 Exploitation – R-GMA in the world
- To increase visibility and to provide a focus for our users, a web site (http://www.r-gma.org/) has been constructed
- Once the system is repackaged to make it easy to build and configure on most platforms, and with clear documentation, we anticipate a good take-up

17 WP3 - n° 17 Summary
- We did what we said we would
- Users like R-GMA
  - An expanding community
- Now the challenge is to make it a big success worldwide

