INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org Operations Parallel Session Summary Markus Schulz CERN IT/GD Joint OSG and EGEE Operations.

1 INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org Operations Parallel Session Summary Markus Schulz CERN IT/GD Joint OSG and EGEE Operations Workshop - 3 Abingdon, 27 - 29 September 2005

2 Main Topics
Metrics
Integration of new services

3 Metrics
OSG and EGEE/LCG are producing “weather reports”
Assessment of the grid quality based on tests used in operations (core functionality)
Progress since last workshop:
–EGEE:
   -> First shot at an implementation
      Based on SFT
      Doesn’t cover all central services ----> created a list
   -> Work group (ROC managers) has produced a wish list
      Needs to be synchronized with practical work
–OSG:
   -> Metrics and Goals - Miron Livny
–Problem: How to decide what is critical ----> VOs

4 EGEE Practical Work
[Diagram labels: Every Hour, Every day; CE, Region, Grid; Weekly, Monthly, Quarterly]
Prototype metrics report: https://lcg-sft.cern.ch:9443/sft/metrics.html
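The time and scope levels named on this slide (hourly and daily tests; CE, region, grid) suggest a roll-up of test results from individual CEs to the whole grid. The following is a minimal sketch of such a roll-up, not the SFT implementation: the record layout, the region labels, the host names, and the use of a plain average are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical hourly test results: (day, hour, region, ce, passed).
# Layout, region labels, and host names are assumptions for illustration only.
HOURLY_RESULTS = [
    ("2005-09-27", 0, "UKI", "ce1.example.org", True),
    ("2005-09-27", 1, "UKI", "ce1.example.org", False),
    ("2005-09-27", 0, "UKI", "ce2.example.org", True),
    ("2005-09-27", 1, "UKI", "ce2.example.org", True),
    ("2005-09-27", 0, "CERN", "ce3.example.org", True),
    ("2005-09-27", 1, "CERN", "ce3.example.org", True),
]


def daily_ce_pass_rate(results):
    """Fraction of hourly tests passed, keyed by (day, region, ce)."""
    buckets = defaultdict(list)
    for day, _hour, region, ce, passed in results:
        buckets[(day, region, ce)].append(1.0 if passed else 0.0)
    return {key: mean(values) for key, values in buckets.items()}


def roll_up(ce_rates):
    """Average the per-CE daily rates per (day, region) and per day (grid)."""
    by_region = defaultdict(list)
    by_grid = defaultdict(list)
    for (day, region, _ce), rate in ce_rates.items():
        by_region[(day, region)].append(rate)
        by_grid[day].append(rate)
    region_rates = {key: mean(values) for key, values in by_region.items()}
    grid_rates = {key: mean(values) for key, values in by_grid.items()}
    return region_rates, grid_rates


if __name__ == "__main__":
    ce_rates = daily_ce_pass_rate(HOURLY_RESULTS)
    region_rates, grid_rates = roll_up(ce_rates)
    print(ce_rates)
    print(region_rates)
    print(grid_rates)
```

Weekly, monthly, and quarterly figures could be derived the same way by grouping the daily rates on a coarser time key. Whether CEs should be weighted equally is a design choice the slide does not settle.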

5 Graphs

6 Next Practical Steps
Defined the critical services
Guide for target definition: LCG MOU
Central Services
–Resource Broker
   -> David Kant can adapt his RB mon
–CE
–MyProxy
–BDII
   -> Gstat has components to provide this (Min?)
–R-GMA
   -> Analysis from logfiles (gridView team)
–LFC
   -> Indirect by SFT, now each local and VO specific (SC team)
–FTS
   -> No probes available and complex
–SRM
   -> Data management tests at higher frequency (David Kant)

7 Integration Of New Services
Triggered by LCG SC 3 experience
EGEE goal: All services are under COD operations!
OSG has a defined process
–Wiki page to follow progress
[OSG process diagram labels: Deployment Activity, Integration Test Bed, Provisioning, Blueprint (ARCH), Release Description, Technical Groups, VO’s, Service Development (Sponsored Activities), ITB 0.3, Operations, OSG 0.4, Release Candidate]

8 Ticklist for new service
User support procedures (GGUS)
–Troubleshooting guides + FAQs
–User guides
Operations Team Training
–Site admins
–CIC personnel
–GGUS personnel
Monitoring
–Service status reporting
–Performance data
Accounting
–Usage data
Service Parameters
–Scope - Global/Local/Regional
–SLAs
–Impact of service outage
–Security implications
Contact Info
–Developers
–Support Contact
–Escalation procedure to developers
Interoperation
–???
First level support procedures
–How to start/stop/restart service
–How to check it’s up
–Which logs are useful to send to CIC/Developers
   -> and where they are
SFT Tests
–Client validation
–Server validation
–Procedure to analyse these
   -> error messages and likely causes
Tools for CIC to spot problems
–GIIS monitor validation rules (e.g. only one “global” component)
–Definition of normal behaviour
   -> Metrics
CIC Dashboard
–Alarms
Deployment Info
–RPM list
–Configuration details (for yaim)
–Security audit
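One way to make such a ticklist checkable is to hold it as data and report which items are still open for a given service. The following is a minimal sketch under stated assumptions: the category and item names are copied from the slide (only the first four categories are shown), while the dictionary layout and the functions `missing_items` and `is_ready` are hypothetical and not part of any EGEE or OSG tool.

```python
# Categories and items copied from the slide (first four categories only).
# The dict layout and both functions are hypothetical, for illustration.
TICKLIST = {
    "User support procedures (GGUS)": [
        "Troubleshooting guides + FAQs",
        "User guides",
    ],
    "Operations Team Training": [
        "Site admins",
        "CIC personnel",
        "GGUS personnel",
    ],
    "Monitoring": [
        "Service status reporting",
        "Performance data",
    ],
    "Accounting": [
        "Usage data",
    ],
}


def missing_items(ticked):
    """Return {category: [open items]} for every category with gaps.

    `ticked` is a set of (category, item) pairs that are already done.
    """
    gaps = {}
    for category, items in TICKLIST.items():
        open_items = [item for item in items if (category, item) not in ticked]
        if open_items:
            gaps[category] = open_items
    return gaps


def is_ready(ticked):
    """A service is ready only when no ticklist item is left open."""
    return not missing_items(ticked)


if __name__ == "__main__":
    # Example: a service with monitoring and accounting done, nothing else.
    done = {
        ("Monitoring", "Service status reporting"),
        ("Monitoring", "Performance data"),
        ("Accounting", "Usage data"),
    }
    for category, items in missing_items(done).items():
        print(category, "->", ", ".join(items))
    print("ready:", is_ready(done))
```

Keeping the list as data rather than code means new categories can be added without touching the check itself.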

9 Common Problems
Leigh: Why can’t we move services through more quickly?
Why can’t the software/software work the first time?
We have to find a way to start work before a service has met all criteria
–Pilot service??
Release process:
–Minimum 1 month in EGEE/LCG
–OSG “organic” but not faster

10 Summary
Metrics have moved from discussion to prototypes
Partners volunteered to help to fill the gaps
COD well established
First shot at a “tick list” based process to introduce new services

11 Summary II
Did we meet the goals? From the agenda:
–Interoperation: all aspects; what makes sense? what can be achieved? what can we learn from each other?
   -> Plenary
–Metrics: to demonstrate a reliable, performant, robust, supported service that improves in quality
   -> Progress, work distributed
–Integrating LCG Service Challenges and pre-production service into the regular operations
   -> TickList
–Monitoring tools: where are we? what is missing? How do we fill in the gaps?
   -> Plenary
–(EGEE) Release/deployment process in the SC/LHC era
   -> ROC managers meeting

