GridPP3 Project Management GridPP20 Sarah Pearce 11 March 2008
Slide 2 11 March 2008 GridPP20: GridPP3 Project Management
Slide 3: What’s the project map for?
To show us how well GridPP is delivering against requirements
To report to the Oversight Committee on:
–Areas where GridPP is doing well (or OK)
–Areas that need attention
To report on staff posts
GridPP is not in direct control of all metrics, but can aim to put pressure in areas where we see problems
Will need to be complete for next OC – May?
Slide 4: From production…
Slide 5: …to exploitation
Slide 6: Main features
Led by experiments – key to delivering for LHC
Tier-1 and Tier-2 areas
–Aggregated per Tier-2
Mainly metrics, with some deliverables
Based around services delivered – especially meeting MoU commitments
Includes section for GridPP2+
Slide 7: Milestones and metrics
Slide 8: Example Experiment metrics

Number   LHCb
         UK share of LHCb production computing needs
1.2.2    MC production (generation) efficiency
         T1 MC production (reconstruction, stripping) efficiency
         T1 MC/Event user analysis - UK share/efficiency
1.2.5    T2 data transfer - T2->RAL
1.2.6    T2 data transfer - T2->others (failover?)
1.2.7    T1 data transfer - Incoming
1.2.8    T1 data transfer - Outgoing
1.2.9    T1 data storage: Tape
         T1 data storage: Disk
         LHCb SAM tests uptime T
         LHCb SAM tests uptime T2

Number   ATLAS
1.101    Tier 1 - Available job slots for reconstruction
1.102    Tier 2 - Available job slots for group analysis
1.103    Tier 1 - Available job slots for MC production
1.104    Tier 1 - Job success rates in batch system
         Tier 1 - Available storage in usable service classes
         Tier 1 - Data reading rates from storage system to batch farm
         Tier 1 - Rates of data movement from tape to disk for reprocessing
         Tier 1 - Data availability in storage system
         Tier 1 - Data loss per quarter (when not recoverable)
         Tier 1 - Data acceptance from CERN, Tier 1s, Tier 2s
1.111    Tier 1 - MoU service levels
1.112    Tier 2 - Data acceptance from Tier
         Tier 2 - Available simulation slots
1.114    Tier 2 - Available analysis slots
Slide 9: Overall operations metrics

Number   Title
2.1.1    Fraction of UK sites in Production
2.1.2    Number of supported VOs
2.1.3    Fraction of kSI2k used
2.1.4    GridPP kSI2k available
2.1.5    GridPP disk storage available
2.1.6    Job failure rates
2.1.7    UK contribution to LHC experiments
2.1.8    UK contribution to non-LHC experiments
2.1.9    Deployment team meetings
         UK wide deployment support active
         GridPP deployment web-pages up-to-date
         Training needs addressed
         GridPP helpdesk functioning adequately
         Number of sites on VO blacklists
Slide 10: Tier-1 metrics – examples

Number   Resource delivery
3.2.1    Tier-1 kSI2k available to EGEE/LCG
3.2.2    Tier-1 delivering to LCG MoU
3.2.3    Fraction of available T1 kSI2k used in quarter
3.2.4    Fraction of available T1 kSI2k used in quarter
3.2.5    UB schedule implemented and upheld
3.2.6    Time on VO blacklists
3.2.7    Respond to tickets within required time
3.2.8    Job efficiencies

Number   Hardware procurement
3.1.1    Disk tender started
3.1.2    Disk delivered
3.1.3    Disk available and in production as per plan
3.1.4    Tape tender started
3.1.5    Tape delivered
3.1.6    Tape available and in production as per plan
3.1.7    CPU tender started
3.1.8    CPU delivered
3.1.9    CPU available and in production as per plan
         New machine room migration plan available
         New machine room - migration complete
         New machine room available to accept hardware
         Network upgraded

Services Storage
Slide 11: Tier-2 metrics

Number   Title
4.x.1    % of promised (by that time) disk available
4.x.2    % of promised (by that time) CPU available
4.x.3    Average SAM (SLL page) availability performance over the last quarter
4.x.4    Average SAM (SLL page) reliability performance over the last quarter
4.x.5    Average SLL ATLAS test performance?
4.x.6    Average SLL disk test performance?
4.x.7    Amount of CPU delivered
4.x.8    Number of TB of disk used
4.x.9    Number of technical meetings held
4.x.10   Number of management meetings held
4.x.11   Tier-2 delivering to LCG MoU
4.x.12   Quarterly operational performance review
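Metrics 4.x.1 and 4.x.2 are simple ratios of what is available to what was promised by that date. A minimal Python sketch, assuming both quantities are reported in the same unit (TB for disk, kSI2k for CPU); the function name and the figures are hypothetical and are not GridPP tooling or data:

```python
def percent_of_promised(available: float, promised: float) -> float:
    """Return available capacity as a percentage of the capacity promised to date."""
    if promised <= 0:
        raise ValueError("promised capacity must be positive")
    return 100.0 * available / promised


# Hypothetical figures for one Tier-2, in TB of disk (not real GridPP data).
disk_pct = percent_of_promised(available=150.0, promised=200.0)
print(f"{disk_pct:.1f}% of promised disk available")
# prints "75.0% of promised disk available"
```

A value above 100 would mean the site has delivered more than it had promised by that date.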
Slide 12: Risk register
Slide 13: Reporting
[Reporting structure diagram. Labels: Project Manager; User Board Chair; Tier-1 Manager; Production Manager; Technical director; ATLAS; LHCb; CMS; Tier-1 Staff; Tier-2 Coordinators; Tier-2 Hardware Support Posts; Storage; Data; Info.Mon.; WLMS; Security; Network; CB; PMB; OC; Portal; Other expts; User support; Expt. support]
Slide 14: Quarterly reports
Produced by manager in each area
Reporting on progress in the quarter, including:
–Effort figures
–Resources delivered
–Service levels
–Metrics and milestones
–Issues arising
Expected 1 month after the end of each quarter
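The reporting deadline can be worked out mechanically. A minimal Python sketch, assuming calendar quarters (Q1 = January to March) and reading "1 month after" as the last day of the following month; neither assumption is stated on the slide, and the function names are illustrative only:

```python
import calendar
from datetime import date


def quarter_end(year: int, quarter: int) -> date:
    """Last day of a calendar quarter (assumed: Q1 = Jan-Mar, ..., Q4 = Oct-Dec)."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    month = 3 * quarter
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, last_day)


def report_due(year: int, quarter: int) -> date:
    """Assumed deadline: last day of the month following the quarter's end."""
    end = quarter_end(year, quarter)
    if end.month == 12:
        due_year, due_month = end.year + 1, 1  # Q4 reports fall due in January
    else:
        due_year, due_month = end.year, end.month + 1
    last_day = calendar.monthrange(due_year, due_month)[1]
    return date(due_year, due_month, last_day)


print(report_due(2008, 1))  # prints 2008-04-30
```

Under these assumptions, the report for the first quarter of 2008 would be expected by the end of April 2008.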