
1 www.egi.eu EGI-InSPIRE RI-261323
Monitoring of the LHC Computing Activities
Key Results from the Services for HUC (SA3)
EGI Technical Forum 2011
J. Andreeva, M. Cinquilli, P. Dhara, E. Karavakis (CERN & SA3), P. Karhula, M. Kenyon, L. Kokoszkiewicz, E. Lanciotti, M. Nowotka, G. Ro, P. Saiz, L. Sargsyan, D. Tuckett
CERN IT-ES

2 Outline
- Importance of monitoring
- Experiment Dashboard
- Key Results and Recent Development on:
  - Data Transfer Monitoring
  - Job Monitoring
  - Monitoring of Sites and Services
- Summary
22/09/2011 Monitoring of the LHC Computing Activities

3 Importance of monitoring
- WLCG integrates more than 140 computing centres in 35 countries
- Reliable monitoring is complicated due to the diversity of the infrastructure
- Powerful and flexible monitoring systems are required in order to maintain and improve a highly distributed system
- Monitoring the computing activities for a given VO is essential in order to estimate the quality of the infrastructure and to detect any problems or inefficiencies

4 Experiment Dashboard
- Not coupled to a specific Workload or Data Management System
- Covers the full range of the experiments’ computing activities: Job Monitoring, Data Transfers, Sites and Services
- Provides common solutions focused on different user categories
- Heavily used by the four main LHC experiments
- More than 4000 unique visitors monthly just for CMS
- Can be easily adapted to the needs of new VOs, but the VOs must decide what they wish to monitor and implement/extend the monitoring system to their needs

5 Dashboard for Monitoring the Computing Activities of the LHC
[Diagram labels: Analysis + Production; Real time and Historical Views; Data transfer; Data access; Site Status Board; Site usability; SiteView; WLCG GoogleEarth Dashboard]

6 Recent Development on Data Management Monitoring
- Monitors dataset and file movement
- Used 24/7 by shifters to identify failures and alert sites
- ~1k unique visitors monthly / ~15k page views daily
- New ATLAS DDM Dashboard UI provides an interactive matrix and high quality plots of transfer statistics with flexible filtering and grouping
- Implemented in AJAX / jQuery

7 Recent Development on Data Transfers Monitoring
- Currently there is no tool that can provide an overall view of data transfers at the WLCG scope (across LHC experiments, across the various technologies used, for example FTS and xrootd, across multiple local FTS instances, etc.)
- Prototype WLCG Transfer Dashboard consumes FTS transfer events, generates statistics and exposes data via a generic version of the DDM Dashboard user interface
- Initially for ATLAS, CMS and LHCb; support for other file transfer protocols is planned
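The event-to-statistics step described on this slide can be illustrated with a minimal Python sketch. It is not the dashboard's actual code: the `TransferEvent` fields, the site names and the `build_matrix` and `efficiency` helpers are all hypothetical, and real FTS messages carry many more attributes. The sketch only shows how per-transfer events could be folded into a source/destination matrix of the kind a matrix UI would display.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferEvent:
    """Hypothetical, simplified transfer event (not the real FTS message format)."""
    vo: str
    source: str
    destination: str
    bytes_transferred: int
    succeeded: bool


def build_matrix(events):
    """Aggregate events into statistics keyed by (source, destination)."""
    matrix = defaultdict(lambda: {"ok": 0, "failed": 0, "bytes": 0})
    for ev in events:
        cell = matrix[(ev.source, ev.destination)]
        if ev.succeeded:
            cell["ok"] += 1
            cell["bytes"] += ev.bytes_transferred
        else:
            cell["failed"] += 1
    return dict(matrix)


def efficiency(cell):
    """Fraction of successful transfers in a cell, or None for an empty cell."""
    total = cell["ok"] + cell["failed"]
    return cell["ok"] / total if total else None


# Made-up sample events for illustration only.
events = [
    TransferEvent("atlas", "SITE_A", "SITE_B", 2000, True),
    TransferEvent("atlas", "SITE_A", "SITE_B", 0, False),
    TransferEvent("cms", "SITE_C", "SITE_B", 5000, True),
]
matrix = build_matrix(events)
```

Keying the statistics by the (source, destination) pair keeps the aggregation independent of the VO or transfer technology that produced the event, which is the cross-experiment view the slide asks for.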

8 Recent Development on Job Monitoring
- Aimed at different types of users: individual scientists, user support teams, site admins and VO managers
- Works transparently across different middleware, submission methods and execution backends
- Used heavily within CMS and ATLAS
- Common DB schema and common applications
- Improvements on information collectors for job monitoring data
- Speed improvements and optimisations for all the different user interfaces
- Added functionality and flexibility to the Historical Views job accounting application
- New version of User Analysis Task Monitoring and Production Task Monitoring using a common framework (hbrowse) implemented in jQuery

9 User Analysis Task Monitoring
- User / User-support perspective with a wide selection of plots
- ATLAS version in production based on ‘hbrowse’ (also used in ganga/diane mon)
- Will be adopted by CMS as well

10 Job Summary & Historical Views
Job Summary
- Shifter, Expert, Site perspective
- Real time job metrics by site, activity, …
- Significant code refactoring and speed improvements
Historical Views
- Site, Management perspective
- Job metrics as a function of time
- Significant code refactoring
- Added flexibility: 8 filtering and 11 grouping options
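As an illustration of "job metrics as a function of time" with filtering and grouping, the following Python sketch counts jobs per day and per value of a chosen grouping field. The record layout, the field names and the `historical_view` function are assumptions made for this example and do not reflect the Historical Views schema.

```python
from collections import Counter
from datetime import datetime

# Hypothetical job records; field names are illustrative only.
jobs = [
    {"site": "SITE_A", "activity": "analysis", "status": "done",
     "finished": datetime(2011, 9, 20, 10, 15)},
    {"site": "SITE_A", "activity": "analysis", "status": "failed",
     "finished": datetime(2011, 9, 20, 11, 40)},
    {"site": "SITE_B", "activity": "production", "status": "done",
     "finished": datetime(2011, 9, 20, 23, 5)},
    {"site": "SITE_A", "activity": "analysis", "status": "done",
     "finished": datetime(2011, 9, 21, 8, 0)},
]


def historical_view(jobs, group_by, filters=None):
    """Count jobs per (day, group value), keeping only jobs matching all filters."""
    filters = filters or {}
    counts = Counter()
    for job in jobs:
        # Skip jobs that differ from any requested filter value.
        if any(job[key] != value for key, value in filters.items()):
            continue
        day = job["finished"].date().isoformat()
        counts[(day, job[group_by])] += 1
    return dict(counts)


done_by_site = historical_view(jobs, group_by="site", filters={"status": "done"})
```

Passing the grouping field and the filters as parameters is one simple way to offer several grouping and filtering options from a single query path.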

11 Recent Development on Monitoring of Sites and Services
Site Usability Monitoring
- An interface to the Nagios tests used by the LHC VOs for the validation of sites and services
- Collaboration with SAM, Nagios and Grid View teams
- Strong contribution from BARC in India
- Under validation by ATLAS and CMS

12 Recent Development on Site Commissioning (cont.)
Site Status Board
- Used by ATLAS and CMS for distributed computing shifts and for site commissioning
- Presents the status of all sites in a VO according to VO-defined metrics
- Easy to add/combine metrics
- Different views (shifter, site commissioning, transfers...)
- More than 200 metrics were added by the LHC VOs
- Alarms and error monitoring added to the collectors
- Many improvements to the database layout and better graphics at the UI level using jQuery and highcharts
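The idea of combining VO-defined metrics into per-site views can be sketched as follows. This is an assumption-laden illustration, not the Site Status Board's logic: the metric names, the three status levels and the "worst status wins" rule are all hypothetical choices made for the example.

```python
# Hypothetical status levels, ordered from best to worst.
SEVERITY = {"ok": 0, "warning": 1, "error": 2}


def combine(statuses):
    """Combine several metric statuses into one: the worst status wins (assumed rule)."""
    return max(statuses, key=SEVERITY.__getitem__)


def site_board(metrics_by_site, view):
    """Build one combined status per site from the metrics selected by a view."""
    return {
        site: combine([metrics[name] for name in view])
        for site, metrics in metrics_by_site.items()
    }


# Made-up metric values for two made-up sites.
metrics_by_site = {
    "SITE_A": {"job_efficiency": "ok", "transfer_quality": "warning", "sam_tests": "ok"},
    "SITE_B": {"job_efficiency": "ok", "transfer_quality": "ok", "sam_tests": "error"},
}

# A view is just a named selection of metrics.
shifter_view = site_board(metrics_by_site, ["job_efficiency", "transfer_quality"])
commissioning_view = site_board(metrics_by_site, ["job_efficiency", "sam_tests"])
```

Treating a view as a list of metric names means a new view or a new metric needs no change to the combination code, which matches the "easy to add/combine metrics" point above.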

13 Site Status Board

14 Recent Development on Publicity and Dissemination
WLCG Google Earth Dashboard:
- Global, cross-VO, real-time view of the LHC computing activities
- Improved stability of the collectors

15 Common Monitoring Solutions
Table columns: Application, ATLAS, CMS, LHCb, ALICE
Applications listed:
- Job monitoring (multiple applications)
- Site Status Board
- Site Usability Monitoring
- DDM Monitoring
- WLCG Transfers Monitoring
- SiteView & GoogleEarth

16 Summary
Many recent improvements on our monitoring apps due to:
- Common framework leveraging the latest web technologies
  - Applications are built on the same framework to reduce development and maintenance overhead
- Loose coupling to data sources adding flexibility to the system
  - Applications can be easily adopted by different data sources within a VO or by different VOs
  - UI is agnostic of the data storage implementation
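The loose coupling described in the summary can be illustrated with a small Python sketch in which the presentation code depends only on an interface, not on a storage backend. The `StatusSource` protocol, both source classes, the `site_status` table and the `render_json` function are hypothetical names introduced for this example; the talk does not describe the framework's real interfaces.

```python
import json
import sqlite3
from typing import List, Protocol, Tuple


class StatusSource(Protocol):
    """Hypothetical interface: anything that can list (site, status) pairs."""
    def site_statuses(self) -> List[Tuple[str, str]]: ...


class InMemorySource:
    """Source backed by a plain dictionary."""
    def __init__(self, data):
        self._data = dict(data)

    def site_statuses(self):
        return sorted(self._data.items())


class SqliteSource:
    """Source backed by an SQLite table named site_status (assumed layout)."""
    def __init__(self, conn):
        self._conn = conn

    def site_statuses(self):
        rows = self._conn.execute(
            "SELECT site, status FROM site_status ORDER BY site"
        ).fetchall()
        return [(site, status) for site, status in rows]


def render_json(source: StatusSource) -> str:
    """UI-facing output; it never sees how the data is stored."""
    return json.dumps(
        [{"site": site, "status": status} for site, status in source.site_statuses()]
    )


# Same made-up data served from two different backends.
data = {"SITE_B": "error", "SITE_A": "ok"}
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE site_status (site TEXT, status TEXT)")
conn.executemany("INSERT INTO site_status VALUES (?, ?)", data.items())

from_memory = render_json(InMemorySource(data))
from_sqlite = render_json(SqliteSource(conn))
```

Both backends produce identical output from `render_json`, so a front end consuming that JSON would not need to change when the data source does.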

