Monitoring of the LHC Computing Activities: Key Results from the Services.
EGI-InSPIRE RI-261323, www.egi.eu

Presentation transcript:

Monitoring of the LHC Computing Activities: Key Results from the Services for HUC (SA3)
EGI Technical Forum 2011
J. Andreeva, M. Cinquilli, P. Dhara, E. Karavakis (CERN & SA3), P. Karhula, M. Kenyon, L. Kokoszkiewicz, E. Lanciotti, M. Nowotka, G. Ro, P. Saiz, L. Sargsyan, D. Tuckett
CERN IT-ES

Outline
- Importance of monitoring
- Experiment Dashboard
- Key results and recent development on:
  - Data transfer monitoring
  - Job monitoring
  - Monitoring of sites and services
- Summary
22/09/2011, Monitoring of the LHC Computing Activities

Importance of monitoring
- WLCG integrates more than 140 computing centres in 35 countries.
- Reliable monitoring is complicated by the diversity of the infrastructure.
- Powerful and flexible monitoring systems are required to maintain and improve a highly distributed system.
- Monitoring the computing activities of a given VO is essential to estimate the quality of the infrastructure and to detect problems or inefficiencies.

Experiment Dashboard
- Not coupled to a specific Workload or Data Management System.
- Covers the full range of the experiments' computing activities: Job Monitoring, Data Transfers, Sites and Services.
- Provides common solutions focused on different user categories.
- Heavily used by the four main LHC experiments; more than 4000 unique visitors monthly for CMS alone.
- Can be easily adapted to the needs of new VOs, but each VO must decide what it wishes to monitor and implement or extend the monitoring system to suit its needs.

Dashboard for Monitoring the Computing Activities of the LHC
- Job monitoring: Analysis + Production, Real time and Historical Views
- Data management: Data transfer, Data access
- Sites and services: Site Status Board, Site usability, SiteView
- WLCG GoogleEarth Dashboard

Recent Development on Data Management Monitoring
- Monitors dataset and file movement.
- Used 24/7 by shifters to identify failures and alert sites.
- ~1k unique visitors monthly / ~15k page views daily.
- New ATLAS DDM Dashboard UI provides an interactive matrix and high quality plots of transfer statistics with flexible filtering and grouping.
- Implemented in AJAX / jQuery.

Recent Development on Data Transfers Monitoring
- Currently there is no tool that can provide an overall view of data transfers across the whole WLCG scope (across LHC experiments, across the various technologies used, for example FTS and xrootd, across multiple local FTS instances, etc.).
- Prototype WLCG Transfer Dashboard: consumes FTS transfer events, generates statistics and exposes the data via a generic version of the DDM Dashboard user interface.
- Initially for ATLAS, CMS and LHCb; support for other file transfer protocols is planned.
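The event-to-statistics step described for the prototype could look roughly like the sketch below. It is only an illustration: the `TransferEvent` fields and the function names are hypothetical and do not reflect the actual FTS message format or the Dashboard code. It shows how individual transfer events might be folded into the source-by-destination matrix that a transfer UI displays.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferEvent:
    # Hypothetical event shape; real FTS messages carry different fields.
    vo: str
    src_site: str
    dst_site: str
    bytes_moved: int
    succeeded: bool


def build_matrix(events):
    """Aggregate events into per-(source, destination) statistics."""
    cells = defaultdict(lambda: {"ok": 0, "failed": 0, "bytes": 0})
    for ev in events:
        cell = cells[(ev.src_site, ev.dst_site)]
        if ev.succeeded:
            cell["ok"] += 1
            cell["bytes"] += ev.bytes_moved  # count only delivered bytes
        else:
            cell["failed"] += 1
    return dict(cells)


def efficiency(cell):
    """Fraction of successful transfers in one matrix cell."""
    total = cell["ok"] + cell["failed"]
    return cell["ok"] / total if total else 0.0
```

Because the aggregation is keyed only on site names, events from different VOs or transfer technologies can share one matrix, which is the cross-experiment view the slide says is missing.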

Recent Development on Job Monitoring
- Aimed at different types of users: individual scientists, user support teams, site admins and VO managers.
- Works transparently across different middleware, submission methods and execution backends.
- Used heavily within CMS and ATLAS.
- Common DB schema and common applications.
- Improvements to the information collectors for job monitoring data.
- Speed improvements and optimisations for all the user interfaces.
- Added functionality and flexibility in the Historical Views job accounting application.
- New versions of User Analysis Task Monitoring and Production Task Monitoring, built on a common framework (hbrowse) implemented in jQuery.

User Analysis Task Monitoring
- User / user-support perspective with a wide selection of plots.
- ATLAS version in production, based on 'hbrowse' (also used in Ganga/DIANE monitoring).
- Will be adopted by CMS as well.

Job Summary & Historical Views
Job Summary
- Shifter, expert and site perspective.
- Real time job metrics by site, activity, …
- Significant code refactoring and speed improvements.
Historical Views
- Site and management perspective.
- Job metrics as a function of time.
- Significant code refactoring.
- Added flexibility: 8 filtering and 11 grouping options.
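The filtering and grouping behind a historical view can be pictured with a small sketch. The record keys used here (`site`, `activity`, `status`, `finished`) and the function `historical_view` are assumptions made for illustration, not the Dashboard's schema or API. The idea is simply to count jobs per time bin and per value of a chosen grouping attribute, after applying optional equality filters.

```python
from collections import Counter
from datetime import datetime


def historical_view(jobs, group_by, filters=None, bin_format="%Y-%m-%d"):
    """Count jobs per (time bin, group value).

    jobs: iterable of dicts with a datetime under "finished" (hypothetical schema).
    group_by: key whose value defines the series, e.g. "site" or "status".
    filters: optional {key: required_value} equality filters.
    """
    filters = filters or {}
    counts = Counter()
    for job in jobs:
        # Skip records that do not match every requested filter.
        if any(job.get(key) != value for key, value in filters.items()):
            continue
        time_bin = job["finished"].strftime(bin_format)
        counts[(time_bin, job[group_by])] += 1
    return counts


# Usage: daily job counts by status, restricted to one site.
jobs = [
    {"site": "T1", "activity": "analysis", "status": "done",
     "finished": datetime(2011, 9, 21, 10)},
    {"site": "T2", "activity": "analysis", "status": "done",
     "finished": datetime(2011, 9, 22, 1)},
]
daily_by_status = historical_view(jobs, group_by="status", filters={"site": "T1"})
```

Changing `bin_format` (for example to `"%Y-%m"`) changes the time granularity without touching the rest of the logic.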

Recent Development on Monitoring of Sites and Services
Site Usability Monitoring
- An interface to the Nagios tests used by the LHC VOs for the validation of sites and services.
- Collaboration with the SAM, Nagios and GridView teams; strong contribution from BARC in India.
- Under validation by ATLAS and CMS.

Recent Development on Site Commissioning (cont.)
Site Status Board
- Used by ATLAS and CMS for distributed computing shifts and for site commissioning.
- Presents the status of all sites in a VO according to VO-defined metrics.
- Easy to add and combine metrics.
- Different views (shifter, site commissioning, transfers...).
- More than 200 metrics were added by the LHC VOs.
- Alarms and error monitoring added to the collectors.
- Many improvements to the database layout, and better graphics at the UI level using jQuery and Highcharts.
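One way to picture "views" built from combinable VO-defined metrics is sketched below. The combination rule (take the worst status among the metrics selected for a view), the status names and the function names are all assumptions for illustration; the slides do not state how the Site Status Board actually combines metrics.

```python
# Hypothetical status scale; higher means worse.
SEVERITY = {"ok": 0, "warning": 1, "critical": 2}


def combine(statuses):
    """Return the worst status seen, or "unknown" when there is no data."""
    return max(statuses, key=SEVERITY.__getitem__, default="unknown")


def site_view(metrics, view_metrics):
    """Build one view of the board.

    metrics: {site: {metric_name: status}} as published by collectors.
    view_metrics: names of the metrics that make up this view.
    """
    return {
        site: combine(values[name] for name in view_metrics if name in values)
        for site, values in metrics.items()
    }


# Usage: a "shifter" view made of two metrics.
board = {
    "SITE_A": {"sam": "ok", "transfers": "warning", "jobs": "critical"},
    "SITE_B": {"sam": "ok"},
}
shifter_view = site_view(board, ["sam", "transfers"])
```

Under this scheme adding a metric only means publishing another key per site, and a new view is just another list of metric names.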

Site Status Board

Recent Development on Publicity and Dissemination
- WLCG Google Earth Dashboard: global, cross-VO, real-time view of the LHC computing activities.
- Improved stability of the collectors.

Common Monitoring Solutions
Table columns: Application, ATLAS, CMS, LHCb, ALICE
Applications (rows):
- Job monitoring (multiple applications)
- Site Status Board
- Site Usability Monitoring
- DDM Monitoring
- WLCG Transfers Monitoring
- SiteView & GoogleEarth

Summary
Many recent improvements to our monitoring applications are due to:
- A common framework leveraging the latest web technologies
  - Applications are built on the same framework to reduce development and maintenance overhead.
- Loose coupling to data sources, adding flexibility to the system
  - Applications can be easily adapted to different data sources within a VO or to different VOs.
  - The UI is agnostic of the data storage implementation.
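The loose coupling summarised above can be illustrated with a minimal sketch in which the presentation code depends only on a small interface rather than on a particular store. The `JobSource` protocol, the two example sources and `render_summary` are hypothetical names invented here, not part of the Experiment Dashboard.

```python
import csv
import io
from collections import Counter
from typing import Iterable, Protocol


class JobSource(Protocol):
    """Anything that can yield job records as dicts with a "status" key."""

    def fetch(self) -> Iterable[dict]: ...


class InMemorySource:
    def __init__(self, rows):
        self._rows = list(rows)

    def fetch(self):
        return iter(self._rows)


class CsvSource:
    def __init__(self, text):
        self._text = text

    def fetch(self):
        # DictReader yields one dict per CSV row, keyed by the header line.
        return csv.DictReader(io.StringIO(self._text))


def render_summary(source: JobSource) -> str:
    """Presentation code: knows the interface, not the storage behind it."""
    counts = Counter(row["status"] for row in source.fetch())
    return ", ".join(f"{status}: {n}" for status, n in sorted(counts.items()))
```

Swapping the backend then requires only a new class with a `fetch` method; `render_summary` stays unchanged.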