XRootD and Federated Storage Monitoring: Current Status and Issues
A. Petrosyan, D. Oleynik, J. Andreeva
Creating federated data stores for the LHC, CC-IN2P3, Lyon, November 22, 2011

Outline
- Historical overview
- XRootD monitoring background
- Use cases and user categories
- Monitoring architecture: Site, Federation, VO & WLCG
- Issues
- Conclusions

Historical overview
- Tier 3 monitoring task force: a software suite to enable T3 site monitoring, organized in two layers
  - T3mon site: site infrastructure monitoring (batch and storage software), with data representation via the Ganglia monitoring system
  - T3mon global: monitoring of T3 site activity at the global layer (data popularity, transfers), with data representation via Dashboard
- A dedicated part of the project: XRootD monitoring for the site, with federation monitoring as part of T3mon global

XRootD monitoring background
- XRootD is instrumented for monitoring with two streams:
  - Summary stream: an overview of site health, easy to configure and represent (a minimal receiver is sketched below)
  - Detailed stream: information about every single operation (authorization, staging, I/O, etc.); more complicated to use, since decoding and aggregation are needed
- Every instance (server, redirector) uses the same data transport (UDP) and representation technology
- Combining information from the summary and detailed streams allows both site and federation monitoring to be fed
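
As an illustration of how the summary stream might be consumed, here is a minimal Python sketch of a UDP receiver, assuming the servers are configured to send their XML summary reports to this collector (the port number is an arbitrary choice for the example and must match the xrd.report directive on the servers):

import socket
import xml.etree.ElementTree as ET

SUMMARY_PORT = 9931  # assumed port, must match the servers' xrd.report target

def listen_summary():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", SUMMARY_PORT))
    while True:
        data, addr = sock.recvfrom(65535)   # one XML report per datagram
        root = ET.fromstring(data)          # summary reports are XML documents
        # Print a few site-health values; a real collector would push them
        # into the local fabric monitoring system instead.
        for stats in root.findall("stats"):
            print(addr[0], stats.get("id"),
                  {child.tag: child.text for child in stats})

if __name__ == "__main__":
    listen_summary()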

XRootD monitoring metrics
The following metrics are provided and extracted:
- File open/close
- Transferred volume, read/write
- Username
- Application
- Trace
- Client IP/name
- Server IP/name
These metrics can be used as initial aggregation patterns (an illustrative record is sketched below).
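
For illustration, one aggregated record per file open/close could be represented as follows; the field names are invented for this sketch and do not correspond to an agreed schema:

from dataclasses import dataclass

@dataclass
class FileAccessRecord:
    user: str            # username
    application: str     # application identifier from the trace
    client_host: str     # client IP/name
    server_host: str     # server IP/name
    open_time: float     # file open timestamp (epoch seconds)
    close_time: float    # file close timestamp (epoch seconds)
    bytes_read: int      # read volume
    bytes_written: int   # write volume

    @property
    def transferred(self) -> int:
        # total transferred volume for this access
        return self.bytes_read + self.bytes_written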

Use cases
Monitoring of XRootD transfers and data access is required at various levels:
- Site (site administrators)
- Federation (federation administrators)
- VO (VO managers)
- WLCG
Different data should be presented at every level, and different user categories have different requirements.

Monitoring consumers & requirements
- Site: health of the site, transfers to the site, transfers from the site
- Federation: health of sites, transfers between sites, redirections, easy integration
- VO: data popularity, data transfers, consumed resources, health of sites (CoS)
- WLCG: VO data transfers, health of sites

Architecture 1/2
- Information is collected from servers and redirectors
- All top-level metrics are calculated or aggregated from data coming from the servers/redirectors (a toy aggregation example follows below)
- At the moment we do not have a good estimate of how much data has to be handled at the federation level (it depends on the number of sites, the type of messages, the granularity, etc.), but it is clear that the solution should be scalable
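
As a toy example of such aggregation, a federation-level "transferred volume per site" metric could be rolled up from per-server records like this (the record layout is assumed, not a real schema):

from collections import defaultdict

def volume_per_site(records):
    # records: iterable of dicts such as
    # {"site": "T3_EXAMPLE", "bytes_read": 1024, "bytes_written": 0}
    totals = defaultdict(int)
    for rec in records:
        totals[rec["site"]] += rec["bytes_read"] + rec["bytes_written"]
    return dict(totals)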

Architecture 2/2

Architecture. Site monitoring 1/2
- Summary metrics are collected and presented through the local fabric monitoring system; a Ganglia-based solution is already implemented
- A Python parser for extracting metrics from the detailed stream is ready
- A plugin system provides intercommunication with different backends (a sketch of such an interface follows below):
  - PostgreSQL for the T3 site
  - MSG for CMS data popularity
  Both solutions have been implemented and are being tested
- Transfer information is available only via the detailed stream, which produces a lot of data; scalability and load tests are needed
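
A minimal sketch of what such a backend plugin interface could look like is given below; the class and method names are assumptions for illustration, not the actual T3mon API:

from abc import ABC, abstractmethod

class Backend(ABC):
    """Destination for records decoded from the detailed stream."""

    @abstractmethod
    def publish(self, record: dict) -> None:
        ...

class PostgresBackend(Backend):
    """Stores records in the local T3 database."""

    def __init__(self, dsn: str):
        import psycopg2
        self.conn = psycopg2.connect(dsn)

    def publish(self, record: dict) -> None:
        # Table and column names are illustrative only.
        with self.conn, self.conn.cursor() as cur:
            cur.execute(
                "INSERT INTO file_access (username, client, server, bytes)"
                " VALUES (%s, %s, %s, %s)",
                (record["user"], record["client"],
                 record["server"], record["bytes"]),
            )

A second backend publishing to MSG would implement the same publish() method but hand the record to a message broker instead (see the federation monitoring sketch later in the talk).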

Architecture. Site monitoring 2/2

XRootD monitoring through Ganglia
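
One simple way to inject summary values into Ganglia is to call the standard gmetric tool shipped with ganglia-gmond; a hedged sketch with invented metric names (flag spellings may vary between Ganglia versions):

import subprocess

def gmetric(name: str, value: float, units: str, group: str = "xrootd"):
    # Pushes a single numeric metric into the local gmond.
    subprocess.check_call([
        "gmetric",
        "--name", name,
        "--value", str(value),
        "--type", "double",
        "--units", units,
        "--group", group,
    ])

# Example: gmetric("xrootd_bytes_out_rate", 1.2e6, "bytes/sec")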

Architecture. Federation monitoring 1/3
- A filtered list of metrics is collected at the site (servers/redirectors); all data is generated by the servers at the site level
- Each site must be declared in the WLCG & VO topology
- Data is transmitted for aggregation to central storage via MSG/ActiveMQ (see the sketch below)
- Scalability: queues with asynchronous operations and support for multiple consumers
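
A sketch of shipping the filtered site-level metrics to the central collector through MSG/ActiveMQ using the stomp.py client; the broker endpoint and queue name are placeholders:

import json
import stomp

BROKER = ("msg-broker.example.org", 61613)      # assumed broker endpoint
QUEUE = "/queue/xrootd.federation.monitoring"   # assumed destination

def send_records(records):
    conn = stomp.Connection([BROKER])
    conn.connect(wait=True)
    try:
        for rec in records:
            # Queue-based, asynchronous delivery lets several consumers
            # on the central side share the aggregation load.
            conn.send(destination=QUEUE, body=json.dumps(rec))
    finally:
        conn.disconnect()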

Architecture. Federation monitoring 2/3
- Central storage / data aggregation alternatives:
  - RDBMS: data normalization, logical complexity, high cost of any data structure change
  - Structured data storage (HBase): scalable, flexible, fast access to data
- Presentation & visualization (a sketch of such a view follows below):
  - JSON: easy data integration within an application and between web applications
  - Django, jQuery: Python, AJAX
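
The presentation layer could then expose aggregated results as JSON from a Django view consumed by the jQuery/AJAX front end; a minimal sketch, where query_central_storage() is a stand-in for the read API of whichever backend (RDBMS or HBase) is chosen:

import json
from django.http import HttpResponse

def query_central_storage():
    """Placeholder for the chosen backend's read API (RDBMS or HBase)."""
    return [{"site": "T3_EXAMPLE", "total_bytes": 0}]

def federation_summary(request):
    # Returns aggregated transfer volumes per site as JSON.
    payload = json.dumps({"sites": query_central_storage()})
    return HttpResponse(payload, content_type="application/json")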

Architecture. Federation monitoring 3/3

Architecture. VO&WLCG monitoring
- Data is collected from XRootD federations
- Each federation must be declared in the WLCG & VO topology
- Applications must provide the VO mark of the user for each transfer at the site level
- Monitoring data is transferred via MSG/ActiveMQ or a web service
- Integration with Dashboard: data formats are still to be agreed (an illustrative message is sketched below)
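
Since the data formats are still to be agreed, the message below is purely illustrative of what one site/federation summary record sent towards the Dashboard could contain; none of the field names are final:

example_message = {
    "federation": "EXAMPLE-FEDERATION",
    "site": "T2_EXAMPLE_SITE",
    "vo": "atlas",                          # VO mark supplied by the application
    "interval_start": "2011-11-22T10:00:00Z",
    "interval_end": "2011-11-22T10:10:00Z",
    "files_opened": 120,
    "files_closed": 118,
    "bytes_read": 734003200,
    "bytes_written": 0,
    "redirections": 15,
}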

Architecture. VO monitoring

Issues
- Federation topology: site and federation declaration in the WLCG & VO topology
- Metrics: the information is available, but the list of metrics for the federation/VO/WLCG level is still to be agreed
  - Initial aggregation patterns

Conclusions
- The monitoring architecture for XRootD data transfers at the local and global levels is being prototyped
- Collection and publishing of basic metrics at the site level is implemented and being tested
- The technology and infrastructure for data transmission are in place
- The technology for data aggregation and visualization is defined