Update on CERN IT Unified Monitoring Architecture (UMA)

1 Update on CERN IT Unified Monitoring Architecture (UMA)
IT-CM-MM

2 News
- New dedicated ES cluster for Monitoring
- Lemon data getting integrated
- Working on Grafana frontend too
- On ATLAS matters:
  - New data sources (DDM Files Queued)
  - New/improved processing jobs (DDM, JM)
  - New/improved dashboards (FTS, DDM, JM)
  - Notebooks ideas (Scrutiny)

06/10/2016 ATLAS ADC meeting

3 FTS Workflow
- FTS transfer logs as data source
- Streaming job to enrich data (from VOfeeds)
- Enriched data stored in ES and HDFS
- Kibana dashboard ~ production ready:
  - Same look and feel as the old one, when possible
  - New views (e.g. error exploration, FTS log links)
  - Banner now on the old dashboard
- Demo: New FTS Dashboard
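The enrichment step on this slide can be pictured as a per-record lookup that attaches site metadata to each transfer. The following is only a minimal sketch, not the actual streaming job: the field names (`src_host`, `dst_host`), the `TOPOLOGY` map, and the site names are hypothetical stand-ins for whatever the VO feeds provide.

```python
from typing import Any, Dict

# Hypothetical topology map standing in for VO feed content:
# storage endpoint hostname -> site metadata. All values are illustrative.
TOPOLOGY: Dict[str, Dict[str, str]] = {
    "se01.example.org": {"site": "SITE_A", "tier": "1"},
    "se02.example.org": {"site": "SITE_B", "tier": "2"},
}

# Fallback used when a host is not present in the topology map.
UNKNOWN: Dict[str, str] = {"site": "unknown", "tier": "unknown"}


def enrich_transfer(record: Dict[str, Any],
                    topology: Dict[str, Dict[str, str]] = TOPOLOGY) -> Dict[str, Any]:
    """Return a copy of a transfer record with site and tier fields added
    for both the source and the destination host (assumed field names)."""
    enriched = dict(record)  # do not mutate the input record
    for side in ("src", "dst"):
        meta = topology.get(record.get(f"{side}_host"), UNKNOWN)
        enriched[f"{side}_site"] = meta["site"]
        enriched[f"{side}_tier"] = meta["tier"]
    return enriched
```

Unknown hosts are tagged rather than dropped, so that missing topology entries stay visible in a dashboard instead of silently disappearing.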

4 ATLAS DDM Workflow
- Rucio Events and Traces as data sources
- Streaming job to transform, enrich (AGIS) and compute 1 min aggregated statistics
- Enriched data on HDFS, stats on ES and HDFS
- Validation ongoing, all good so far
- Good progress on job resilience (thx Sergey!)
- Kibana DDM Dashboard prototype ready:
  - Demo: New DDM Dashboard
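The "1 min aggregated statistics" mentioned on this slide amount to grouping events into fixed one-minute windows. Below is a rough batch-style sketch of that idea in plain Python; the real job is a streaming job, and the event fields used here (`timestamp`, `site`, `bytes`) are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Tuple


def aggregate_per_minute(events: Iterable[Dict[str, Any]]) -> Dict[Tuple[int, str], Dict[str, int]]:
    """Group events into 1-minute tumbling windows keyed by (window_start, site).

    Each event is assumed to carry 'timestamp' (epoch seconds), 'site' and 'bytes'.
    """
    stats: Dict[Tuple[int, str], Dict[str, int]] = defaultdict(lambda: {"count": 0, "bytes": 0})
    for ev in events:
        # Round the timestamp down to the start of its minute.
        window_start = int(ev["timestamp"]) // 60 * 60
        bucket = stats[(window_start, ev["site"])]
        bucket["count"] += 1
        bucket["bytes"] += ev["bytes"]
    return dict(stats)
```

Keying on the window start time makes the per-minute statistics straightforward to index as time-series documents.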

5 ATLAS DDM (II)
- DDM Queued Files (RQF0615374)
  - New data source integrated
  - JSON data retrieved every ~5 minutes from a web server
  - Demo: Kibana dashboard available
- Scrutiny Notebooks
  - Summer student worked on porting some of the scrutiny analysis to Zeppelin, using HDFS data
  - Automate plot creation in ~ one click
  - Demo: Monitoring Zeppelin

6 Job Monitoring Workflow
- Panda, Tier0, Prodsys as data sources
- Several processing jobs to enrich (AGIS) and aggregate into hourly accounting stats
- Both enriched and stats data on HDFS and ES
- Real-time data: all validated
- Accounting data: working on pending/running and error stats, everything else validated
- Jobs by Event-service already available
- Kibana Dashboards prototype ready
  - Demo: Real-time, Accounting

7 Kibana upstream improvements
- Bucket aggregations (#4584)
  - e.g. weighted average. Will improve efficiency visualization and throughput computation
- Missing “others” (#1961)
  - e.g. show top 10 sites + an entry with the contribution of all the others
- Better matrix/heatmap
  - Clickable cells
  - Visualize value on cells
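To illustrate why the first two requests matter, here is a small sketch of both computations done outside Kibana. It does not reflect how Kibana implements or plans to implement these features; the function names and the sample numbers are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def weighted_average(buckets: Iterable[Tuple[float, float]]) -> Optional[float]:
    """Average of bucket values weighted by bucket size.

    Each item is (value, weight), e.g. (efficiency of a site, number of jobs).
    Returns None when the total weight is zero.
    """
    buckets = list(buckets)
    total_weight = sum(w for _, w in buckets)
    if total_weight == 0:
        return None
    return sum(v * w for v, w in buckets) / total_weight


def top_n_with_others(counts: Dict[str, float], n: int = 10) -> List[Tuple[str, float]]:
    """Keep the n largest entries and fold the remainder into one 'others' entry."""
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    top = ranked[:n]
    if len(ranked) > n:
        top.append(("others", sum(v for _, v in ranked[n:])))
    return top
```

For example, two buckets with efficiencies 1.0 (1 job) and 0.5 (99 jobs) have a plain mean of 0.75, while the job-weighted average is 0.505, which is much closer to the overall efficiency. Likewise, `top_n_with_others({"A": 5, "B": 3, "C": 1, "D": 1}, n=2)` yields the entries for A and B plus a single "others" entry with value 2.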

8 Summary
- Moving from early prototypes to production-ready dashboards
- Consolidating workflow
- Build your own dashboard!
- Setting up a “Monitoring for WLCG” group to exchange information and feedback

9 Reference & Contacts
- Dashboards: monit.cern.ch
- Feedback/Requests: cern.ch/monit-support (SNOW)
- Documentation: cern.ch/monitdocs
