News of MonALISA site monitoring

News from the Florence site
Analysis of analogue history data in a 500-year-old database shows a strong correlation between these two time series (figure: correlation).
Lisa Gherardini
ALICE T1/T2 Workshop, Tsukuba 2014

Updated map – Google Maps API v3

Updated map – real Xrootd traffic
Top 100 WAN connections at the moment
Source-based color coding
Traffic-dependent line width

Xrootd/EOS traffic aggregation
A filter on each VoBox aggregates local Xrootd on-close events (read and write) by:
  Client IPv4 C-classes
  Remote site (IP-to-site mapping done by the central services)
  LAN, WAN and total traffic
Read and write operation frequencies are also available.
Data is available in MonALISA under the XrdServers_Aggregation cluster.
A few sites don't report this monitoring data from Xrootd: Catania, Clermont, Cyfronet, RRC-KI.
8.5% remote reading (520 MB/s remote, 5.6 GB/s local)
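The aggregation described on this slide can be illustrated with a minimal Python sketch. It is not the actual VoBox filter: the event tuple layout `(client_ip, bytes_read, bytes_written)`, the function names, and the list of local networks are assumptions made for illustration only. The last helper reproduces the remote-reading share from the two rates quoted on the slide, assuming 1 GB/s = 1000 MB/s.

```python
import ipaddress
from collections import defaultdict

# Hypothetical on-close event layout: (client_ip, bytes_read, bytes_written).
# The real Xrootd monitoring record format is not shown in the slides.


def c_class(ip):
    """Return the /24 network (IPv4 "C-class") that contains the address."""
    return str(ipaddress.ip_network(f"{ip}/24", strict=False))


def aggregate(events, local_networks):
    """Sum [read, written] bytes per client C-class and per LAN/WAN/total."""
    per_class = defaultdict(lambda: [0, 0])
    totals = {"LAN": [0, 0], "WAN": [0, 0], "total": [0, 0]}
    local = [ipaddress.ip_network(n) for n in local_networks]
    for ip, rd, wr in events:
        addr = ipaddress.ip_address(ip)
        # A client inside one of the site's own networks counts as LAN.
        scope = "LAN" if any(addr in net for net in local) else "WAN"
        for bucket in (per_class[c_class(ip)], totals[scope], totals["total"]):
            bucket[0] += rd
            bucket[1] += wr
    return dict(per_class), totals


def remote_fraction(remote_rate, local_rate):
    """Share of the total read rate that is served remotely."""
    return remote_rate / (remote_rate + local_rate)


if __name__ == "__main__":
    # 520 MB/s remote and 5600 MB/s local, as quoted on the slide.
    print(f"{remote_fraction(520, 5600):.1%}")
```

With the slide's numbers, 520 / (520 + 5600) is about 0.085, which matches the 8.5% remote reading quoted above.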

Remote access impact on analysis jobs
Local SE problems make the jobs read remotely.
In this particular case the SE tests are all fine.
Why the jobs cannot access local data is under investigation.
Remote access can severely impact job efficiency.

Remote access efficiency
Storage / WNs: CERN, LEGNARO, TORINO, CNAF, FZK
  CERN: 2.668 MB/s, 0.27 MB/s
  FZK: 0.486 MB/s, 0.161 MB/s, 0.213 MB/s, 2.963 MB/s
  LEGNARO: 1.611 MB/s, 2.628 MB/s, 0.673 MB/s, 0.749 MB/s
  TORINO: 1.848 MB/s, 1.609 MB/s, 0.684 MB/s, 0.891 MB/s
  CNAF: 2.193 MB/s, 0.623 MB/s, 2.126 MB/s
Problems can come from both the network and the storage.
IO performance seen by jobs doesn't always match the VoBox-to-VoBox measurements:
  Congested firewall / network segment, different OS settings
  Reflected in the overall efficiency

System and firewall requirements reminder
Network buffer settings, same on all nodes (WNs, Xrootd servers, VoBox)
  Or larger, newer machines typically have enough memory
WNs to VoBox firewall openings
  UDP/8884 – ApMon data from JA and jobs
Storage servers to VoBox
  UDP/8884 – ApMon monitoring of the hosts
  UDP/9930 – Xrootd internal monitoring: traffic data
World to VoBox and VoBox to the world
  TCP/1093 – 1 stream bw measurement
  ICMP – tracepath / traceroute, UDP/ tracepath
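As a rough aid for checking the list above, the sketch below records the numbered openings from the slide in a small table and tests whether a TCP port accepts connections. The table structure, the function names, and the timeout are assumptions for illustration; this is not a MonALISA tool. Only the TCP opening can be probed this way, since a UDP port gives no connection-level confirmation, and the ICMP/tracepath entries are left out because the slide gives no complete port for them.

```python
import socket

# Openings listed on the slide: (direction, protocol, port, purpose).
REQUIRED_OPENINGS = [
    ("WNs -> VoBox", "UDP", 8884, "ApMon data from JA and jobs"),
    ("Storage -> VoBox", "UDP", 8884, "ApMon monitoring of the hosts"),
    ("Storage -> VoBox", "UDP", 9930, "Xrootd internal monitoring"),
    ("World <-> VoBox", "TCP", 1093, "1 stream bandwidth measurement"),
]


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered (timed out) or unreachable.
        return False


def tcp_openings():
    """Return (direction, port) for the entries that can be probed over TCP."""
    return [(d, port) for d, proto, port, _ in REQUIRED_OPENINGS if proto == "TCP"]


if __name__ == "__main__":
    vobox = "vobox.example.org"  # hypothetical host name
    for direction, port in tcp_openings():
        state = "open" if tcp_port_open(vobox, port) else "closed or filtered"
        print(f"{direction} TCP/{port}: {state}")
```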

Site Issues Dashboard
Issues are split into levels; the last selected level is stored in a cookie.
Default sorting is by site size.
Added:
  Direct testing of individual Xrootd data servers
  CVMFS status
  IPv6, network buffers, efficiency…

Future use of ML data in SAM/SSB
ALICE reports will be based on values published by MonALISA (ETA: 1 month).
A lot of details are still unclear.
Issues reported in the previous page will influence the report.
We will have to implement test job scheduling for the idle sites to avoid "unknown" statuses.
Connectivity, bandwidth tests and appropriate buffer sizes will become critical.

MonALISA – AliROOT interface
Available in AliROOT as ANALYSIS/AliXMLParser.cxx
Currently the following links implement it:
  Production details, e.g.: res_path=xml
  Run Condition Table: res_path=xml
  SHUTTLE: res_path=xml
  The REST interface of MonALISA: Accept=text/xml
Create the view you want in the web interface, then add the respective argument in the code.
Some certificate-protected areas will require ROOT to pass a valid certificate (to do).
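Outside AliROOT, the same REST convention (asking for XML through the `Accept` header) could be exercised with a few lines of Python. This is a minimal sketch, not the AliXMLParser implementation: the URL is a placeholder because the actual links are not preserved in this transcript, and the structure of the returned XML is not assumed beyond it being well-formed.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical URL: the real REST links are not preserved in the transcript.
EXAMPLE_URL = "https://monalisa.example.org/rest/SomeFarm/SomeCluster"


def build_request(url, mime="text/xml"):
    """Build a GET request that asks the server for the given MIME type."""
    return urllib.request.Request(url, headers={"Accept": mime})


def parse_xml(payload):
    """Parse an XML payload (bytes or str) and return the root element."""
    return ET.fromstring(payload)


def fetch_xml(url, timeout=10):
    """Fetch the URL with Accept: text/xml and return the parsed root."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return parse_xml(resp.read())
```

Certificate-protected areas would additionally need a client certificate, which this sketch does not handle.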

Usage

CSV dump of data
Data from history plots (display?page=…): &download_data_csv=true
Status tables (stats?page=…): &dump_csv=true
Some of the dynamic (.jsp) pages: &res_path=csv
REST interface: Accept=text/csv
If you need to periodically query such values, running a standalone client to collect site-specific information is a better option.
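For one-off downloads, appending the CSV flag to a page URL and reading the result can be sketched as follows. The host name, page name, and CSV column names are placeholders chosen for illustration; only the query parameters (`download_data_csv`, `dump_csv`, `res_path`) come from the slide.

```python
import csv
import io
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_csv_flag(url, flag="download_data_csv", value="true"):
    """Return the URL with the given CSV export parameter appended."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((flag, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


def parse_csv(text):
    """Parse CSV text with a header row into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


if __name__ == "__main__":
    # Hypothetical host and page name.
    page = "https://monalisa.example.org/display?page=some_page"
    print(with_csv_flag(page))                     # history plot
    print(with_csv_flag(page, "res_path", "csv"))  # dynamic .jsp style flag
```

Status tables would use `with_csv_flag(url, "dump_csv")` in the same way.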

Using the web filters
You can use ranges, lists, exclude particular values…
Full link to page + options to send/bookmark:

Questions and suggestions session
This page is intentionally left blank