Status of SA2 network monitoring and troubleshooting tools


1 Status of SA2 network monitoring and troubleshooting tools
Guillaume Cessieux (CNRS/IN2P3-CC, EGEE SA2) OTAG, , AMS

2 Why is this a complicated topic?
More than 40 heterogeneous network domains
Network monitoring is slowly converging toward perfSONAR, but it is not yet mature, usable and deployed enough
EGEE cannot support manpower in each NREN
  And we have no direct customer/provider relationships
Networks are shared
  Reduce network monitoring overhead and overlaps
  Must be generic, not on a per-project basis
Active performance tests are disturbing and face unknown interference on shared networks
Passive tests face unwillingness to grant access to network devices or to share data

3 SA2 tools
Monitoring and troubleshooting tools converged toward two main tools
  DownCollector
  perfSONAR-lite TroubleShooting Services (TSS)
Several other attempts
  NPM, e2emonit, Grid jobs...
All source code produced by the EGEE-SA2 activity is publicly published (Apache 2 license via EGEE)
Support will be provided until the end of EGEE-III

4 DownCollector (1/2)
Completely home-made monitoring tool for EGEE-SA2
  Started in EGEE-II, development led by IN2P3-CC
Central tool
  No remote deployment needed
Tests Grid TCP ports on all Grid nodes registered in GOCDB every 2 minutes (3000 tests)
Aggregated connectivity results and history shown on a web interface
Interfaced with some other tools: Nagios, CIC portal...
Users: Sites, ROCs, COD, ENOC
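The kind of central TCP port test described on this slide can be sketched as follows. This is only an illustrative sketch, not DownCollector's actual code: the function names, timeout, worker count and target list are all assumptions, and the lookup of nodes in GOCDB is left out.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def check_tcp_port(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, timed out or unresolvable: count as "down".
        return False


def run_round(targets, timeout=3.0, workers=50):
    """Test every (host, port) pair once; return {(host, port): reachable}."""
    targets = list(targets)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda t: check_tcp_port(t[0], t[1], timeout), targets)
        return dict(zip(targets, results))
```

A scheduler (hypothetical here) would call `run_round` with the current node list every 2 minutes and store the results for the history view. Running the tests from a single central host is what makes remote deployment unnecessary.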

5 DownCollector (2/2)
Status
  Tool released, mature and stable
  Heavily used and appears to give useful results
Future
  Some interest from CNRS and IN2P3, but still unclear
  Equivalent in Nagios?

6 perfSONAR-lite TroubleShooting Services (1/3)
[Diagram: ENOC, Users, Central server, Site A with Probe A, Site B with Probe B; labelled steps: 1 - Request, 2 - Request, 3 - e2e measurement, 4 - Result, 5 - Result]
Started in EGEE-III, entirely designed by SA2
Development led by DFN/Erlangen as an SA2 partner
Central server orchestrating on-demand e2e measurements between light probes hosted by Grid sites
EGEE-driven improvements of the standard perfSONAR framework
Authentication & Authorisation mapped from GOCDB’s roles
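The five labelled steps of this slide (request, request, e2e measurement, result, result) can be read as the flow sketched below. This is a toy model under stated assumptions, not the perfSONAR-lite TSS implementation: the class names, the `site_admin` role and the stubbed measurement are hypothetical, and the direction of each step is inferred from the "central server orchestrating" description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    site: str

    def measure_to(self, other):
        # Step 3: stand-in for a real end-to-end measurement between probes.
        return {"source": self.site, "destination": other.site, "status": "measured"}


class CentralServer:
    def __init__(self, probes, allowed_roles):
        self.probes = {p.site: p for p in probes}
        self.allowed_roles = set(allowed_roles)  # hypothetical role names

    def handle_request(self, user_role, src_site, dst_site):
        # Step 1: a user request reaches the central server and is authorised
        # against roles (the slide says these are mapped from GOCDB's roles).
        if user_role not in self.allowed_roles:
            raise PermissionError(f"role {user_role!r} may not request measurements")
        # Step 2: the server forwards the request to the source probe.
        src, dst = self.probes[src_site], self.probes[dst_site]
        # Steps 3 and 4: the probe measures and reports back to the server.
        result = src.measure_to(dst)
        # Step 5: the server returns the result to the user.
        return result
```

For example, `CentralServer([Probe("Site A"), Probe("Site B")], {"site_admin"}).handle_request("site_admin", "Site A", "Site B")` returns the stub result for a Site A to Site B measurement, while an unknown role raises `PermissionError`.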

7 perfSONAR-lite TSS (2/3)

8 perfSONAR-lite TSS (3/3)
Expected users: Sites, ROCs, ENOC...
Status:
  Tool nearly ready, but missing a maturation phase
  Suffered from some staff movements and licensing issues
  No production deployment yet, but a promising testbed with 3 sites
  First production release: End of March
Future:
  May be followed and used outside EGI
  DFN and CNRS are currently looking for applications and funding around it
  Could be interesting for NRENs to host probes on demarcation points

9 Conclusion from SA2
Network monitoring and troubleshooting tools are key dependencies for networking support
  SA2 really suffered in this area
Collaboration with network providers is essential
  Good network monitoring also eases Grid operations
Many constraints make good ideas or standard tools hard to apply
  Multiple domains, security policies, data disclosure, manpower, scale, etc.
We were unable to obtain network performance monitoring
We suggest DownCollector and perfSONAR-lite TSS as tools of interest for EGI

