Connect. Communicate. Collaborate E2Emon Michael Enrico, DANTE (representing many others!) TNC 2008, Bruges, Belgium 22 May 2008 (E2E Link Monitoring)


1 E2Emon (E2E Link Monitoring): a PerfSONAR-based monitoring system for multi-domain, point-to-point managed bandwidth services
Michael Enrico, DANTE (representing many others!)
TNC 2008, Bruges, Belgium, 22 May 2008

2 Outline
The motivation for E2Emon
What is E2Emon?
The gory details (OK, not all of them)
How does it look? (some screenshots)
Developments (and a few more screenshots)
Who participates today?
Future development
Summary & credits (plus some extra material to read off-line)

3 Motivation
Original motivation was to aid in monitoring of “Cross-Border Fibre” (CBF) wavelength services
Quickly realised that it would be useful for all wavelength services traversing multiple “provider domains” (e.g. including those transiting GÉANT2)
This is why the work landed in GN2’s JRA4 activity (rather than JRA1/perfSONAR)
[Map labels: Karlsruhe, Manno, Basel, Milano, Bologna; distances 320km and 100km]

4 E2E (Link) Monitoring (the problem space)
[Diagram: E2ELink A-B from Point A to Point B across Domain 1, Domain 2 and Domain 3]
GOAL: to realise (near) real-time monitoring (link status & in-service PM) of the constituent parts and of the whole E2ELink A-B
(where E2ELink = discrete layer 1 or 2 service)

5 What E2Emon is not…
[Diagram labels: E2Emon, Everything else]
…a panacea (when it comes to providing quality P2P services in a multi-domain environment)
…a substitute for sound multi-domain operational processes
NOTE: for more on this topic leave now (!!!) and see Marian Garcia-Vidondo’s talk on Multi-domain operations in Room D - Erasmus

6 E2Emon data model: divide & conquer
REMEMBER: Initial focus was on wavelength services

7 E2Emon(itoring) (Method)
[Diagram: E2ELink A-B from Point A to Point B across Domain 1, Domain 2 and Domain 3. Each domain has a perfSONAR MP or MA supplying DomainLink and (partial) ID_Link info to the E2Emon correlator over SOAP/XML. Other labels: E2ECU operators, SNMP, SOAP/XML, “Weathermap” view for users, HTTP]

8 Basic characteristics (of E2Emonitoring)
Status information corresponds to network layer 1 and 2
Status information is a logical abstraction
No information about physical devices necessary
Domain and Interdomain (ID) link status provided by constituent domains using perfSONAR
–Abstraction process within domain may be non-trivial
–Some examples given later
E2E link status: aggregation of NREN and ID links

9 Information available in E2Emon
Operational States:
–Up – link is available
–Degraded – link is up, but has reduced performance (future)
–Down – unavailable
–Unknown – state is unknown
Administrative States:
–NormalOperational
–Maintenance
–TroubleShooting
–UnderRepair
–Unknown
Not yet any “in-service” PM data
Still for further study
Difficult in heterogeneous environment
See MCF from PerfSONAR DJ1.2.5: MCF: Experimental Results & Sub-layer3 Monitoring
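The operational states above, together with the idea that an E2E link status is an aggregation of the domain and interdomain link statuses, can be sketched as follows. This is a minimal illustration, not the correlator's actual logic: the "worst state wins" rule and its precedence order (Down, then Unknown, then Degraded, then Up) are assumptions, as are the names `OperState` and `aggregate`.

```python
from enum import Enum


class OperState(Enum):
    """Operational states as listed on the slide."""
    UP = "Up"
    DEGRADED = "Degraded"
    DOWN = "Down"
    UNKNOWN = "Unknown"


# Assumed precedence, worst first. This ordering is an illustration only,
# not the rule documented for the E2Emon correlator.
_WORST_FIRST = [OperState.DOWN, OperState.UNKNOWN, OperState.DEGRADED, OperState.UP]


def aggregate(segment_states):
    """Return the E2E link state as the worst state among its segments."""
    states = list(segment_states)
    if not states:
        # No segment reported anything, so the E2E state cannot be known.
        return OperState.UNKNOWN
    for candidate in _WORST_FIRST:
        if candidate in states:
            return candidate
    return OperState.UNKNOWN


if __name__ == "__main__":
    # Three segments: two domain links and one interdomain link.
    print(aggregate([OperState.UP, OperState.UP, OperState.DOWN]).value)
```

Under this assumed rule, a single failed segment is enough to mark the whole E2E link as down, which matches the intuition for a point-to-point service with no protection.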

10 Gory detail (data model)

11 Gory detail (raw XML)
[XML sample; markup not preserved. Surviving values: ams-gen_LHC-06002A, CERN-TRIUMF-LHCOPN-001, DOMAIN_Link, DemarcPoint, DemarcPoint, UP, UP, NORMALOPERATION]

12 How does it look? (some screenshots follow…)

14 Typical E2E link (working normally)

15 Typical E2E link (failure condition)

16 The “magic” within the domains
Refers to the process of synthesizing E2Emon-compliant abstract information from whatever raw data is available
May need to synthesize from atomic MIB objects like LOS or LOW on a certain set of interfaces/boards
–these, in turn, may need to be retrieved directly from NEs on the data plane OR
–from an NMS via a “northbound” interface
If an NMS is present then it may perform some of the necessary synthesis (but maybe not all!)
Transmission equipment – may be an SNMP-free zone!
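As a rough illustration of that synthesis step, the sketch below derives one abstract link state from per-interface loss-of-signal (LOS) readings. The rule itself (any LOS means Down, any missing reading means Unknown) is an assumption made for the example, and the function name `synthesize_state` and the interface names are hypothetical. Real domains may combine many more inputs.

```python
def synthesize_state(los_by_interface):
    """Map per-interface LOS readings to an abstract link state.

    los_by_interface: dict of interface name -> True (LOS raised),
    False (no LOS) or None (reading could not be retrieved).

    Assumed rule, for illustration only:
      - any interface with LOS raised -> "Down"
      - otherwise any missing reading -> "Unknown"
      - otherwise                     -> "Up"
    """
    readings = list(los_by_interface.values())
    if not readings:
        return "Unknown"
    if any(reading is True for reading in readings):
        return "Down"
    if any(reading is None for reading in readings):
        return "Unknown"
    return "Up"


if __name__ == "__main__":
    # Hypothetical interface names on the boards that carry one wavelength.
    print(synthesize_state({"board1/port1": False, "board7/port2": False}))
    print(synthesize_state({"board1/port1": False, "board7/port2": True}))
```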

17 Example 1: GARR
What goes into synthesizing this?

18 Detail within DomainLink (more detail in slides at end of presentation)
[Diagram labels: IP, MPLS, lambda, WDM; GARR, SWITCH, DFN; CNAF, BO, MI, PD, Manno, KARLSRUHE; GINS; e2e Service; “check the status of segments”; E2E Monitoring System; “status aggregation”]

19 Example 2: GÉANT2 (more detail in slides at end of presentation)
[Diagram labels: OSI & IP DCN; DomainLink; partial IDL (twice); ALU NMS: 1353NM (EML), 1354RM (NML), 1359 IOO; SNMP traps; TRAP Handler & other stuff & MP/MA]

20 Developments
Introduced in R2.0…
Synthesized management object alarm handling
Export of synthesized alarms and defect conditions (via SNMP traps) to umbrella management systems
Availability statistics
Production/non-production flags
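The slides do not say how the availability statistics are calculated. One plausible approach, sketched here purely as an assumption, is to take the recorded status changes of a link and compute the fraction of a reporting window during which the state was "Up". The function name `availability` and the sample timestamps are hypothetical.

```python
def availability(samples, start, end):
    """Return the fraction of the window [start, end) spent in state "Up".

    samples: list of (timestamp, state) tuples sorted by timestamp. Each
    state is assumed to hold until the next sample (or until `end`).
    Time before the first sample counts as not available.
    """
    if end <= start:
        raise ValueError("end must be later than start")
    up_time = 0.0
    for i, (timestamp, state) in enumerate(samples):
        next_timestamp = samples[i + 1][0] if i + 1 < len(samples) else end
        # Clip the interval covered by this sample to the reporting window.
        lo = max(timestamp, start)
        hi = min(next_timestamp, end)
        if state == "Up" and hi > lo:
            up_time += hi - lo
    return up_time / (end - start)


if __name__ == "__main__":
    # Timestamps in seconds: up for 60 s, down for 30 s, up again for 10 s.
    history = [(0, "Up"), (60, "Down"), (90, "Up")]
    print(f"{availability(history, 0, 100):.0%}")
```

Whether maintenance windows or "Unknown" periods should count against availability is a policy choice that this sketch does not address; here anything other than "Up" counts as unavailable.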

22 Synthesized alarm handling
[Screenshot labels: ALARM!!!; RETURN; ALARM CLEARED]
ALSO… SNMP trap sent to NOC operators’ dashboard

23 Export to Nagios (E2ECU) (via SNMP)

24 Who is participating?
GN2 partner    Hardware            Status info available?   perfSONAR installation?   Expected RFS
GÉANT2         Alcatel             yes                      done                      in service now
DFN            Huawei              yes                      done                      in service now
RENATER        Alcatel             yes                      done                      in service now
RedIRIS        Nortel 8010         yes                      done                      in service now
NORDUnet       Alcatel             yes                      not yet                   forthcoming
GARR           Juniper/ADVA        yes                      done                      in service now
SURFnet [NL]   Nortel              yes                      done                      in service now
ja.net         Nortel+Ciena        TBC                      not yet                   forthcoming
SWITCH         Sorrento            yes                      done                      in service now
CESNET         Cisco               yes                      done                      in service now
PSNC           Adva                yes                      done                      in service now
Internet2      Ciena/Infinera      yes                      not yet                   forthcoming
CANARIE        Nortel              yes                      not yet                   nearly ready
ESNET          Ciena/Infinera      yes                      done                      in service now
USLHCNET       Ciena               yes                      done                      in service now
Fermilab       various             yes                      done                      in service now
CERN           Force 10 + others   yes                      done                      in service now
IN2P3          ?                   yes                      done                      in service now
DEISA          Cisco               yes                      done                      in service now

25 The future?
Minor release (R2.1) on the way
Big omission is still in-service PM stats
–Do we invest the effort to rectify this? Do we need it?
Making it more “production quality”:
–Need to encourage a more thorough approach to feeding E2Emon (improve on quality of MP/MA data, availability of MP/MA, etc)
–Add controls to better control front-end view and manage synthesized alarms
–Add HA?
Adding proper AAI support

26 Summary
E2Emon came about as a “quick fix” to an immediate problem
–(monitoring wavelength services in a multi-domain environment)
Now adopted in a production environment (E2ECU)
Wider applicability (within R&E net community)
–Sub-wavelength services (e.g. GE EPLs)
–Will be adapted to monitor short-lived services (e.g. created using AutoBAHN, DICE CP, etc)
Wider applicability (outside R&E net community)?
–do we try to take this to the standards bodies?

27 URLs
Most material (documentation, downloads, etc) can be found on the PerfSONAR wiki at:
http://wiki.perfsonar.net/jra1-wiki/index.php/PerfSONAR_support_for_E2E_Link_Monitoring
E2E Monitoring System (Sandbox):
http://cnmdev.lrz-muenchen.de/e2e/lhc/mon/G2_E2E_index_ALL.html

28 Credits
M&M: Matthias Hamm & Mark Yampolskiy, DFN/LRZ/MNMT (München) (Developers)
Other authors: Otto Kreiter & Loukik Kudarimoti (DANTE), Giovanni Cesaroni (GARR) (Developers/contributors)
GÉANT2 JRA-4 (WI-03) and PerfSONAR folks
Emma Apted, DANTE Operations (Coordinator)
Numerous others in participating domains (Implementers/maintainers of domain-specific “magic”)

29 That was… E2E (Link) monitoring
Questions?

30 Extra info
On the “magic” within the domains (GARR and GÉANT2) …

31 Example 1: GARR
What goes into synthesizing this?

32 Detail within DomainLink 1
[Diagram labels: IP, MPLS, lambda, WDM; GARR, SWITCH, DFN; CNAF, BO, MI, PD, Manno, KARLSRUHE; GINS; e2e Service; “check the status of segments”; E2E Monitoring System; “status aggregation”]

33 Detail within DomainLink 2

34 GARR UI (MPLS monitoring)
Domain information, LSP status, traffic
Interdomain information, e2e L2 circuit status

35 MPLS Monitor (using Juniper MIBs)
Information:
mplsLspName               1.3.6.1.4.1.2636.3.2.3.1.1
mplsLspState              1.3.6.1.4.1.2636.3.2.3.1.2
mplsLspOctets             1.3.6.1.4.1.2636.3.2.3.1.3
mplsLspPackets            1.3.6.1.4.1.2636.3.2.3.1.4
mplsLspAge                1.3.6.1.4.1.2636.3.2.3.1.5
mplsLspTimeUp             1.3.6.1.4.1.2636.3.2.3.1.6
mplsLspPrimaryTimeUp      1.3.6.1.4.1.2636.3.2.3.1.7
mplsLspTransitions        1.3.6.1.4.1.2636.3.2.3.1.8
mplsLspLastTransition     1.3.6.1.4.1.2636.3.2.3.1.9
mplsLspPathChanges        1.3.6.1.4.1.2636.3.2.3.1.10
mplsLspLastPathChange     1.3.6.1.4.1.2636.3.2.3.1.11
mplsLspConfiguredPaths    1.3.6.1.4.1.2636.3.2.3.1.12
mplsLspStandbyPaths       1.3.6.1.4.1.2636.3.2.3.1.13
mplsLspOperationalPaths   1.3.6.1.4.1.2636.3.2.3.1.14
mplsLspFrom               1.3.6.1.4.1.2636.3.2.3.1.15
mplsLspTo                 1.3.6.1.4.1.2636.3.2.3.1.16
mplsPathName              1.3.6.1.4.1.2636.3.2.3.1.17
mplsPathType              1.3.6.1.4.1.2636.3.2.3.1.18
mplsPathExplicitRoute     1.3.6.1.4.1.2636.3.2.3.1.19
mplsPathRecordRoute       1.3.6.1.4.1.2636.3.2.3.1.20
mplsPathBandwidth         1.3.6.1.4.1.2636.3.2.3.1.21
mplsPathCOS               1.3.6.1.4.1.2636.3.2.3.1.22
mplsPathInclude           1.3.6.1.4.1.2636.3.2.3.1.23
mplsPathExclude           1.3.6.1.4.1.2636.3.2.3.1.24
mplsPathSetupPriority     1.3.6.1.4.1.2636.3.2.3.1.25
mplsPathHoldPriority      1.3.6.1.4.1.2636.3.2.3.1.26
mplsPathProperties        1.3.6.1.4.1.2636.3.2.3.1.27
How to get information on an MPLS LSP
1 - Get the snmp index (see next slide)
    BO1-MI1-VPN :.66.79.49.45.77.73.49.45.86.80.78.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0
2 - Query
    snmpget -v2c -c.
3 - Parse the output
    1 = unknown
    2 = up
    3 = down
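Steps 2 and 3 might look roughly like this once the query has been run. The sketch only builds the full OID for mplsLspState and interprets one line of query output. It does not run the query itself, because the full snmpget command line is not given on the slide. The assumption that the value appears after the text "INTEGER:" (either as a bare number or in the form `up(2)`) reflects typical Net-SNMP output and may not match every setup. The names `state_oid` and `parse_state` are hypothetical.

```python
import re

# OID and value mapping taken from the slide.
MPLS_LSP_STATE_OID = "1.3.6.1.4.1.2636.3.2.3.1.2"  # mplsLspState
STATE_NAMES = {1: "unknown", 2: "up", 3: "down"}


def state_oid(lsp_index):
    """Append the LSP index (a string starting with '.') to mplsLspState."""
    return MPLS_LSP_STATE_OID + lsp_index


def parse_state(output_line):
    """Extract the integer value from one line of query output.

    Assumption: the value follows "INTEGER:" either as a bare number
    ("INTEGER: 2") or with a symbolic label ("INTEGER: up(2)").
    Anything unparseable is reported as "unknown".
    """
    match = re.search(r"INTEGER:\s*(?:\w+\()?(\d+)", output_line)
    if match is None:
        return "unknown"
    return STATE_NAMES.get(int(match.group(1)), "unknown")


if __name__ == "__main__":
    index = ".66.79.49.45.77.73.49.45.86.80.78" + ".0" * 21  # BO1-MI1-VPN
    print(state_oid(index))
    print(parse_state("... = INTEGER: 2"))
```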

36 MPLS Monitor (using Juniper MIBs)
Finding the index of the LSP

    <?php
    // Convert an LSP name into its SNMP index: one sub-identifier per
    // character, padded with zeros up to 32 sub-identifiers.
    $name = $argv[1];
    $oid = name2oid($name);
    print $name . ": " . $oid . "\n";

    function name2oid($string) {
        $hex = '';
        $len = strlen($string);
        for ($i = 0; $i < $len; $i++) {
            // decimal character code, left-padded to at least two digits
            $hex .= "." . str_pad(ord($string[$i]), 2, 0, STR_PAD_LEFT);
        }
        // fill the remaining positions with zeros
        $npoints = 32 - $len;
        for ($i = 0; $i < $npoints; $i++) {
            $hex .= ".0";
        }
        return $hex;
    }
    ?>

Each character maps to its decimal character code: B O 1 - M I ... becomes .66.79.49.45.77.73.49.45.86...

    $ php name2oid.php BO1-MI1-VPN
    BO1-MI1-VPN:.66.79.49.45.77.73.49.45.86.80.78.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0
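For readers who prefer Python, the same name-to-index conversion can be sketched as below. It is a port written for this transcript rather than part of the original tooling, and it differs from the PHP in two small ways that are choices made here: it rejects names longer than 32 characters instead of silently producing a longer index, and it does not left-pad character codes to two digits (printable ASCII codes are always at least two digits anyway).

```python
def name2oid(name, width=32):
    """Convert an LSP name to its SNMP index string.

    Each character becomes one decimal sub-identifier, and the index is
    padded with zeros up to `width` sub-identifiers (32 on the slide).
    """
    codes = list(name.encode("ascii"))  # raises on non-ASCII names
    if len(codes) > width:
        raise ValueError(f"name is longer than {width} characters")
    codes += [0] * (width - len(codes))
    return "".join(f".{code}" for code in codes)


if __name__ == "__main__":
    print("BO1-MI1-VPN:" + name2oid("BO1-MI1-VPN"))
```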

37 Example 2: GÉANT2
[Diagram labels: OSI & IP DCN; DomainLink; partial IDL (twice); ALU NMS: 1353NM (EML), 1354RM (NML), 1359 IOO; SNMP traps; TRAP Handler & other stuff & MP/MA]

38 Example 2: GÉANT2
[Diagram labels: Path – gen_mil_CERN; OCH trail; Phys-link; Phys link; Domain link; P. IDLink; CERN-SARA-LHC-001; OCH trail; Phys-link; P. IDLink]

39 Monitoring data processing “e2e path”

40 GÉANT2 Alarm analyzer
Called every time a trap is received
Written in bash
Each trap is analyzed separately
–if a new trap arrives in the meantime, it waits in the queue (snmptrapd)
Must maintain state
After analysing the trap, action is taken → call the data transformation script
Had several problems:
–snmptrapd version
–Alcatel snmp problems
After one year of testing and modification currently stable – awaiting a new NMS upgrade – or an alarm churn
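The behaviour described here (one trap at a time, later traps waiting in a queue, state kept between traps, and a follow-up action after analysis) can be sketched as below. The real analyzer is a bash script driven by snmptrapd; this Python version is only an illustration of the control flow. The class name `AlarmAnalyzer`, the reduction of a trap to a (trail ID, status) pair, and the choice to call the transformation step only when the status actually changes are all assumptions.

```python
from collections import deque


class AlarmAnalyzer:
    """Process traps one at a time while remembering the last known state."""

    def __init__(self, on_change):
        self._queue = deque()        # traps waiting to be analyzed
        self._state = {}             # trail ID -> last known status
        self._on_change = on_change  # stands in for the transformation script

    def receive(self, trail_id, status):
        """Queue a trap; it waits here while earlier traps are analyzed."""
        self._queue.append((trail_id, status))

    def process_all(self):
        """Analyze queued traps in arrival order."""
        while self._queue:
            trail_id, status = self._queue.popleft()
            if self._state.get(trail_id) != status:
                self._state[trail_id] = status
                self._on_change(trail_id, status)


if __name__ == "__main__":
    analyzer = AlarmAnalyzer(lambda trail, status: print(trail, status))
    analyzer.receive("CERN-SARA-LHC-001", "DOWN")
    analyzer.receive("CERN-SARA-LHC-001", "DOWN")  # repeat, no new action
    analyzer.receive("CERN-SARA-LHC-001", "UP")
    analyzer.process_all()
```

Suppressing repeated traps for an unchanged status is one way to cope with the "alarm churn" mentioned above, though the slides do not say whether the bash script does this.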

41 E2E Data transformation
Applications developed in Java:
–E2EXMLWriter
–XMLGenerator
E2EXMLWriter takes in a template XML and produces an XML file containing live e2e path status information conforming to the JRA4 e2e data model
–Triggered by the bash script listening to SNMP alarms
–Parameters passed: Trail ID, Status
E2EXMLWriter – updates the perfSONAR MA
XMLGenerator produces this template XML that E2EXMLWriter uses to export domain’s e2e information
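The template-filling idea can be illustrated with a small Python sketch (the actual applications are written in Java). It takes a template document, a trail ID and a status, and returns the document with that trail's status filled in. The element and attribute names (`links`, `link`, `id`, `status`) are invented for this example and do not follow the JRA4 e2e data model. Only the two trail names are taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical template: element and attribute names are illustrative only
# and are NOT the JRA4 e2e data model schema.
TEMPLATE = """<links>
  <link id="CERN-SARA-LHC-001"><status>UNKNOWN</status></link>
  <link id="CERN-TRIUMF-LHCOPN-001"><status>UNKNOWN</status></link>
</links>"""


def write_status(template_xml, trail_id, status):
    """Return the template with the status of one trail updated."""
    root = ET.fromstring(template_xml)
    for link in root.iter("link"):
        if link.get("id") == trail_id:
            link.find("status").text = status
            break
    else:
        # No link element matched the given trail ID.
        raise KeyError(trail_id)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(write_status(TEMPLATE, "CERN-SARA-LHC-001", "UP"))
```

Keeping the template separate from the live values means the structure of the exported document only has to be generated once per configuration change, while each incoming alarm touches a single field.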

