Connect. Communicate. Collaborate GÉANT2 monitoring Otto Kreiter, DANTE Navneet Daga, DANTE LHC Monitoring Workshop, Munich, 19.07.2006.


1 GÉANT2 monitoring
Otto Kreiter, DANTE
Navneet Daga, DANTE
LHC Monitoring Workshop, Munich, 19.07.2006

2 Agenda
Extraction of monitoring information from the GÉANT2 network
External application developed by DANTE
Demonstration of a home-grown weather map
Conclusion

3 Network Element Manager
All network elements communicate with the NM separately.
The NM's task is to configure and monitor each NE one by one.
It is not service-aware: it has no knowledge of the intra-domain e2e path status.

4 Regional Network Manager (RM)
Topology
Services
Correlation
“User” interface

5 How we export data
Alarms
Perf. Meas.
Rem. Inv.

6 Status via alarms
[Diagram labels: Alarms, SNMPTrapD, Monitoring station]

7 Alarm content
From the NM:
– Information about interfaces and associated signal status, SDH timing problems
– NE and ILA status
From the RM:
– Information related to services
– Information related to paths, trails and physical connections at all layers

8 One hop case: NMS vs JRA-4
[Diagram labels: Path gen_mil_CERN; OCH trail; Phys-link; Phys link; Domain link; P. ID link; BOL-CERN-LHC-001]

9 Multiple hop case: NMS vs JRA-4
[Diagram labels: Path gen_mil_CERN; OCH trail; Phys-link; Phys link; Domain link; P. ID Link; CERN-SARA-LHC-001; OCH trail; Phys-link; P. ID Link]

10 Alarm processing
SNMP traps come from the Alcatel IOO module (Alcatel Enterprise v1/v2c MIB).
SNMP traps are received by a Linux station:
– snmptrapd picks up all alarms
– For each trap a bash script is called, which performs analysis, selection and action
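As an illustration of the per-trap handling on slide 10, the sketch below parses the text that snmptrapd writes to the standard input of a program registered with its `traphandle` directive: the sender's host name, its address, then one varbind per line. The slides state that the real handler is a bash script; Python is used here only for readability. The MIB name `EXAMPLE-MIB`, the host name and the addresses in the sample are made-up placeholders, not values from the GÉANT2 setup.

```python
import io
from typing import Dict, TextIO


def short_name(oid: str) -> str:
    """Reduce 'SOME-MIB::friendlyName.0' to 'friendlyName'; keep other OIDs as-is."""
    if "::" in oid:
        return oid.split("::", 1)[1].split(".", 1)[0]
    return oid


def parse_traphandle_input(stream: TextIO) -> Dict[str, str]:
    """Parse traphandle input: host name, address, then 'OID VALUE' lines."""
    lines = [line.rstrip("\n") for line in stream if line.strip()]
    if len(lines) < 2:
        raise ValueError("expected at least a host name and an address line")
    trap = {"hostname": lines[0], "address": lines[1]}
    for line in lines[2:]:
        oid, _, value = line.partition(" ")
        # Strip surrounding whitespace and the quotes snmptrapd puts around strings.
        trap[short_name(oid)] = value.strip().strip('"')
    return trap


# Hypothetical sample input (placeholder names and documentation addresses).
SAMPLE = (
    "nms.example.net\n"
    "UDP: [192.0.2.10]:162->[192.0.2.20]:162\n"
    'EXAMPLE-MIB::friendlyName.0 "gen_mil_CERN"\n'
    "EXAMPLE-MIB::perceivedSeverity.0 critical\n"
)
sample_trap = parse_traphandle_input(io.StringIO(SAMPLE))
```

In a real handler the function would be called with `sys.stdin` instead of the in-memory sample.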

11 Alarm type & information
Alarm Raise:
– friendlyName
– probableCause
– perceivedSeverity
– currentAlarmId
– eventTime
– acknowledgementStatus
– additionalInformation
– eventType
– snmpTrapAddress
Alarm Clear:
– friendlyName
– probableCause
– currentAlarmId
– eventTime
– snmpTrapAddress

12 Used alarm information
Alarm Raise:
– friendlyName
– probableCause
– perceivedSeverity
– currentAlarmId
– eventTime
– acknowledgementStatus
– additionalInformation
– eventType
– snmpTrapAddress
Alarm Clear:
– friendlyName
– probableCause
– currentAlarmId
– eventTime
– snmpTrapAddress

13 Alarm analyzer process
[Flowchart] SNMP trap received; snmpTrapAddress must be registered; check for type of alarm.
Raise: check additional info (path, clientpath, ochtrail, omstrail, physicallink); record alarm; call external program.
Clear: look up alarmID; read recorded alarm; call external program; delete recorded alarm.
All traps are recorded.
Also labelled in the flowchart: friendlyName.
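One possible reading of the flowchart on slide 13 is sketched below in Python (the slides say the real analyzer is written in bash). Several points are assumptions made for illustration: how a raise is told apart from a clear (here an explicit `is_raise` flag), the substring match on `additionalInformation`, the in-memory `records` dictionary standing in for the alarm record, and the `notify` callback standing in for the external program.

```python
from typing import Callable, Dict, List, Set

# Object classes named on the slide as relevant additional information.
RELEVANT_CLASSES = ("path", "clientpath", "ochtrail", "omstrail", "physicallink")


class AlarmAnalyzer:
    def __init__(self, registered_sources: Set[str],
                 notify: Callable[[str, str], None]) -> None:
        self.registered_sources = registered_sources
        self.notify = notify                 # stands in for the external program
        self.records: Dict[str, str] = {}    # currentAlarmId -> friendlyName
        self.log: List[Dict[str, str]] = []  # every trap is recorded

    def handle(self, trap: Dict[str, str], is_raise: bool) -> bool:
        """Process one trap; return True if the external program was called."""
        self.log.append(trap)
        if trap.get("snmpTrapAddress") not in self.registered_sources:
            return False
        alarm_id = trap["currentAlarmId"]
        if is_raise:
            info = trap.get("additionalInformation", "").lower()
            if not any(cls in info for cls in RELEVANT_CLASSES):
                return False
            self.records[alarm_id] = trap["friendlyName"]
            self.notify(trap["friendlyName"], "DOWN")
            return True
        # Clear: act only if the matching raise was recorded, then delete it.
        name = self.records.pop(alarm_id, None)
        if name is None:
            return False
        self.notify(name, "UP")
        return True
```

Keeping the record keyed by `currentAlarmId` lets a clear be matched to its raise even if other traps arrive in between.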

14 Alarm analyzer
Called every time a trap is received
Written in bash
Each trap is analyzed separately; if a new trap arrives in the meantime, it waits in the snmptrapd queue
– Possible problem: if an external program gets stuck, the script hangs and the alarms remain unprocessed in the queue
Must maintain state
– SNMP traps may get lost, so a program needs to check from time to time whether the monitoring station is in sync with the NMS
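The periodic sync check mentioned on slide 14 could look roughly like the following. This is a minimal sketch: how the list of currently active alarm IDs is obtained from the NMS is not described in the slides, so it is simply passed in as `nms_active_ids`, and `local_records` is an assumed mapping from alarm ID to path name held by the monitoring station.

```python
from typing import Dict, Iterable, Set, Tuple


def reconcile(local_records: Dict[str, str],
              nms_active_ids: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Compare locally recorded alarms with the alarms the NMS reports as active.

    Returns (missed_raises, missed_clears):
      missed_raises: active on the NMS but unknown locally (a raise trap was lost)
      missed_clears: recorded locally but no longer active (a clear trap was lost)
    """
    active = set(nms_active_ids)
    local = set(local_records)
    return active - local, local - active
```

A non-empty result in either set means the monitoring station has drifted from the NMS and its state should be corrected.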

15 XML file generation

16 E2E data transformation
Prototype applications developed in Java:
– E2EXMLWriter
– XMLGenerator
E2EXMLWriter performs 2 functions:
– Takes in a template XML and produces an XML file containing live e2e path status information conforming to the JRA4 e2e data model
– Feeds a perfSONAR MA with live path status information
E2EXMLWriter is triggered by a script listening to SNMP alarms
– Parameters passed: Trail ID, Status
XMLGenerator produces the template XML that E2EXMLWriter uses to export the domain's e2e information

17 Design of E2EXMLWriter
Relies on 2 configuration files to produce live XML status information:
– Properties file (links.properties)
    Contains key = value entries
    Each key is one e2e path name
    The value of each key is a CSV list of the trails that form one Domain Link and/or Partial ID Link
    Currently manually maintained
– Alarm register
    A simple CSV file
    Application maintained
    An “alarm raise” registers the associated path
    An “alarm clear” de-registers the associated path
(contd.)
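To make the two files on slide 17 concrete, here is a small Python sketch (the actual E2EXMLWriter is a Java prototype whose code is not shown in the slides). The trail names `trail-a`, `trail-b`, `trail-c` are invented, the pairing of path names with trails is illustrative only, and the register layout of one path name per CSV row is an assumption.

```python
import csv
import io
from typing import Dict, List, Set


def parse_links_properties(text: str) -> Dict[str, List[str]]:
    """Read 'path = trail1, trail2' entries; skip blank lines and # comments."""
    links: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            links[key.strip()] = [t.strip() for t in value.split(",") if t.strip()]
    return links


def update_register(register: Set[str], links: Dict[str, List[str]],
                    trail_id: str, status: str) -> Set[str]:
    """Register affected paths on DOWN, de-register them on any other status."""
    affected = {path for path, trails in links.items() if trail_id in trails}
    return register | affected if status == "DOWN" else register - affected


def dump_register(register: Set[str]) -> str:
    """Serialise the register as CSV, one path name per row (assumed layout)."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for path in sorted(register):
        writer.writerow([path])
    return buf.getvalue()


def load_register(text: str) -> Set[str]:
    return {row[0] for row in csv.reader(io.StringIO(text)) if row}


# Hypothetical links.properties content.
SAMPLE_LINKS = "CERN-SARA-LHC-001 = trail-a, trail-b\nBOL-CERN-LHC-001 = trail-c\n"
links = parse_links_properties(SAMPLE_LINKS)
register = update_register(set(), links, "trail-b", "DOWN")
```

Like the rule on the slide, this sketch de-registers a path on the first clear, so it does not track several simultaneous trail failures on one path.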

18 Design (contd.)
The application sets every path's default status to UP with admin state NORMALOPERATION
Only the paths “registered” in the alarm-register CSV file are set to DOWN with admin state MAINTENANCE
The status DEGRADED is not implemented at the moment
Other admin states are not implemented at the moment
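The status rule on slide 18 is simple enough to express in a few lines. The sketch below is in Python rather than the prototype's Java, and the function and constant names are invented for illustration; only the four state values come from the slide.

```python
from typing import Dict, Iterable, Set, Tuple

DEFAULT_STATE = ("UP", "NORMALOPERATION")   # (operational status, admin state)
ALARMED_STATE = ("DOWN", "MAINTENANCE")


def path_states(all_paths: Iterable[str],
                alarm_register: Set[str]) -> Dict[str, Tuple[str, str]]:
    """Every path defaults to UP; paths in the alarm register are DOWN."""
    return {
        path: ALARMED_STATE if path in alarm_register else DEFAULT_STATE
        for path in all_paths
    }


states = path_states(["BOL-CERN-LHC-001", "CERN-SARA-LHC-001"],
                     {"CERN-SARA-LHC-001"})
```

A path named in the register but absent from `all_paths` is ignored, which keeps the output limited to the configured paths.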

19 Design of XMLGenerator
Relies on 3 configuration files:
– Properties file (init.properties)
    Contains a key = value entry
    Key = DOMAIN
    Value = the domain name, which enables on-the-fly domain name configuration
– Config file (config.csv)
    A simple CSV file
    Contains node-link-node information
– A sample XML file containing “pieces of XML” to be replicated for each node and link in the final output “template XML”
All configuration files are currently manually maintained
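The generation step on slide 19 might be sketched as follows. Note what is swapped in: instead of replicating pieces of a sample XML file, this Python sketch builds elements with `xml.etree.ElementTree`, and the element and attribute names (`domain`, `node`, `link`, `id`, `from`, `to`) are placeholders, not the JRA4 e2e data model. The column order node, link, node in `config.csv` is also an assumption.

```python
import csv
import io
import xml.etree.ElementTree as ET


def build_template(domain: str, config_csv: str) -> ET.Element:
    """Create one <node> per distinct node and one <link> per CSV row."""
    root = ET.Element("domain", {"name": domain})
    seen_nodes = set()
    for row in csv.reader(io.StringIO(config_csv)):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        node_a, link, node_b = (cell.strip() for cell in row)
        for node in (node_a, node_b):
            if node not in seen_nodes:
                seen_nodes.add(node)
                ET.SubElement(root, "node", {"id": node})
        ET.SubElement(root, "link", {"id": link, "from": node_a, "to": node_b})
    return root


# Hypothetical config.csv content: node, link, node.
SAMPLE_CONFIG = "GEN,GEN-MIL,MIL\nAMS,AMS-FRA,FRA\n"
template = build_template("GEANT2", SAMPLE_CONFIG)
template_xml = ET.tostring(template, encoding="unicode")
```

Passing the domain name as a parameter mirrors the role of the DOMAIN entry in init.properties.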

20 Monitoring data processing “e2e path”

21 LHC weather-map live demonstration
1. CERN user-side down
2. CERN user-side up
3. GEN-MIL lambda down
4. GARR user-side down
5. Back-to-back interconnection in DE broken
6. AMS-FRA lambda down
7. DE interconnection up
8. AMS-FRA lambda up
9. GARR user-side up
10. GEN-MIL lambda up

22 Conclusion
Status monitoring via SNMP alarms is in an advanced phase and well understood.
– Once the characteristics of the equipment, alarms and faults were understood, the development was easy.
XMLGenerator is not bound to specific equipment and can be used together with the JRA-4 MP and/or to feed a perfSONAR MA.

23 Questions?
Otto.Kreiter@dante.org.uk
Navneet.Daga@dante.org.uk

24 T0-T1 CERN-CNAF
[Diagram labels: CERN (CH), GÉANT2, GARR, CNAF (IT)]

25 Technologies

26 Domain I – CERN
Partial ID Link corresponds to the status of the port
MP developed by Martin Swany exports port status information

27 Domain II – GÉANT2
Partial ID Link: status of the ports facing the adjacent domains
Domain Link: status of the lambda
perfSONAR MA and GN2-JRA4 MP used to export status information

28 Domain III – GARR
Inter Domain Link: status of the port facing GÉANT2
Domain Link: status of the LSP between the two routers + status of the interface facing CNAF (T1)
GN2-JRA4 MP used to export measurement data
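Slide 28 describes the GARR Domain Link as the status of the LSP plus the status of the interface facing CNAF. One plausible reading, assumed here and not confirmed by the slides, is that the link counts as UP only when every component is UP. The function name and the two-value status set are illustrative.

```python
from typing import Iterable


def combine_status(component_statuses: Iterable[str]) -> str:
    """Return UP only if every component is UP, otherwise DOWN (assumed rule)."""
    statuses = list(component_statuses)
    if not statuses:
        raise ValueError("at least one component status is required")
    return "UP" if all(s == "UP" for s in statuses) else "DOWN"


# Domain Link built from the LSP status and the CNAF-facing interface status.
garr_domain_link = combine_status(["UP", "DOWN"])
```

The same helper would apply to any link whose status is derived from several monitored components.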

29 View on the E2E monitoring system

30 Conclusion
Fairly easy to establish the monitoring of the E2E path.
– It took around two phone conferences with GARR + around 10 e-mails
– 3-4 phone conferences with CERN and Martin Swany + around 10-15 e-mails
– All parties were extremely familiar with their equipment and the required software.
Questions started to pop up about whether we need to monitor an End-Point (EP) and how we should do it:
– Is an EP a simple client?
– Or shall we redefine the “Client” as somebody who actively participates in the e2e monitoring?

31 Backup

32 CERN user side down

33 Lambda CH-IT down

34 Lambda and user failure in IT

35 Lambda + POP interconnect failure

36 Multiple lambda, user and POP interconnect failure

