Nagios – Our Open Source Network Management Solution

1 Nagios – Our Open Source Network Management Solution
Presenter: Ling Zhang, LBLnet Services Group, Information Technologies and Services Division, LBNL

2 Contributors
Nagios software design and development:
- Ethan Galstad
System integration, configuration, testing:
- Ling Zhang, Greg Bell, Harper Mann, Cedric Hui, Clark Wood, Mike Bennett
18 September 2018 ITSD/LBNL

3 Goals for this talk
To explain:
- LBLnet’s point of view of a Network Management System (NMS)
- the network monitoring problems we encountered
- the design of our Nagios network monitoring system
To discuss:
- the benefits of the Nagios system
- our future development goals

4 Our point of view of a NMS
- Proactive network management
- Alarm panel
- Connectivity
- Performance
- Fault isolation
- Trend analysis
- Capacity planning
- Notification
  - Precise
  - Fast

5 Background Information
Network monitoring tools we have tested and/or used before:
- Sun Net Manager
- Spectrum
- WhatsUp Gold
- Netmon
- SNMPc
- IPmonitor
- HP OpenView
- OpenNMS
- InCharge
- Home-grown scripts
- MRTG/RRDtool
- etc.

6 Background Information
Our fair share of problems with NMS:
- Notification storms: 65 notifications were received during a single router up/down event. The router has 20 active interfaces and 32 downstream monitored devices.
- False alarms
- Integration with existing systems (MRTG, trouble ticket system)
- Tech support: our longest outstanding ticket has been open for 2 years and counting
- Budget

7 In Search of a Better NMS
- Accurate and efficient fault detection
- Good performance
- Extensible
- Can be integrated with our existing system
- Low maintenance
- Fits our budget

8 Features of Nagios
- Open source system that runs on most Unix systems
- Highly extensible
- Reliable dependency monitoring
- Excellent service monitoring capabilities
- Ability to schedule maintenance periods
- Flexible notification

9 Our Nagios Topology
[Diagram: LBLnet NMS]

10 Nagios Extensibility
- Plugins
- Event handlers
- External commands

11 Nagios Extensibility - Plugins
- Compiled executables or scripts (Perl, shell, etc.)
- Run by the Nagios process
- Check device or service status

Example:

    define host {
        host_name        switch1
        address
        check_command    ping_switch
    }

    define service {
        host_name              switch1
        service_description    CPU Util
        check_command          get_cpu_util
    }
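To make the plugin contract concrete, here is a minimal sketch of a check such as `get_cpu_util` might wrap. The exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) follow the standard Nagios plugin convention; the function names, the thresholds, and the idea of passing the reading as an argument are assumptions made for illustration, not the plugin used at LBLnet.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_cpu(util_percent, warn=80.0, crit=95.0):
    """Map a CPU utilisation reading to (exit_code, status_line).

    The thresholds are illustrative defaults, not values from the talk.
    """
    if util_percent is None or not 0.0 <= util_percent <= 100.0:
        return UNKNOWN, "CPU UNKNOWN - no valid reading"
    if util_percent >= crit:
        return CRITICAL, f"CPU CRITICAL - {util_percent:.1f}% used"
    if util_percent >= warn:
        return WARNING, f"CPU WARNING - {util_percent:.1f}% used"
    return OK, f"CPU OK - {util_percent:.1f}% used"


def main(argv):
    """Hypothetical entry point: a real plugin would poll the device itself."""
    try:
        reading = float(argv[1])
    except (IndexError, ValueError):
        reading = None
    code, line = evaluate_cpu(reading)
    print(line)   # Nagios shows the first line of output as the status text
    return code   # the caller passes this to sys.exit()
```

A wrapper script would finish with `sys.exit(main(sys.argv))`, since Nagios reads the status from the process exit code.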

12 Services Monitored by Nagios
Nagios uses plugins to check service status:
- DHCP
- DNS
- FTP
- HTTP
- HTTPS
- IMAP
- NTP
- Radius
- SMTP
- SQL
- TFTP
- WINS
- etc.

13 Nagios Extensibility – Event Handlers
- Compiled executables or scripts
- Run by the Nagios process
- Triggered by a host or service status change

Example:

    define service {
        host_name              somehost
        service_description    HTTP
        max_check_attempts     4
        check_command          check_http
        event_handler          restart-httpd
        ...other service variables...
    }
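The decision an event handler like `restart-httpd` has to make can be sketched as follows. This assumes the handler's command definition passes the `$SERVICESTATE$`, `$SERVICESTATETYPE$`, and `$SERVICEATTEMPT$` macros as arguments; the policy of restarting on the third soft failure (one check before `max_check_attempts 4` is reached) or on a hard failure is an illustrative choice, and `restart` is a caller-supplied placeholder.

```python
def should_restart(state, state_type, attempt, soft_attempt=3):
    """Return True when the handler should try to restart the service."""
    if state != "CRITICAL":
        return False          # OK, WARNING and UNKNOWN need no action here
    if state_type == "HARD":
        return True           # retries exhausted, the problem is confirmed
    # Act early on one chosen soft retry so the service may recover
    # before Nagios reaches a hard state and notifies anyone.
    return state_type == "SOFT" and attempt == soft_attempt


def handle_event(state, state_type, attempt, restart):
    """Call restart() when a restart is due; report whether it was called."""
    if should_restart(state, state_type, int(attempt)):
        restart()
        return True
    return False
```

Keeping the decision separate from the restart action makes the policy easy to exercise without touching a real web server.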

14 Nagios Extensibility – External Commands
- A predefined set of commands issued externally to control the behavior of Nagios
- Control notification, monitor scheduling, and program start/stop
- Issued by external applications (CGI, snmptrapd, etc.)
- Read in by the Nagios core process at run time

Example:
1. A user disables monitoring of switch1 from the web interface.
2. The CGI writes the command “disable monitor switch1” to the command file.
3. The Nagios process reads this command and stops scheduling monitoring for switch1.
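The command written in step 2 is paraphrased on the slide. As a rough sketch, Nagios external commands are single lines of the form `[timestamp] COMMAND;arg1;arg2`, and `DISABLE_HOST_CHECK` is assumed here to be the command that corresponds to "disable monitor". The helper names and the command file path passed to `submit` are hypothetical.

```python
import time


def format_external_command(command, args=(), now=None):
    """Build one external-command line: "[epoch] COMMAND;arg1;arg2"."""
    ts = int(time.time() if now is None else now)
    fields = ";".join([command, *map(str, args)])
    return f"[{ts}] {fields}\n"


def submit(command_file, line):
    """Write the line to the Nagios command file (a named pipe).

    Opening the pipe blocks until the Nagios process has it open for reading.
    """
    with open(command_file, "w") as fifo:
        fifo.write(line)
```

For the slide's scenario, `format_external_command("DISABLE_HOST_CHECK", ["switch1"])` produces the line a CGI would hand to `submit`.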

15 Monitoring Network Devices
- Ping
  - Measures system responsiveness via average RTT
- SNMP get
  - CPU
  - Temperature
  - Interface/port status
  - System up time
  - Power supply status
  - Throughput
  - Packet discard rate
  - etc.
- SNMP trap
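Throughput is not read directly over SNMP; it is usually derived from two samples of an interface octet counter. A minimal sketch, assuming a 32-bit counter such as `ifInOctets` that wraps at 2^32 and at most one wrap between samples (function names are illustrative):

```python
def counter_delta(prev, curr, bits=32):
    """Difference between two counter samples, allowing for a single wrap."""
    modulus = 1 << bits
    return (curr - prev) % modulus


def throughput_bps(prev_octets, curr_octets, interval_s, bits=32):
    """Average bits per second between two octet-counter samples."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return counter_delta(prev_octets, curr_octets, bits) * 8 / interval_s
```

On fast links a 32-bit counter can wrap more than once between polls, which this sketch cannot detect; shorter polling intervals or 64-bit counters avoid that.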

16 Nagios Trap handling
- Requires Net-SNMP or another trap receiver daemon
- The trap receiver notifies Nagios about received traps via external commands
- Nagios calls event handlers and/or notifies the user
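One way a trap receiver can hand a trap to Nagios is as a passive service check result, using the `PROCESS_SERVICE_CHECK_RESULT` external command. The sketch below assumes that approach; the trap-to-state mapping, the service description, and the function name are illustrative and not taken from the LBLnet configuration.

```python
import time

# Illustrative mapping from generic SNMP trap names to plugin return codes.
TRAP_STATES = {"linkDown": 2, "linkUp": 0}   # 2 = CRITICAL, 0 = OK


def trap_to_command(host, service, trap_name, detail="", now=None):
    """Turn a received trap into one passive-check external command line."""
    ts = int(time.time() if now is None else now)
    code = TRAP_STATES.get(trap_name, 3)     # unmapped traps become UNKNOWN
    # Keep the output on a single line: one command per line in the file.
    output = f"{trap_name} {detail}".strip().replace("\n", " ")
    return f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{code};{output}\n"
```

The returned line would be written to the Nagios command file by the trap handler script; Nagios then treats it like any other check result, so event handlers and notifications apply.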

17 Dependency Configuration
    define host {
        use          switch-tmpl
        host_name    switch1
        address
        parents      router1
    }

    define host {
        host_name    switch2
        address
        parents      switch1
    }

    define host {
        host_name    switch3
        address
    }

    define host {
        host_name    switch4
        address
        parents      switch2
    }

[Diagram]
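The `parents` directive is what lets Nagios tell a failed device from one that is merely cut off behind a failed device. A simplified sketch of that rule, assuming an acyclic topology: a host that fails its check is DOWN if it has no parents or at least one parent is UP, and UNREACHABLE otherwise. The function name and data layout are illustrative.

```python
def host_states(parents, failing):
    """Classify every host as UP, DOWN or UNREACHABLE.

    parents: dict mapping host -> list of parent hosts
    failing: set of hosts whose own check failed
    """
    states = {}

    def state(host):
        if host not in states:
            if host not in failing:
                states[host] = "UP"
            else:
                parent_up = [state(p) == "UP" for p in parents.get(host, [])]
                # No parents, or some parent still up: this host is the fault.
                reachable = not parent_up or any(parent_up)
                states[host] = "DOWN" if reachable else "UNREACHABLE"
        return states[host]

    all_hosts = set(parents) | set(failing)
    for plist in parents.values():
        all_hosts.update(plist)
    for host in all_hosts:
        state(host)
    return states


# Topology from the slide (switch3 has no parent in the transcript).
parents = {"switch1": ["router1"], "switch2": ["switch1"], "switch4": ["switch2"]}
result = host_states(parents, failing={"switch1", "switch2", "switch4"})
root_causes = sorted(h for h, s in result.items() if s == "DOWN")
```

With switch1, switch2, and switch4 all failing their checks, only switch1 is classified DOWN, so a notification policy that skips UNREACHABLE hosts reports one problem instead of three.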

18 Nagios Notification
- Similar to event handlers
- Triggered by a host/service status change
- Calls third-party notification tools (sendmail, qpage, etc.)
- Supports email, page, instant messaging, etc.

19 Nagios Notification format
    Subject: switch3 ( ) DOWN
    Host: switch3
    Address:
    Date/Time: Thu Jul 15 14:03:37 PDT 2004
    Additional Info: (No Information Returned From Host Check)

Page:

    DOWN switch3( )

20 Maintenance Scheduling
- Schedule a maintenance window via the Nagios web interface
- Uses external commands
- Fixed window
- Flexible (floating) window
- Dependency aware

21 Monitoring Subnet with Redundant Network Connections
Challenge:
- Monitoring interface status
- Monitoring standby status at the same time
Solution:
- Monitor interface up/down status via ping
- Monitor HSRP status via the HSRP MIB

22 Performance of Nagios
- False alarms
  - False positive
  - False negative
  - Unnecessary
- Notification delay
  - Before: 303 sec
  - After: 221 sec

23 Money and Time Saved
- Software package cost
  - InCharge ($$$)
  - IPmonitor ($1500)
  - Nagios ($0)
- Software maintenance contract cost
  - InCharge (>$15,000)
  - IPmonitor ($500)
- Time saved from fewer unnecessary alarms (compared to IPmonitor)
  - 20 man-hours/month

24 Future development of Nagios
- Performance monitoring
  - Network element out of resources
  - Interface buffer drops
  - Duplex mismatch
  - Has to be done by inference
    - Assume heterogeneous network equipment
    - No use of host SNMP
    - Derive from a combination of interface error types and rates
- Integrating with other NMS elements
  - Syslog
  - MRTG/RRDtool
  - Trouble ticket system
  - Database
- Topology discovery

25 Conclusion
Nagios fits our network management needs because of:
- Accurate and efficient fault detection
- Extensibility
- Easy integration with our existing system
- Low maintenance
- Fits our budget

26 Thanks!
We are happy to share.
Questions / comments: send to

