Nagios FTW TriLUG 8/10/06 Presented by: Jason Faulkner Ian Kilgore.


1 Nagios FTW TriLUG 8/10/06 Presented by: Jason Faulkner Ian Kilgore

2 What is Nagios?
● A network monitor for small to medium size networks
● Flexible/Featureful
  – Pluggable checks/notifications
  – Host/Service dependency support
  – Escalation framework
  – Web interface
  – Remote checks
● GPL software (yay! free!)

3 Basic Nagios Information ● (go to web interface)

4 Access Control
● The Nagios web interface allows you to specify fine-grained access controls
● It uses the built-in Apache auth method. In cgi.cfg you specify which users should have access to what.
● Say, for instance, you have a boss who wants to view the status of all services, but you don't want him breaking things. Easy! Edit cgi.cfg to give him read access to just a few things.

5 Nagios 1.3 vs. Nagios 2.0
● Bugfixes
● No more hostgroup escalations
● Passive host checks
● Service groups
Conclusion: Nagios 2.0 is definitely an improvement over Nagios 1.3, but the basic syntax remains the same. We will be showing off Nagios 1.3 tonight; consult your local man page for the subtle config differences.

6 Installing Nagios
● Distro packages available for most major Linux distributions
  – RHEL/CentOS
  – Fedora
  – Debian
  – Ubuntu
  – Gentoo
● What do you need to run Nagios?
  – Web server (Apache)
  – *nix server
  – (more required for plugins)

7 Basic Configuration
● Most distros organize configuration in a way that is difficult to scale. Feel free to use it, but be prepared to deal with hundred-line, maybe even thousand-line, configuration files.
● Is there a better way?
● YES! There is!

8 Super Secret Broadwick Nagios Configuration(TM)
● We separate config files into two basic groups:
  – Host/Service definitions
  – Everything else
● cfg_dir is your friend
  – Put all host/service definitions in /etc/nagios/hosts/
  – Put other stuff (except cgi.cfg & resource.cfg) in /etc/nagios/conf.d/
● In /etc/nagios/hosts there is one file for each host, which contains the relevant host and service defs

9 Why templates are good
● Nagios has a ton of individual options for a service/host
  – Check interval
  – Notification groups
  – Parents
● Use templates to avoid duplication

10 Basic Configuration Example Setup
● Download your own example config:
  – http://trilug.oldos.org/nagios_cfg
  – http://trilug.oldos.org/nagios_cfg.tar.gz

11 Host Template (in conf.d/templates.cfg)
    define host{
        name                    critical-host
        notifications_enabled   1
        register                0
        check_command           check-host-alive
        max_check_attempts      5
        notification_interval   60
        notification_options    d,r
    }

12 Host Definition (in hosts/webserver.cfg)
    define host{
        use         critical-host
        host_name   webserver
        alias       Main Web Server
        address     webserver.foo.com
    }
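
To see what `use critical-host` buys you, here is a minimal Python sketch of how template inheritance plays out for the two definitions above. Nagios does this resolution internally; the `resolve` function and the dictionary layout are only illustrative assumptions, not Nagios code.

```python
# Illustration only: mimics the idea of Nagios template inheritance.
# Directives that belong to the template mechanism itself and are not inherited.
TEMPLATE_ONLY = ("name", "register", "use")


def resolve(obj, templates):
    """Return the effective directives of obj after following its `use` chain."""
    merged = {}
    parent = obj.get("use")
    if parent is not None:
        # Inherit everything the template (and its own templates) provides.
        merged.update(resolve(templates[parent], templates))
    # Local directives win over inherited ones.
    merged.update({k: v for k, v in obj.items() if k not in TEMPLATE_ONLY})
    return merged


# The host template from conf.d/templates.cfg.
templates = {
    "critical-host": {
        "name": "critical-host",
        "notifications_enabled": "1",
        "register": "0",
        "check_command": "check-host-alive",
        "max_check_attempts": "5",
        "notification_interval": "60",
        "notification_options": "d,r",
    }
}

# The host definition from hosts/webserver.cfg.
webserver = {
    "use": "critical-host",
    "host_name": "webserver",
    "alias": "Main Web Server",
    "address": "webserver.foo.com",
}

effective = resolve(webserver, templates)
for key, value in effective.items():
    print(f"{key:24}{value}")
```

The four-line host definition ends up with eight effective directives, which is the duplication the template saves in every per-host file.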

13 Service Template (in conf.d/templates.cfg)
    define service{
        name                    critical-service
        notifications_enabled   1
        max_check_attempts      4
        normal_check_interval   5
        retry_check_interval    1
        contact_groups          sysadmin
        notification_interval   60
        notification_period     24x7
        notification_options    w,c,r
        register                0
    }

14 Command Definition (in conf.d/checkcommands.cfg)
    define command{
        command_name    check_http
        command_line    /usr/lib/nagios/plugins/check_http -H $HOSTADDRESS$ -S
    }

15 Service Definition (in hosts/webserver.cfg)
    define service {
        use                   critical-service
        host_name             webserver
        service_description   HTTP
        check_command         check_http
    }

16 Remote Checks
● Check by SSH?
  – Only if you want to deploy SSH keys and leave yourself wide open to attack. We dub it a Bad Idea(tm).
● statd?
  – We've never used it.
● NRPE
  – Pluggable
  – Secure

17 NRPE
● How does it work?
  – The Nagios server connects to the NRPE daemon on the monitored host and gives it a command. Commands are mapped to plugins in the NRPE configuration.
● Security
  – All NRPE transmissions use SSL, and you must specify the hosts that are allowed to connect.
  – Do not enable command arguments. They are substituted into the command line. (bad)

18 Basic NRPE setup
● On the NRPE server, i.e. the monitored host (nrpe.cfg)
    allowed_hosts=1.2.3.4
    command[check_users]=/usr/lib/nagios/plugins/check_users -w 5 -c 10
● On the Nagios host (hosts/webserver.cfg)
    define service {
        use                   critical-service
        host_name             webserver
        service_description   USERS
        check_command         check_nrpe_plain!check_users
    }

19 Advanced Configuration
● Dependencies
  – Because getting paged 100 times if the network goes out sucks
● Escalations
  – Because we don't always wake up when our pager goes off
● Custom plugins
  – For those of us with special needs

20 Advanced Configuration Scripts
● Service dependencies and escalations are set up on a service-by-service level
  – If you have a lot of services, this sucks
● Script it!
● Find the scripts at:
  – http://trilug.oldos.org/scripts/
  – http://trilug.oldos.org/scripts.tar.gz

21 Service Dependency Script
● Makes all services on a host (say, webserver) depend on webserver's ping service
● Any host dependencies you have set up will carry over
  – Host webserver depends on host router; therefore, services on webserver will depend on router's ping service
● http://trilug.oldos.org/scripts/servicedeps.pl

22 Service Dependency Script (cont)
● Usage is simple
  – Reads host and service info from STDIN
  – Outputs a properly formatted servicedeps.cfg to STDOUT
  – cat /etc/nagios/hosts/* | ./servicedeps.pl > /etc/nagios/conf.d/servicedeps.cfg
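
If you'd rather see the idea than read Perl, here is a rough Python sketch of what such a generator might do. This is not servicedeps.pl: it assumes the ping service is named `PING`, that host dependencies are expressed through the `parents` directive, and that only direct parents matter; the sample hosts and the failure criteria are placeholders.

```python
import re

# Matches one `define <type> { ... }` object definition.
DEFINE_RE = re.compile(r"define\s+(\w+)\s*\{(.*?)\}", re.DOTALL)


def parse_objects(text):
    """Yield (type, directives) for every `define type { ... }` block."""
    for obj_type, body in DEFINE_RE.findall(text):
        directives = {}
        for line in body.splitlines():
            parts = line.split(";", 1)[0].split(None, 1)  # drop ';' comments
            if len(parts) == 2 and not parts[0].startswith("#"):
                directives[parts[0]] = parts[1].strip()
        yield obj_type, directives


def service_dependencies(text, ping="PING"):
    """Return (master_host, dependent_host, dependent_service) tuples."""
    parents, services = {}, []
    for obj_type, d in parse_objects(text):
        if d.get("register") == "0" or "host_name" not in d:
            continue  # templates describe no concrete host
        if obj_type == "host":
            raw = d.get("parents", "")
            parents[d["host_name"]] = [p.strip() for p in raw.split(",") if p.strip()]
        elif obj_type == "service" and "service_description" in d:
            services.append((d["host_name"], d["service_description"]))
    pingable = {host for host, desc in services if desc == ping}
    deps = []
    for host, desc in services:
        if desc == ping:
            continue  # the ping service is the master, not a dependent
        for master in [host] + parents.get(host, []):
            if master in pingable:  # only reference services that exist
                deps.append((master, host, desc))
    return deps


def render(deps, ping="PING"):
    """Format the tuples as servicedependency object definitions."""
    blocks = []
    for master, host, desc in deps:
        blocks.append(
            "define servicedependency{\n"
            f"    host_name                      {master}\n"
            f"    service_description            {ping}\n"
            f"    dependent_host_name            {host}\n"
            f"    dependent_service_description  {desc}\n"
            "    notification_failure_criteria  u,c\n"  # assumed criteria
            "}\n"
        )
    return "\n".join(blocks)


# Hypothetical input; a real run would use sys.stdin.read() instead.
SAMPLE = """
define host{
    use        critical-host
    host_name  router
}
define host{
    use        critical-host
    host_name  webserver
    parents    router
}
define service{
    use                  critical-service
    host_name            router
    service_description  PING
    check_command        check_ping
}
define service{
    use                  critical-service
    host_name            webserver
    service_description  PING
    check_command        check_ping
}
define service{
    use                  critical-service
    host_name            webserver
    service_description  HTTP
    check_command        check_http
}
"""

print(render(service_dependencies(SAMPLE)))
```

For the sample, HTTP on webserver ends up depending on the ping service of both webserver and router, so a dead router does not page you about the web server as well.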

23 Escalations Script
● Modify the template to your liking
● The script plugs host and service names into the template
● Downside: Requires you to actually answer pages so your boss doesn't get paged.
● http://trilug.oldos.org/scripts/escalations.pl

24 Escalations Script (cont)
● Usage is simple
  – Reads host and service info from STDIN
  – Outputs a properly formatted escalations.cfg to STDOUT
  – cat /etc/nagios/hosts/* | ./escalations.pl > /etc/nagios/conf.d/escalations.cfg
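
The template-plus-names approach can be sketched in a few lines of Python. Again, this is not escalations.pl: the `managers` contact group, the notification numbers, and the helper names are all made-up placeholders you would change to your liking.

```python
import re
from string import Template

# Hypothetical escalation template: from the 3rd notification on, also page
# the (assumed) `managers` contact group every 30 minutes.
ESCALATION = Template("""\
define serviceescalation{
    host_name              $host
    service_description    $service
    first_notification     3
    last_notification      0
    contact_groups         managers
    notification_interval  30
}
""")

# Matches `define service{ ... }` but not `define servicedependency{ ... }`.
SERVICE_RE = re.compile(r"define\s+service\s*\{(.*?)\}", re.DOTALL)


def find_services(text):
    """Return (host_name, service_description) for each registered service."""
    found = []
    for body in SERVICE_RE.findall(text):
        d = {}
        for line in body.splitlines():
            parts = line.split(None, 1)
            if len(parts) == 2:
                d[parts[0]] = parts[1].strip()
        if d.get("register") == "0":
            continue  # skip service templates
        if "host_name" in d and "service_description" in d:
            found.append((d["host_name"], d["service_description"]))
    return found


def escalations(text):
    """Plug every host/service pair into the escalation template."""
    return "\n".join(
        ESCALATION.substitute(host=host, service=service)
        for host, service in find_services(text)
    )


# The two webserver services from the earlier slides; a real run would read
# the host files from sys.stdin instead.
SAMPLE = """
define service {
    use                   critical-service
    host_name             webserver
    service_description   HTTP
    check_command         check_http
}
define service {
    use                   critical-service
    host_name             webserver
    service_description   USERS
    check_command         check_nrpe_plain!check_users
}
"""

print(escalations(SAMPLE))
```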

25 Custom Nagios Plugins
● Custom Nagios plugins are simply executable files on the Nagios host. They have to print a short line of status info and return an exit code
  – 0 = OK
  – 1 = Warning
  – 2 = Critical
  – >2 = Unknown
● Feel free to use our plugins as a template. The best way to write a plugin is to hack someone else's :)
  – http://trilug.oldos.org/nagios_plugins
  – http://trilug.oldos.org/nagios_plugins.tar.gz
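
As a starting point to hack on, here is a minimal plugin sketch in Python. It is not one of our plugins: the disk-usage check, the 80/90 percent thresholds, and the function names are assumptions chosen for illustration. It uses 3, the conventional code, for Unknown.

```python
import shutil
import sys

# Exit codes Nagios understands; 3 is the conventional UNKNOWN code.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(used_pct, warn, crit):
    """Map a usage percentage onto a Nagios exit code."""
    if used_pct >= crit:
        return CRITICAL
    if used_pct >= warn:
        return WARNING
    return OK


def check_disk(path, warn=80.0, crit=90.0):
    """Return (exit_code, status_line) for the filesystem holding `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        # Could not measure anything: report UNKNOWN rather than guess.
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - {path} reports zero size"
    used_pct = 100.0 * usage.used / usage.total
    code = evaluate(used_pct, warn, crit)
    return code, f"DISK {LABELS[code]} - {path} is {used_pct:.1f}% full"


def main(argv):
    """Print one status line and return the exit code for Nagios."""
    path = argv[0] if argv else "/"
    code, line = check_disk(path)
    print(line)
    return code


# As a real plugin, finish the file with:
#     sys.exit(main(sys.argv[1:]))
```

The status line is what shows up in the web interface and in notifications; the exit code is what Nagios actually acts on.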

26 Stuff we didn't cover
● Passive checks
  – Nagios will accept passive checks
  – We don't do this.
● Distributed monitoring
  – There are ways to make multiple Nagios servers talk to each other
  – We don't do this, either.

27 Alternatives to Nagios?
● OpenNMS
  – Written in java
● Big Brother
  – Written in bash
● Angry Customers calling the support department
  – Written in blood
● Interns with Cellphones and a terminal
  – Written in India

28 Resources
● Scripts
  – http://trilug.oldos.org/scripts/
  – http://trilug.oldos.org/scripts.tar.gz
● Example nagios configuration
  – http://trilug.oldos.org/nagios_cfg
  – http://trilug.oldos.org/nagios_cfg.tar.gz
● Custom Nagios plugins written by us
  – http://trilug.oldos.org/nagios_plugins/
  – http://trilug.oldos.org/nagios_plugins.tar.gz
● This presentation
  – http://trilug.oldos.org/nagios_presentation.odp
● Nagios Documentation
  – http://www.nagios.org/docs/

29 Questions?

