Monitoring Openstack – The Relationship Between Nagios and Ceilometer Konstantin Benz, Zurich University of Applied Sciences


1 Monitoring Openstack – The Relationship Between Nagios and Ceilometer Konstantin Benz, Zurich University of Applied Sciences

2 Introduction & Agenda About me Working at Zurich University of Applied Sciences OpenStack / Cloud Computing Engaged in monitoring and High Availability systems Currently working on a Europe-wide cloud federation: XIFI – eXtensible Infrastructure for Future Internet 17 nodes / OpenStack clouds Test environment for Future Internet (FI-WARE) applications Infrastructure for smart cities, public healthcare, traffic management… Europe-wide L2-connected backbone network Nagios as main monitoring tool of that project

3 Introduction & Agenda What are you talking about in this presentation? How to use Nagios to monitor an OpenStack cloud environment Integrate Nagios with OpenStack Anything else? Cloud monitoring requirements OpenStack cloud management software and Ceilometer Comparison between Nagios and Ceilometer: Technological paradigms Commonalities and differences How to integrate Nagios with Ceilometer Can't wait!

4 Cloud Monitoring Requirements Cloud ≈ virtualization + elasticity Types of clouds: IaaS: virtual VMs and network devices, elasticity in number/size of devices PaaS: virtual, elastically sized platform SaaS: software provided by employing virtual, elastic resources Cloud is a collection of virtual resources provided in physical infrastructure Cloud provides resources elastically

5 Cloud Monitoring Requirements Why should someone use clouds? Cloud consumer can outsource IT infrastructure No fixed costs for cloud consumer Pay for resource utilization Cloud provider responsible for building and maintaining physical infrastructure Cloud provider can rent out unused IT infrastructure Eliminate waste Get money back for overcapacity

6 Monitoring OpenStack OpenStack Architecture Open source cloud computing software Consists of multiple services: Keystone: OpenStack identity services (authentication, authorization, accounting) Cinder: management of block storage volumes Nova: management and provision of virtual resources (VM instances) Glance: management of VM images Swift: management of object storage Neutron: management of network resources (IPs, routing, connectivity) Horizon: GUI dashboard for end users Heat: orchestration of virtualized environments (important for providing elasticity) Ceilometer: monitoring of virtual resources

7 Monitoring OpenStack Things to monitor Operation of OpenStack itself: Services: Cinder, Glance, Nova, Swift... Infrastructure: Hardware, Operating System where OpenStack services are running Operation of virtual resources provided by OpenStack: Resource availability: VMs, virtual network devices Resource utilization: VM uptime, CPU / memory usage → Virtual resources are commonly monitored by Ceilometer → Ceilometer gathers data through the API of OpenStack services

8 Monitoring OpenStack Why is Ceilometer not enough? → Ceilometer monitors virtual resources through APIs of OpenStack components, BUT NOT operation of the OpenStack components

9 Comparison Nagios / Ceilometer Nagios operational model Configuration: Check interval (and retry interval) to poll system status and update frontend GUI Remote execution of monitoring clients (usually Nagios plugins) Thresholds that result in "Okay", "Warning", "Critical" status messages which are sent back to Nagios server (and "Unknown" if status not measurable) Main usage: Effective monitoring solution for physical servers System administration console that allows for fast reaction in case of problems Strength: extensibility and customizability Nagios must be extended in order to monitor virtual resources inside administrated systems
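The threshold model described on this slide can be illustrated with a minimal Python sketch. It assumes the standard Nagios plugin convention of exit codes 0 to 3; the function names `classify` and `report` and the sample values are hypothetical and only serve the illustration.

```python
# Exit codes of the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(value, warn, crit):
    """Map a measured value to a status code; None means "not measurable"."""
    if value is None:
        return UNKNOWN
    if value > crit:
        return CRITICAL
    if value > warn:
        return WARNING
    return OK


def report(name, value, warn, crit):
    """Return the one-line status text and the status code for one check."""
    code = classify(value, warn, crit)
    return "%s %s - value=%s" % (name.upper(), LABELS[code], value), code


# Hypothetical sample: 63.5 % CPU with thresholds 50 / 80.
line, code = report("cpu_util", 63.5, warn=50.0, crit=80.0)
print(line)  # a real plugin would finish with sys.exit(code)
```

A real plugin prints the status line and passes the code to `sys.exit`, because Nagios derives the state from the process exit status.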

10 Comparison Nagios / Ceilometer Ceilometer operational model Configuration: Polling services check metrics OpenStack objects generate event notifications automatically All events and metrics collected in a database Main usage: OpenStack integrated metrics collector and database Temporal database that can be used for rating, charging and billing of virtual resource utilization Strength: fully integrated in OpenStack, collecting most important metrics and storing their change history Weakness: Does not monitor physical hosts

11 Alternative 1: Ceilometer Plugin in Nagios Use Nagios server as frontend for Ceilometer: Nagios plugin that queries Ceilometer database Virtual resource utilization data collected by Ceilometer Nagios server responsible for monitoring non-virtual resources Benefits: Simple and easy to implement No extra Nagios plugins required to monitor virtual devices that are managed within OpenStack Ceilometer tool can be left unchanged Drawbacks: Monitoring data is stored at 2 different places: Nagios flat file and Ceilometer database Nagios / OpenStack Integration

12 Alternative 1: Ceilometer Plugin in Nagios
Implementation: Nagios plugin on the client that hosts the Ceilometer API (code sample below). Initialization with default values and OpenStack authentication:

    #!/bin/bash
    # initialization with default values
    SERVICE='cpu_util'
    THRESHOLD='50.0'
    CRITICAL_THRESHOLD='80.0'

    # get openstack token to access ceilometer-api
    export OS_USERNAME="youruser"
    export OS_TENANT_NAME="yourtenant"
    export OS_PASSWORD="yourpassword"
    export OS_AUTH_URL=http://yourkeystoneurl:35357/v2.0/

13 Alternative 1: Ceilometer Plugin in Nagios
The plugin should receive parameters for: the resource to be monitored (VM), the service (Ceilometer metric), the warning threshold and the critical threshold.

    # "r:" must be part of the option string, otherwise -r is never parsed
    while getopts ":hr:s:t:T:" opt
    do
        case $opt in
            h ) printusage;;
            r ) RESOURCE=${OPTARG};;
            s ) SERVICE=${OPTARG};;
            t ) THRESHOLD=${OPTARG};;
            T ) CRITICAL_THRESHOLD=${OPTARG};;
            ? ) printusage;;
        esac
    done
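For comparison, the same four parameters could be parsed in Python with `argparse`. This is only a sketch: the option letters and defaults come from the slides, while making `-r` mandatory and rejecting a critical threshold below the warning threshold are assumptions added here.

```python
import argparse


def parse_args(argv):
    """Parse the plugin options -r, -s, -t and -T (argparse adds -h itself)."""
    parser = argparse.ArgumentParser(description="Ceilometer check for Nagios")
    # Assumption: the resource has no sensible default, so it is required.
    parser.add_argument("-r", dest="resource", required=True,
                        help="resource (VM) to be monitored")
    parser.add_argument("-s", dest="service", default="cpu_util",
                        help="service (Ceilometer metric)")
    parser.add_argument("-t", dest="threshold", type=float, default=50.0,
                        help="warning threshold")
    parser.add_argument("-T", dest="critical_threshold", type=float, default=80.0,
                        help="critical threshold")
    args = parser.parse_args(argv)
    # Assumption: a critical threshold below the warning threshold is an error.
    if args.critical_threshold < args.threshold:
        parser.error("critical threshold must not be below the warning threshold")
    return args


args = parse_args(["-r", "myvm", "-t", "60"])
print(args.resource, args.service, args.threshold, args.critical_threshold)
```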

14 Alternative 1: Ceilometer Plugin in Nagios
Query the Nova API to get the resource (VM) to be monitored:

    RESOURCE=$(nova list | grep "$RESOURCE" | tail -2 | head -1 | awk -F '|' '{print $2}')
    RESOURCE=$(echo $RESOURCE)   # unquoted echo trims the surrounding whitespace

Query the metric on that resource (multiple entries are possible, which requires an iterator):

    ITERATOR=$(ceilometer meter-list -q "resource_id=$RESOURCE" | grep -w "$SERVICE" | awk 'END{print NR}')

Initialize with return code 0 (no warning or error):

    RETURNCODE=0
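The `grep`/`awk` pipelines pick columns out of the ASCII tables printed by the command-line clients. The following Python sketch shows the same idea in a more explicit form; the sample table and its column names are made up for illustration and are not real `nova list` output.

```python
def parse_table(text):
    """Parse a '+---+' bordered, '|' separated table into a list of dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines and blank lines
        # Drop the empty fields before the first and after the last '|'.
        cells = [cell.strip() for cell in line.split("|")[1:-1]]
        if header is None:
            header = cells  # the first '|' line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


# Hypothetical sample in the style of the OpenStack CLI tables.
SAMPLE = """
+------+-------+--------+
| ID   | Name  | Status |
+------+-------+--------+
| id-1 | web01 | ACTIVE |
| id-2 | db01  | ACTIVE |
+------+-------+--------+
"""

rows = parse_table(SAMPLE)
ids = [row["ID"] for row in rows if row["Name"] == "db01"]
print(ids)
```

Looking up cells by column name avoids the fixed `$2`, `$4`, `$5` positions, which break silently when the client adds or reorders columns.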

15 Alternative 1: Ceilometer Plugin in Nagios
Iterate through the metric entries:

    for (( C=1; C<=ITERATOR; C++ ))
    do
        METER_NAME=$(ceilometer meter-list -q "resource_id=$RESOURCE" | grep -w "$SERVICE" | awk -F '|' -v var="$C" '{if (NR == var) {print $2 $1}}')
        METER_UNIT=$(ceilometer meter-list -q "resource_id=$RESOURCE" | grep -w "$SERVICE" | awk -F '|' -v var="$C" '{if (NR == var) {print $4 $1}}')
        RESOURCE_ID=$(ceilometer meter-list -q "resource_id=$RESOURCE" | grep -w "$SERVICE" | awk -F '|' -v var="$C" '{if (NR == var) {print $5 $1}}')
        ACTUAL_VALUE=$(ceilometer sample-list -m $METER_NAME -q "resource_id=$RESOURCE" -l 1 | grep $RESOURCE_ID | head -4 | tail -1 | awk -F '|' '{print $5}')

16 Alternative 1: Ceilometer Plugin in Nagios
Update return code if value of one metric is above a threshold:

        if [ $(echo "$ACTUAL_VALUE > $THRESHOLD" | bc) -eq 1 ]
        then
            if (( "$RETURNCODE" < "1" ))
            then
                RETURNCODE=1
            fi
            if [ $(echo "$ACTUAL_VALUE > $CRITICAL_THRESHOLD" | bc) -eq 1 ]
            then
                if (( "$RETURNCODE" < "2" ))
                then
                    RETURNCODE=2
                fi
            fi
        fi

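17 Alternative 1: Ceilometer Plugin in Nagios
Print the status of each metric and, after the loop, hand the return code to Nagios:

        STATUS=$(echo "$METER_NAME on $RESOURCE_ID is: $ACTUAL_VALUE $METER_UNIT")
        echo $STATUS
    done
    # Nagios reads the plugin's exit status (0 = OK, 1 = warning, 2 = critical)
    exit $RETURNCODE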
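The loop above keeps the worst status seen across all metric entries. A compact Python sketch of that aggregation follows; the function name `evaluate` and the sample tuples are hypothetical, and the values are hard-coded instead of being fetched from Ceilometer.

```python
OK, WARNING, CRITICAL = 0, 1, 2


def evaluate(samples, warn, crit):
    """Return (status lines, worst return code) for (name, value, unit) samples."""
    code = OK
    lines = []
    for name, value, unit in samples:
        if value > crit:
            code = max(code, CRITICAL)
        elif value > warn:
            code = max(code, WARNING)
        lines.append("%s is: %s %s" % (name, value, unit))
    return lines, code


# Hypothetical samples; a real check would read them from the Ceilometer API.
samples = [("cpu_util", 12.5, "%"), ("cpu_util", 63.0, "%")]
lines, code = evaluate(samples, warn=50.0, crit=80.0)
print("\n".join(lines))
print("return code:", code)
```

Using `max` means a later healthy sample can never lower a warning or critical state that an earlier sample raised, which matches the `RETURNCODE` comparisons in the shell version.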

18 Alternative 1: Ceilometer Plugin in Nagios Plugin can be downloaded from GitHub: https://github.com/kobe6661/nagios_ceilometer_plugin.git Additionally: NRPE plugin for remote execution of Nagios calls to Ceilometer. Install NRPE on the Nagios Core server and on the server that hosts the Ceilometer API. Change nrpe.cfg to include the call to the VM metric.
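An entry in nrpe.cfg has the form `command[<name>]=<command line>`. The sketch below builds such a line for the plugin options shown earlier; the command name `check_vm_cpu` and the plugin path are assumptions made for illustration, not values taken from the repository.

```python
import shlex


def nrpe_command(name, plugin_path, resource, service, warn, crit):
    """Build one nrpe.cfg command definition for the Ceilometer plugin."""
    # Quote every argument so that unusual VM names cannot break the line.
    args = ["-r", resource, "-s", service, "-t", str(warn), "-T", str(crit)]
    quoted = " ".join(shlex.quote(arg) for arg in args)
    return "command[%s]=%s %s" % (name, plugin_path, quoted)


# Hypothetical command name and plugin location.
entry = nrpe_command("check_vm_cpu",
                     "/usr/local/nagios/libexec/check_ceilometer",
                     "myvm", "cpu_util", 50.0, 80.0)
print(entry)
```

On the Nagios server, a service check would then call the remote command by its name through `check_nrpe`.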

19 Alternative 1: Implementation OpenStack installed on 3 nodes: Management node: responsible for monitoring other OpenStack nodes Controller node: responsible for management and configuration of cloud resources (VMs, network) Compute node: provisions virtual resources

20 Alternative 2: Nagios OpenStack Plugins Nagios as a tool to monitor OpenStack services and VMs: Plugins to monitor health of OpenStack services As soon as new VMs are created, Nagios should monitor them Requires elastic reconfiguration of Nagios Benefits: No data duplication, Nagios is the only monitoring tool required to monitor OpenStack Drawbacks: Elastic reconfiguration Rather complex Nagios configuration

21 Alternative 2: Nagios OpenStack Plugins Problem: Dynamic provisioning of resources (Virtual Machines) Dynamic configuration of hosts in Nagios Server required

22 Alternative 2: Nagios OpenStack Plugins Problem: What happens if VM is terminated by end user? Nagios assumes a host failure and produces a critical warning

23 Alternative 2: Nagios OpenStack Plugins Solution: Nova-API triggers reconfiguration of Nagios if VMs are created or terminated
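Whatever mechanism delivers the trigger, the reconfiguration step has to work out which VMs appeared and which disappeared. A minimal Python sketch of that comparison, with a hypothetical function name and made-up VM names:

```python
def plan_reconfiguration(monitored, current):
    """Compare the hosts Nagios knows with the VMs that currently exist.

    Returns (hosts to add, hosts to remove), each sorted for stable output.
    """
    monitored, current = set(monitored), set(current)
    return sorted(current - monitored), sorted(monitored - current)


# Hypothetical state: db01 was terminated, cache01 was newly created.
to_add, to_remove = plan_reconfiguration(["web01", "db01"], ["web01", "cache01"])
print("add:", to_add)
print("remove:", to_remove)
```

Removing terminated VMs from the configuration is what prevents Nagios from reporting a deliberate termination as a host failure.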

24 Alternative 2: Nagios OpenStack Plugins Another problem: VMs must have Nagios plugins installed when they are created Solution: Use only VM Images that contain Nagios plugins for VM creation OR Use package management tools like Puppet, Chef…

25 Alternative 2: Nagios OpenStack Plugins
Trigger for dynamic Nagios configuration: find available resources via nova-api (requires name of host and IP address):

    #!/bin/bash
    NUMLINES=$(nova list | wc -l)
    NUMLINES=$((NUMLINES-3))
    # loop counter and awk variable must use the same names (NUMLINES, C)
    for (( C=1; C<=NUMLINES; C++ ))
    do
        VM_NAME=$(nova list | tail -$NUMLINES | awk -F'|' -v var="$C" '{if (NR==var){print $3 $1}}')
        IP_ADDRESS=$(nova list | tail -$NUMLINES | awk -F'|' -v var="$C" '{if (NR==var){print $7 $1}}' | sed 's/[a-zA-Z0-9]*[=|-]//g')

26 Alternative 2: Nagios OpenStack Plugins
Trigger for dynamic Nagios configuration: create a config file that includes the VM name and IP address from a template (e.g. vm_template.cfg):

        CONFIG_FILE=$(echo $VM_NAME).cfg
        sed "s/ /$VM_NAME/g" vm_template.cfg>named_template.cfg
        sed "s/ /$IP_ADDRESS/g" named_template.cfg>$CONFIG_FILE

Set Nagios as owner of the file and move the file to the Nagios configuration directory:

        chown nagios:nagios $CONFIG_FILE
        chmod 644 $CONFIG_FILE
        mv $CONFIG_FILE /usr/local/nagios/etc/objects/$CONFIG_FILE
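The template substitution can also be sketched in Python with `string.Template`. The placeholder names `$vm_name` and `$ip_address`, the host definition itself and the `linux-server` host template (taken from the Nagios Core sample configuration) are assumptions for illustration; the contents of vm_template.cfg are not shown in the slides.

```python
from string import Template

# Assumed template text; the real vm_template.cfg may define more directives.
HOST_TEMPLATE = Template("""\
define host {
    use        linux-server
    host_name  $vm_name
    alias      $vm_name
    address    $ip_address
}
""")


def render_host(vm_name, ip_address):
    """Fill in the host definition for one VM."""
    return HOST_TEMPLATE.substitute(vm_name=vm_name, ip_address=ip_address)


cfg = render_host("web01", "10.0.0.5")
print(cfg)
```

`substitute` raises `KeyError` when a placeholder is left unfilled, so an incomplete host definition cannot end up in the Nagios configuration directory unnoticed.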

27 Alternative 2: Nagios OpenStack Plugins
Trigger for dynamic Nagios configuration: add the config file to nagios.cfg:

        echo "cfg_file=/usr/local/nagios/etc/objects/$CONFIG_FILE" >> /usr/local/nagios/etc/nagios.cfg
    done   # end of the per-VM loop

Restart Nagios:

    service nagios restart
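Appending with `>>` adds a duplicate `cfg_file` line every time the script runs for an already known VM. The following Python sketch shows one way to keep the entries unique and to drop the entry of a terminated VM; the helper names are hypothetical and the functions work on the file content as a string rather than on the file itself.

```python
def add_cfg_file(nagios_cfg_text, path):
    """Return the nagios.cfg text with a cfg_file entry for path, added once."""
    entry = "cfg_file=%s" % path
    lines = nagios_cfg_text.splitlines()
    if entry not in (line.strip() for line in lines):
        lines.append(entry)
    return "\n".join(lines) + "\n"


def remove_cfg_file(nagios_cfg_text, path):
    """Return the nagios.cfg text without the cfg_file entry for path."""
    entry = "cfg_file=%s" % path
    lines = [line for line in nagios_cfg_text.splitlines() if line.strip() != entry]
    return "\n".join(lines) + "\n"


base = "log_file=/usr/local/nagios/var/nagios.log\n"
path = "/usr/local/nagios/etc/objects/web01.cfg"
once = add_cfg_file(base, path)
twice = add_cfg_file(once, path)  # second call leaves the text unchanged
print(twice)
```

Nagios also supports a `cfg_dir` directive that loads every .cfg file in a directory, which would avoid editing nagios.cfg per VM altogether.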

28 Alternative 2: Nagios OpenStack Plugins Why restart Nagios? Nagios must know that a new VM is present or that an old VM has been terminated Reconfigure and restart Nagios (!)

29 Alternative 2: Nagios OpenStack Plugins Trigger for dynamic Nagios configuration: Add trigger to Nova-API: Nagios Event Broker module: Check_MK: Reconfigure Nagios dynamically: Edit nagios.cfg and restart Nagios – bad idea (!!) in a cloud environment Autoconfiguration tools: NagioSQL:

30 Alternative 2: Nagios OpenStack Plugins What other ways exist to dynamically reconfigure Nagios? Puppet master that triggers: VMs to install Nagios NRPE plugins and Nagios Server to update its configuration Same can be done with Chef, Ansible… Drawback: Puppet scalability if thousands of servers have to be (de)commissioned dynamically

31 Alternative 2: Nagios OpenStack Plugins
What other ways exist to dynamically reconfigure Nagios? Python Fabric with Cuisine to trigger: VMs to install Nagios NRPE plugins, and the Nagios server to update its configuration.

Get list of VMs:

    from novaclient.client import Client
    nova = Client(VERSION, USERNAME, PASSWORD, PROJECT_ID, AUTH_URL)
    servers = nova.servers.list()

Write VM list to file:

    # write() needs strings, not a list of Server objects: one VM name per line
    # (assumes the VM names are resolvable host names)
    with open('servers', 'w') as f:
        for server in servers:
            f.write(server.name + '\n')

32 Alternative 2: Nagios OpenStack Plugins
What other ways exist to dynamically reconfigure Nagios? Python Fabric with Cuisine to trigger: VMs to install Nagios NRPE plugins, and the Nagios server to update its configuration.

Create fabfile.py and define which servers should be configured:

    from fabric.api import *
    from . import vm_recipe, nagios_recipe

    env.use_ssh_config = True
    # read one host per line and strip the trailing newline
    with open('servers') as servers:
        serverlist = [line.strip() for line in servers]
    # role definitions map a role name to a list of hosts
    env.roledefs = {'vm': serverlist, 'nagios_server': ['xx.xx.xx.xx']}

33 Alternative 2: Nagios OpenStack Plugins
Assign roles to the tasks:

    @roles('vm')
    def configure_vm():
        vm_recipe.ensure()

    @roles('nagios_server')
    def configure_nagios():
        nagios_recipe.ensure()

34 Alternative 2: Nagios OpenStack Plugins
Create vm_recipe.py and nagios_recipe.py:

    from fabric.api import *
    import cuisine

    # is_installed() and install() are not shown on the slide
    def ensure():
        if not is_installed():
            puts("Installing NRPE...")
            install()
        else:
            puts("NRPE already installed")

    def install_prerequisites():
        cuisine.package_ensure("nrpe")

35 Which option should we choose? Implementation advantages and drawbacks (Choice of Alternatives)

A1: Ceilometer collects data
    Advantages: Very easy solution; scales well
    Drawbacks: Data duplication; two monitoring systems working in parallel

A2: Shell script
    Advantages: No data duplication; easy solution
    Drawbacks: Difficult to maintain; possibly insecure; Nagios is forced to restart

A2: Puppet
    Advantages: Automatic VM and Nagios configuration; allows for elastic reconfiguration of Nagios
    Drawbacks: Heavyweight; bad scalability for large IaaS clusters

A2: Python fabric & cuisine
    Advantages: Lightweight; automatic VM and Nagios configuration; allows for elastic reconfiguration of Nagios
    Drawbacks: Bigger configuration effort for package management with strong dependencies between packages

36 Conclusion What did you talk about? How to use Nagios to monitor an OpenStack cloud environment Cloud monitoring requirements: Elasticity, dynamic provisioning of virtual machines OpenStack monitoring tools Nagios and Ceilometer Nagios as extensible monitoring system Ceilometer captures data through Nova-API Nagios/OpenStack integration Alternative 1: Ceilometer monitors VMs with Nagios as graphical frontend Alternative 2: Nagios monitors VMs and is automatically reconfigured Discovered need for dynamic reloading of Nagios configuration Discussed advantages/drawbacks of different implementations

37 Questions? Any questions? Thanks!

38 The End Konstantin Benz

