Metrics data published Via different methods Monitoring Server

Each pattern involves an agent running on the monitored machine and a monitoring server.

Pattern-1 (passive?): The agent publishes metrics data to the monitoring server via different methods.

Pattern-2 (active?): The monitoring server requests commands to be run on the monitored machine, which return the status of the machine.

Pattern-3 (hybrid): A bunch of scripts in any language. The agent executes those scripts and sends the results to a central server or to other scripts that take action. The monitoring server is optional; it supplies config, commands, or probes, and receives metrics data.

Protocols used between agent and server:
NRPE (Nagios Remote Plugin Executor): a Nagios-centric protocol to collect remote metrics (active checks).
NSCA (Nagios Service Check Acceptor): another Nagios-centric protocol, for submitting results (passive checks).
NRDP: a Nagios replacement for NSCA.
check_mk: a protocol utilized by the check_mk monitoring system.
Syslog: a protocol primarily designed for submitting log records to central servers.
Graphite: a graphing solution that allows real-time graphing.
SMTP: used for sending email (currently more of a toy).
CollectD: a protocol for collecting information.
REST: a web-based, easily firewalled protocol.

How status is checked:
Internal mechanisms for checking the status of hosts and services: Collectd.
External programs (called plugins) that do all the dirty work: Icinga, Nagios.

Plugins are compiled executables or scripts (Perl scripts, shell scripts, etc.) that can be run from a command line to check the status of a host or service. Icinga uses the results from plugins to determine the current status of hosts and services. Plugins act as an abstraction layer between the monitoring logic present in the Icinga daemon and the actual services and hosts that are being monitored.
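To make the plugin idea concrete, here is a minimal sketch of a check script in the style used by Nagios and Icinga, where the exit code carries the status (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and the text after "|" is performance data. The function names and the default thresholds are assumptions made for illustration; they do not come from the slides.

```python
import os

# Conventional Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate_load(load1, warn, crit):
    """Map a 1-minute load average to (exit_code, status_line)."""
    if load1 >= crit:
        code = CRITICAL
    elif load1 >= warn:
        code = WARNING
    else:
        code = OK
    # Text after "|" is performance data in the form label=value;warn;crit
    line = (
        f"LOAD {STATUS_NAMES[code]} - load1={load1:.2f} "
        f"| load1={load1:.2f};{warn};{crit}"
    )
    return code, line


def main(warn=4.0, crit=8.0):
    """Run the check once; thresholds here are illustrative defaults."""
    try:
        load1 = os.getloadavg()[0]  # available on Unix-like systems only
    except (AttributeError, OSError):
        print("LOAD UNKNOWN - load average not available")
        return UNKNOWN
    code, line = evaluate_load(load1, warn, crit)
    print(line)
    return code

# A real plugin script would end with: sys.exit(main())
```

The monitoring daemon only needs the exit code and the printed line, which is why the check itself can be written in any language.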

Pattern-1 vs Pattern-2

Status checking
Pattern-1: Internal mechanism for checking the status of the hosts/servers.
Pattern-2: Not mandatory to have such a built-in mechanism; can rely on an external mechanism.

Trigger
Pattern-1: No trigger from the server to begin monitoring.
Pattern-2: Trigger required from the monitoring server.

Processing
Pattern-1: Apart from simple thresholds and filters, complex operations are carried out at the monitoring server.
Pattern-2: Complex modifications can happen at the agent before sending the data to the server.

Mode
Pattern-1: Passive?
Pattern-2: Active monitoring of systems.
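The difference in who triggers a collection can be sketched in a few lines. The classes below are hypothetical and run in-process; `read_metrics` stands in for a real collector, and the only external convention used is Graphite's plaintext line format (`<path> <value> <timestamp>`).

```python
import time


def read_metrics():
    """Hypothetical collector; a real agent would read /proc, sensors, etc."""
    return {"cpu.user": 12.5, "mem.used_mb": 2048.0}


def to_graphite_lines(host, metrics, timestamp):
    """Render metrics as Graphite plaintext lines: '<path> <value> <timestamp>'."""
    return [
        f"{host}.{name} {value} {int(timestamp)}"
        for name, value in sorted(metrics.items())
    ]


class PushAgent:
    """Pattern-1: the agent collects on its own schedule and publishes."""

    def __init__(self, host, publish):
        self.host = host
        self.publish = publish  # e.g. a function that writes to a socket

    def tick(self, now=None):
        now = time.time() if now is None else now
        for line in to_graphite_lines(self.host, read_metrics(), now):
            self.publish(line)


class PolledAgent:
    """Pattern-2: the agent runs a check only when the server requests it."""

    def __init__(self, commands):
        self.commands = commands  # command name -> callable

    def handle_request(self, command):
        if command not in self.commands:
            return {"status": "UNKNOWN", "detail": f"no such command: {command}"}
        return {"status": "OK", "detail": self.commands[command]()}
```

In this sketch a Pattern-1 deployment would call `tick()` from a timer inside the agent, while a Pattern-2 deployment would call `handle_request()` whenever the server's request arrives.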

Agent and pattern

Agent           Pattern
Collectd        1
Monasca
Snap            3*
Node-exporter
Telegraf
Icinga          2
Sensu           3
Diamond
Riemann
Beats
Ceilometer      1|2
Munin
Nagios
Centreon
NSClient++
OpenNMS

Which is the most common Pattern?
No clear winner!

Which agent supports the maximum number of metrics?
Nagios. How? Number of plugins –

Which agent supports dynamic configuration?
All Pattern-2 and Pattern-3 agents. Custom solutions for Pattern-1.

Have any of the agents been used in large-scale real-world deployments?
If so, please provide the details on the performance.
Collectd: Many blogs by those who have deployed it. Performance data is minimal. Part of a complete solution.
Riemann, Telegraf: Same problem as Collectd.
Nagios: Many deployments. There are performance issues.

Interoperability: Which agent is 'most interoperable'?
(Works with the maximum number of 'servers', i.e. collection nodes.)
Collectd, Monasca, Prometheus, Nagios, Sensu, Riemann, Telegraf

Which agent provides maximum 'freedom' w.r.t. licenses (core agent + plugins)?
Sensu, Telegraf, Riemann, Diamond. The MIT license provides the maximum freedom.

Best for time series databases: direct compatibility and publishes to the maximum number of databases.
Collectd. Next (Snap).
InfluxDB, Elasticsearch, Cassandra, OpenTSDB, Prometheus, Graphite, Riak. Druid: missing!
Graphite (store and graph metrics): collection, forwarding, visualization, monitoring, storage backends, etc.
Collectd, Riemann, Sensu, Nagios, Icinga, Diamond, Graphite

Which agent is part of a solution that has analytics, alerts, and graphing?
Almost all!

Which agent has the fewest dependencies (libraries, OS/kernel versions, etc.)?
Mostly similar, but Collectd stood out. Its adaptability to the system configuration is also better. Python-based tools.

Which solution has "plugins" for "processing" the metrics, and of what type?
Processing types compared: threshold, statistics, anomaly detection, tagging.

Solution            Entries
Collectd            Y, N
Snap                N?
Telegraf            Mostly realized by Kapacitor.
Sensu
Monasca
Node Exporter
Ceilometer
Icinga
Nagios & variants   N*
Diamond
Riemann
Beats               Handled by the other Elastic solutions.
Munin
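The four processing types named in this comparison can be illustrated with a small sketch. These helpers are hypothetical and not taken from any of the listed agents; in particular, the z-score test is just one simple stand-in for anomaly detection.

```python
from statistics import mean, pstdev


def check_threshold(value, warn, crit):
    """Threshold: classify a single reading against two limits."""
    if value >= crit:
        return "critical"
    if value >= warn:
        return "warning"
    return "ok"


def summarize(samples):
    """Statistics: reduce a window of samples to a few aggregates."""
    return {"min": min(samples), "max": max(samples), "mean": mean(samples)}


def is_anomaly(history, value, z_limit=3.0):
    """Anomaly detection (assumed method): flag values far from the
    historical mean, measured in population standard deviations."""
    sigma = pstdev(history)
    if sigma == 0:
        return value != history[0]
    return abs(value - mean(history)) / sigma > z_limit


def tag(metric, **tags):
    """Tagging: attach key/value labels without mutating the input."""
    return {**metric, "tags": {**metric.get("tags", {}), **tags}}
```

Whether such steps run inside the agent or at the server is exactly the distinction drawn between the patterns earlier in the deck.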

Gaps: Are there any metrics/events that are not supported by any of the agents and that are relevant to NFV?
Metrics: NO.*** When the collection point is the guest, there are challenges with those metrics that rely on the virtualization platform to expose access to the hardware (e.g. PMU).
Single solution: NO.
Events: NO.***
Single solution: YES (no single solution addresses all the events).

Sources and references considered: Barometer project, OpenStack monitoring projects, VNFM monitoring solutions, vendor blogs.

Which agents support metrics over a REST API?
Collectd, Ceilometer, Snap, Monasca*, Nagios*, Icinga*, Riemann*

Does any agent support Redfish APIs?
None. Almost all vendors (Dell, HP, SuperMicro, etc.) already provide support (via built-in controllers, agent-free) for both custom and generic clients, so there is no need for an additional entity in between. Maybe we can have an entity that gets the metrics via the Redfish REST APIs and shares them over the 'common channel' (message queue, TCP socket, etc.).
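A rough sketch of such an entity follows. It assumes the payload is a Redfish Thermal resource (typically served at /redfish/v1/Chassis/<id>/Thermal) with a "Temperatures" array whose entries carry "Name" and "ReadingCelsius"; the HTTP fetch and authentication are left out, the function names are hypothetical, and an in-process `queue.Queue` stands in for the 'common channel'.

```python
import json
import queue


def thermal_to_metrics(host, thermal):
    """Turn an already-fetched Redfish Thermal payload into flat metric records."""
    metrics = []
    for sensor in thermal.get("Temperatures", []):
        reading = sensor.get("ReadingCelsius")
        if reading is None:  # absent or disabled sensors may report null
            continue
        name = sensor.get("Name", "unknown").strip().lower().replace(" ", "_")
        metrics.append(
            {
                "host": host,
                "metric": f"temperature.{name}",
                "value": reading,
                "unit": "celsius",
            }
        )
    return metrics


def publish(channel, metrics):
    """Share metrics on the common channel as one JSON document per message.

    `channel` is any object with a put() method; a message-queue client
    or socket wrapper could be substituted (assumption)."""
    for metric in metrics:
        channel.put(json.dumps(metric, sort_keys=True))
```

Because the controller already exposes the data, the entity is only a translator: it polls the Redfish endpoint, reshapes the payload, and forwards it.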
