System Monitoring with Lemon


1 System Monitoring with Lemon
Johannes Gutleber August 26, 2004

2 What is Lemon?
A toolkit for monitoring the state of computer systems
Originally developed within the European Data Grid project
Now maintained and developed by IT within the ELFms (Extremely Large Fabric Management System) project

3 Architecture

4 Main Components
MSA (Monitoring Sensor Agent)
  Gets data from sensors and forwards them to the monitoring repository
MS (Monitoring Sensor)
  Samples data and forwards them to the MSA
MR (Monitoring Repository)
  Stores data that come from the MSA, with backends for Oracle and flat files
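The flow between the three components can be pictured with a minimal Python sketch. It is not Lemon code: the class names, the `register` and `collect` methods, and the fixed-value sensors are hypothetical, the repository keeps samples in memory rather than in Oracle or flat files, and the agent polls its sensors here for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Sample:
    host: str
    metric: str
    timestamp: int
    value: float


class Repository:
    """Stand-in for the MR: keeps samples in memory (assumption, no real backend)."""

    def __init__(self) -> None:
        self.samples: List[Sample] = []

    def store(self, sample: Sample) -> None:
        self.samples.append(sample)


class Agent:
    """Stand-in for the MSA: reads its sensors and forwards every sample to the MR."""

    def __init__(self, host: str, repository: Repository) -> None:
        self.host = host
        self.repository = repository
        self.sensors: Dict[str, Callable[[], float]] = {}

    def register(self, metric: str, sensor: Callable[[], float]) -> None:
        # A "sensor" is just a callable returning one value (hypothetical interface).
        self.sensors[metric] = sensor

    def collect(self, timestamp: int) -> None:
        for metric, sensor in self.sensors.items():
            self.repository.store(Sample(self.host, metric, timestamp, sensor()))


repo = Repository()
agent = Agent("rubu01", repo)
agent.register("load_average", lambda: 0.42)  # fixed values stand in for real sensors
agent.register("users", lambda: 3.0)
agent.collect(timestamp=1093500000)
```

After `collect`, the repository holds one sample per registered sensor, each tagged with host, metric name and timestamp.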

5 Further Components
Correlation Engine
  Framework for creating metric correlations
Alarm Gateway
  Handles communication between the alarm GUI and the MR
  Interface to LASER (LHC Alarm Service)

6 Technologies
Simple, ASCII-based protocols
  Between sensors and agent
  Between agent and repository server
SOAP for getting data from a remote repository
Data Storage
  Flat files in directories per host
    ~/lemon-spool/rubu01.cmsdaqpreseries/<time><metric>
    Limited functionality (e.g. no search)
  Oracle
    Automatically created tables
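To illustrate why a per-host flat-file layout offers limited functionality, here is a small Python sketch. The file name (`<metric>.txt`), the line format (`timestamp value`) and both function names are assumptions made for this example and do not reproduce Lemon's actual spool format. The point is only that a cross-host query has to open and scan every host directory.

```python
import tempfile
from pathlib import Path


def store_sample(spool: Path, host: str, metric: str, timestamp: int, value: float) -> Path:
    """Append one sample to the file for this host and metric."""
    host_dir = spool / host
    host_dir.mkdir(parents=True, exist_ok=True)
    path = host_dir / f"{metric}.txt"  # hypothetical file name, not Lemon's scheme
    with path.open("a") as f:
        f.write(f"{timestamp} {value}\n")
    return path


def find_above(spool: Path, metric: str, threshold: float):
    """A 'search' across hosts: every host directory must be opened and scanned."""
    hits = []
    for host_dir in sorted(p for p in spool.iterdir() if p.is_dir()):
        path = host_dir / f"{metric}.txt"
        if not path.exists():
            continue
        for line in path.read_text().splitlines():
            ts, value = line.split()
            if float(value) > threshold:
                hits.append((host_dir.name, int(ts), float(value)))
    return hits


# Use a temporary directory so the example does not touch a real spool.
spool = Path(tempfile.mkdtemp()) / "lemon-spool"
store_sample(spool, "rubu01.cmsdaqpreseries", "load_average", 1093500000, 0.4)
store_sample(spool, "rubu02.cmsdaqpreseries", "load_average", 1093500000, 3.1)
hits = find_above(spool, "load_average", 1.0)
```

With a database backend such as Oracle, the same question would be a single indexed query instead of a directory walk.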

7 Tools 1
RRD (Round Robin Database)
  Data are organized in (time) series
  Types: Gauge, Counter, Derive, Absolute
  Framework for storing averages, min, max, derivatives, ...
  Provides graphing capabilities
  Provides simple mathematical operations
  Data size does not grow with time (stores derived data)
  Export to XML, flat file formats
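The round-robin idea can be sketched in a few lines of Python. This is a simplified illustration, not RRDtool's implementation or API: the class `RoundRobinArchive` and the function `counter_rate` are hypothetical names, only averaging is shown, and counter wrap-around is ignored. It shows why the data size stays constant (a fixed number of slots, oldest overwritten first) and how a Counter-type source becomes a rate.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size archive: each row is the average of `steps` consecutive samples."""

    def __init__(self, slots: int, steps: int) -> None:
        self.rows = deque(maxlen=slots)  # the oldest row is dropped once full
        self.steps = steps
        self.pending = []

    def update(self, value: float) -> None:
        self.pending.append(value)
        if len(self.pending) == self.steps:
            # Consolidate the pending samples into a single stored row.
            self.rows.append(sum(self.pending) / self.steps)
            self.pending = []


def counter_rate(prev: int, curr: int, seconds: int) -> float:
    """Turn two readings of an ever-increasing counter into a per-second rate."""
    return (curr - prev) / seconds


archive = RoundRobinArchive(slots=3, steps=2)
for v in [1, 3, 5, 7, 9, 11, 13, 15]:
    archive.update(v)

rate = counter_rate(prev=1000, curr=1600, seconds=60)
```

Eight samples produce four averaged rows, but only the three most recent survive, so storage is bounded no matter how long the monitoring runs.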

8 Tools 2
Apache/PHP/OCI
  Used for displaying data from the repository on Web pages
Data visualized with RRDTool
  Avoids continuous access to raw data from Oracle
Sensor API
  Tcl, C, Perl

9 Core Team
German Cancio (CERN staff)
  ELFms project leader
Miro Siket (CERN staff)
  lemon-status framework, server redundancy
David Front (Weizmann Institute, Israel)
  OraMon server development
Maciej Stepniewski (technical student)
  Alarm gateway development

10 Associated Developments
Jan van Eldik (CERN staff)
  CERN-CC deployment and sensors
Hugo Cacote (CERN fellow)
  Tape and disk server sensors, IPMI
Piotr Kolet
  Solaris ports
Dennis Waldron (University of Bristol)
  MSA enhancements

11 Future Activities
Production deployment and support in the CERN computer centre
Replacement of SURE, the legacy alarm system in CERN-CC
Interface to LASER or SPECTRUM
Redundancy layer for OraMon
Requirement-driven enhancements of the lemon-status pages and new sensors
Deployment support for experiments

12 Lemon in CMS On-Line
Installed on cluster in Cessy
  Rubu
  Manual installation, agents are not restarted automatically at boot time
  Server and file repository at cmsdaqpreseries
Installed on XDAQ development cluster
  Lxcmd101, 102, 103, 104
  Automatic installation
  Communicates with the server at cmsdaqpreseries

13 1 Minute Metrics in Cessy
Temperature
  Now with lm_sensors (defunct in Linux 2.6)
  To be replaced with IPMI (Intelligent Platform Management Interface from Intel) in future
CPU utilization
Memory statistics (used, free, swap, paging)
Disk IO
Disk IO Summary
Network IO (eth0 - eth4, myri0)
MSA alive (uptime also every 1 hour)

14 5 Minute Metrics in Cessy
Interrupts
Context switches
Swap IO, paging IO
Existing processes, created processes
Number of sockets
MSA footprint (used CPU and memory)
Usage of root filesystem, /tmp, /var
Number of users
Load averages
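The split into one-minute and five-minute metrics amounts to a sampling schedule. The Python sketch below shows one way to express it. The metric identifiers, the `SCHEDULE` table and both helper functions are illustrative assumptions, not Lemon configuration; only the two periods come from the slides, and only a subset of the metrics is listed.

```python
# Sampling periods in seconds; metric names are illustrative, not Lemon identifiers.
SCHEDULE = {
    "cpu_utilization": 60,
    "memory": 60,
    "disk_io": 60,
    "network_io": 60,
    "interrupts": 300,
    "context_switches": 300,
    "load_averages": 300,
    "users": 300,
}


def due_metrics(elapsed_seconds: int) -> list:
    """Return the metrics whose sampling period divides the elapsed time."""
    return sorted(m for m, period in SCHEDULE.items() if elapsed_seconds % period == 0)


def samples_per_day(period_seconds: int) -> int:
    """Number of samples one metric produces per day at the given period."""
    return 24 * 60 * 60 // period_seconds


one_minute_tick = due_metrics(60)    # only the 60-second metrics are due
five_minute_tick = due_metrics(300)  # every metric is due
```

A one-minute metric yields 1440 samples per day and a five-minute metric 288, which is one reason cheaper or slower-changing quantities are placed in the five-minute group.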

15 Current Status
Sensors and agents
  Rubu01-64 and lxcmd{101,102,103,104}
Flat file repository on cmsdaqpreseries
  /home/_xdaq/lemon-spool
Data collected
  Total: 706 Mbytes since July 30
  Around 12 Gbytes/year of raw monitoring data for 64 machines
  400 kBytes/day/machine
  20-30 Mbytes per day stored on cmsdaqpreseries
Web pages created by Miroslav Siket (IT)
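The storage figures can be cross-checked with a short calculation. The Python sketch below assumes decimal units (1 Mbyte = 10^6 bytes) and assumes that the 706 Mbytes total was measured on the presentation date, August 26, 2004; neither assumption is stated on the slide.

```python
from datetime import date

KB, MB, GB = 1e3, 1e6, 1e9  # decimal units (assumption)

# Projection from the per-machine figure on the slide.
per_machine_per_day = 400 * KB
machines = 64
per_day = per_machine_per_day * machines
per_year = per_day * 365

# Observed rate, assuming the total was read on the presentation date.
days = (date(2004, 8, 26) - date(2004, 7, 30)).days
observed_per_day = 706 * MB / days

print(f"projected: {per_day / MB:.1f} MB/day, {per_year / GB:.1f} GB/year")
print(f"observed over {days} days: {observed_per_day / MB:.1f} MB/day")
```

Both the projected and the observed daily rates fall inside the quoted 20-30 Mbytes per day. The yearly projection from 400 kBytes/day/machine comes out somewhat below the quoted 12 Gbytes, which is consistent with that number being a rounded upper estimate.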

16 Web Access

