Computing Facilities CERN IT Department CH-1211 Geneva 23 Switzerland www.cern.ch/it CF 10.12.2010 - Post-C5 Lemon-web 2.0 Daniel Lenkes and Ivan Fedorko.

Presentation transcript:

Post-C5: Lemon-web 2.0
Daniel Lenkes and Ivan Fedorko

Overview
Lemon
Current lemon-web and our experience
Development 2010:
  – Federated Lemon
  – Power measurement
  – Lemonmrd
Lemon-web 2.0
Lemon plans

Lemon overview
[Architecture diagram. Column labels: Node Monitoring, Measurement Repository, User Interfaces. Components: Sensor, Monitoring Agent, Local Cache, Repository Backend, Oracle Database, Application Server, RRD tool / Python, Apache / PHP, Web Browser, Lemon CLI (command line tool to access data), Lemon-host-check (command line tool for node exceptions). Link labels: TCP/UDP, SQL, HTTP.]

Lemon in numbers
~11k monitored entities (~8k nodes)
~1.1k metrics, 473 exceptions, 254 classes
~60% of metrics covered by core sensors
~1.7M monitored metrics across Lemon
~300 GB of data produced per month

How many services do we monitor?
[Plot: number of unique entities.]
Metrics: count of all non-null metric entries over all metric tables (if we monitor two partitions on a host, two entries are counted).

How many services do we monitor?
[Plots: number of nodes against number of metrics monitored; number of nodes against number of sensors per agent.]

Lemon-db
Lemonops (latest-only data):
  Size  Used  Avail  Use%  Mounted on
  32G   29G   3.8G   89%   /ORA/dbs03/LEMONOP
Lemonrac (historical data):
  Size  Used  Avail  Use%  Mounted on
  1.6T  1.5T  76G    96%   /ORA/dbs03/LEMONRAC
Data income: ~300 GB/month
Not enough in one year
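
The "not enough in one year" remark can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted on the slide and assumes decimal units and a constant data income. It ignores any purging or compression, which the slide does not describe. The function names are illustrative.

```python
# Figures quoted on the slide; decimal units (1 TB = 1000 GB) are an assumption.
MONTHLY_INCOME_GB = 300
LEMONRAC_SIZE_GB = 1600  # the 1.6T volume holding historical data


def yearly_income_gb(monthly_gb, months=12):
    """Data volume accumulated over `months` at a constant monthly income."""
    return monthly_gb * months


def fits_in_volume(monthly_gb, volume_gb, months=12):
    """True if `months` of data at `monthly_gb` would fit in an empty volume."""
    return yearly_income_gb(monthly_gb, months) <= volume_gb


if __name__ == "__main__":
    print(yearly_income_gb(MONTHLY_INCOME_GB))                  # GB per year
    print(fits_in_volume(MONTHLY_INCOME_GB, LEMONRAC_SIZE_GB))  # fits or not
```

At ~300 GB/month, one year of data is about 3.6 TB, more than twice the total size of the 1.6T volume even if it were empty.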

Lemon-web suite
Lemon-web
  – ~50-70k hits/day
  – LAS, lemon-web, entry point to cdb-tpl-viewer
  – ~140 unique IPs accessing lemon-web per day
Lemon-gateway
  – Called by lemon-cli
  – ~100k hits/day
  – Used by ~100 sites
All together ~150k hits/day

Lemon overview
[Architecture diagram, repeated from the earlier Lemon overview slide: node monitoring, measurement repository and user interfaces, linked by TCP/UDP, SQL and HTTP.]

Web suite structure
It has two parts:
  – Lemonmrd
  – Lemon-web
Retrieve, store, display information
[Diagram: lemonmrd and Lemon-web with Configuration, RRD files, DB and CDB.]

What is lemonmrd?
Lemon Monitoring Repository Daemon
Cache data
  – Collects data from DB, stores it in RRD files
Aggregate data
Clustering on RRD level (only sum and avg)
Repartition the data
  – Data by metrics → data by nodes
[Diagram: lemonmrd with Configuration, RRD files and DB.]
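
The "data by metrics → data by nodes" step can be pictured as regrouping rows. The following is a minimal sketch, not lemonmrd's actual code. The input layout (one list of `(node, timestamp, value)` rows per metric), the function name and the host names are assumptions made for illustration.

```python
from collections import defaultdict


def repartition_by_node(rows_by_metric):
    """Regroup {metric: [(node, ts, value), ...]} into
    {node: {metric: [(ts, value), ...]}}."""
    by_node = defaultdict(lambda: defaultdict(list))
    for metric, rows in rows_by_metric.items():
        for node, ts, value in rows:
            by_node[node][metric].append((ts, value))
    # Convert to plain dicts so the result compares and prints cleanly.
    return {node: dict(metrics) for node, metrics in by_node.items()}


if __name__ == "__main__":
    # Hypothetical sample: two metrics, two nodes.
    sample = {
        "loadavg": [("hostA", 0, 1.0), ("hostB", 0, 2.0)],
        "mem_used": [("hostA", 0, 512)],
    }
    print(repartition_by_node(sample))
```

With the data grouped per node, each entity's values can be written to its own RRD file in one pass.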

What is RRD? (Round Robin Database)
Fixed-size cache (~14 MB per entity)
Round-robin archive of data (RRA)
Each RRA averages the previous ones → loss of precision for historical data
[Diagram labels: RRA 1; 1 sec, 5 min, 1 hour, 1 day.]
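
The loss of precision can be shown with a toy round-robin archive. This is a simplified sketch, not RRDtool's implementation: each archive keeps a fixed number of rows and stores the average of a fixed number of input samples per row. The class and parameter names are assumptions.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size archive; each stored row is the average of
    `steps_per_row` consecutive input samples."""

    def __init__(self, rows, steps_per_row):
        self.rows = deque(maxlen=rows)   # oldest rows are overwritten
        self.steps_per_row = steps_per_row
        self._pending = []

    def update(self, value):
        self._pending.append(value)
        if len(self._pending) == self.steps_per_row:
            self.rows.append(sum(self._pending) / len(self._pending))
            self._pending = []


if __name__ == "__main__":
    fine = RoundRobinArchive(rows=3, steps_per_row=1)
    coarse = RoundRobinArchive(rows=3, steps_per_row=5)
    for sample in [0, 0, 100, 0, 0]:     # a single short peak
        fine.update(sample)
        coarse.update(sample)
    print(list(fine.rows))    # the peak is kept at full height
    print(list(coarse.rows))  # the peak is averaged into one flat row
```

In the coarse archive the peak of 100 becomes a single row of 20. This is the same effect as the "peaks lost or became hills" symptom listed among the lemonmrd shortcomings.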

What is Lemon-web?
PHP-based web application
Provides data about entities
  – Hosts, clusters, power/temperature sensors etc.
  – (All) metrics and exceptions, alarms
Metrics graphs
  – Based on RRD
  – Based on DB selects
[Diagram: Lemon-web with Configuration, RRD files and DB.]

Current version
Working for a couple of years without major development
Tightly bound to the CDB hierarchy
Stable, but hitting its limits
Maintenance limitations

Development by summer 2010
Federated Lemon (in cooperation with Morgan Stanley)
Power measurements with formulas
Excel export functionality
Many small enhancements / bug fixes
  – Metric distribution (over all CC, e.g. OS metric)
  – Parent/child links between entities
  – Metric graphs (fix for multiple primary key metrics)
  – RRD parameter tuning to fix gap problems

Federated Lemon
[Diagram: a Federated Lemon-web (search over all instances; federated cluster from all measurement repositories) shown with two Lemon-web instances, each with its own Measurement Repository (search over entities; grouping entities; rrd file/entity).]

Power measurement
Collect power data and provide trends and efficiencies
  – Beyond cluster hierarchy
  – Beyond simple sum and average

Power measurement
Implemented and in production
PHP + config level (extracts data from RRD and performs on-the-fly RRD operations)
Error-prone configuration

Current Lemonmrd - Shortcomings
Performance issues with more entities
  – Long update loops, causing gaps in the graphs
  – Long startup > 30 minutes
  – Peaks lost or became hills
Not capable of parallel processing
  – Bugs in the underlying (Python 2.3) libraries
Maintenance
  – Logging level change only with restart
  – Missing some debug info

Current Lemonmrd - Shortcomings
Required improvements
  – Configuration change without restart, e.g. log level
  – Advanced logging
  – Enhanced configuration (protection against mistakes)
  – New math operations (-, *, /) for dynamic cluster data aggregation; the current version has only sum and average
  – Data aggregation from multiple DB backends

Lemonmrd Results
Multithreaded application
Runtime configuration parameters:
  – No need to restart for any change
Dynamic reference resolution for the cluster / sub-cluster hierarchy
  – Recursively checks the content of the cluster and preprocesses the subparts
  – The startup is <1 min (~30 times faster)
  – Collecting loop 1-3 sec (>100 times faster)
+, avg, -, *, / operations in the cluster configuration
Based on Python 2.6 → portable to SLC6
Failsafe, simplified configuration
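
Changing the log level without a restart can be done in Python by re-applying the configuration to the live logger. The sketch below is an illustration only, not lemonmrd's code. The `log_level` key, the function name and the idea of calling it from a periodic reload or a signal handler are assumptions. It uses the standard `logging` module of current Python 3.

```python
import logging


def apply_runtime_config(logger, config):
    """Apply reloadable settings from `config` to a running logger.

    Returns True if the level was applied, False if the configured
    level name is unknown (the current level is then left untouched).
    """
    level_name = str(config.get("log_level", "INFO")).upper()
    try:
        logger.setLevel(level_name)   # accepts names such as "DEBUG"
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    log = logging.getLogger("lemonmrd-sketch")   # hypothetical logger name
    apply_runtime_config(log, {"log_level": "debug"})
    print(logging.getLevelName(log.level))
```

Rejecting unknown level names instead of failing is one way to get the "protection against mistakes" that the required improvements call for.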

Lemonmrd 2.0: cluster math
Weighted summary for lemon cluster: lxmred080340% lxmred06035% dbsrvd277 35% lxmred060510% lxfsec16145% lemon2build0115%
Graph reliability: how many of the expected entities actually report?
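
A weighted cluster summary with a reliability figure could look like the sketch below. It is a sketch under stated assumptions, not the lemonmrd 2.0 algorithm. Weights are taken as fractions per node, a node that did not report has the value `None`, and reliability is the share of expected nodes that reported. The function name, node names and weights are made up.

```python
def weighted_cluster_value(values, weights):
    """Return (weighted sum over reporting nodes, reliability).

    values:  {node: latest value, or None if the node did not report}
    weights: {node: weight}; its keys define the expected entities
    """
    reporting = {
        node: value
        for node, value in values.items()
        if value is not None and node in weights
    }
    total = sum(weights[node] * value for node, value in reporting.items())
    reliability = len(reporting) / len(weights) if weights else 0.0
    return total, reliability


if __name__ == "__main__":
    weights = {"nodeA": 0.5, "nodeB": 0.25, "nodeC": 0.25}   # hypothetical
    values = {"nodeA": 10.0, "nodeB": 20.0, "nodeC": None}   # nodeC is silent
    print(weighted_cluster_value(values, weights))
```

Reporting the reliability next to the aggregate lets a reader of the graph tell a real drop from a drop caused by missing nodes.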

Lemonmrd Results
Improved data precision
[Graphs comparing the current and the new version.]

Current Lemon-web - Shortcomings
Security concerns
  – e.g. Lemon Alarm System (LAS)
Difficult to add or modify functionality
Limited performance

Lemon-web New design
Security
  – CERN SSO (NICE) with e-group based authorization (critical for LAS)
Architecture
  – MVC design pattern
  – Modular design
  – Single entry point
  – Using memcached, APC
Maintainability
  – Advanced configuration
  – Advanced logging
[Diagram: Controller (modules), Model (database), View (templates, layout); labels "demand" and "data" between components; request via HTTP or CLI; response as HTML, RSS, XML or JSON.]
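
The MVC layout with a single entry point and a cache can be outlined in a few lines. Lemon-web itself is a PHP application. The Python sketch below only illustrates the pattern, and every name in it (the `dispatch` function, the `entity` module, the sample host and metric) is hypothetical. A plain dictionary stands in for memcached/APC.

```python
import html
import json

CACHE = {}  # stand-in for memcached/APC in this sketch


def model_entity(name):
    """Model: a real one would query the database; this returns fixed data."""
    return {"entity": name, "metrics": {"loadavg": 0.42}}


def view_json(data):
    """View: render the model data as JSON."""
    return json.dumps(data, sort_keys=True)


def view_html(data):
    """View: render the model data as a tiny HTML fragment."""
    rows = "".join(
        "<li>%s: %s</li>" % (html.escape(k), html.escape(str(v)))
        for k, v in sorted(data["metrics"].items())
    )
    return "<h1>%s</h1><ul>%s</ul>" % (html.escape(data["entity"]), rows)


MODELS = {"entity": model_entity}
VIEWS = {"json": view_json, "html": view_html}


def dispatch(module, arg, fmt="html"):
    """Controller and single entry point: pick model and view, cache output."""
    key = (module, arg, fmt)
    if key not in CACHE:
        CACHE[key] = VIEWS[fmt](MODELS[module](arg))
    return CACHE[key]


if __name__ == "__main__":
    print(dispatch("entity", "hostA", "json"))
    print(dispatch("entity", "hostA", "html"))
```

Because every request goes through `dispatch`, a new output format or module is one more entry in a table rather than a new script.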

Lemon-web New features
Flexible menu structure
Connecting to multiple DB sources
Personal views by service
Tiny URLs pointing at graphs, which can be embedded in any page
Auto-complete search field
Possibility to support other DB engines
Data export

Future Plans
Current activity
Lemon-web 2.0 development ongoing
Will be released by the end of January
Lemon enhancements under consideration
New Lemon DB schema
  – Increase of monitoring data impacts the size and performance of the DB repository
  – Impact on many Lemon components
Lemon repository data export
  – Reduce the amount of historical data stored in the DB; export to data files

Future Plans
Lemon-sensors review/development
  – Pending enhancements on practically all core sensors
  – New sensors (e.g. for SafeHost)
  – Python API
High-level objects
  – Trigger alarm if > 40% of cluster nodes are on high load
  – Data aggregation on data collection
Integration with Windows monitoring (one LAS)
Support for virtualization (new instances + federated web)
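
The "high-level object" alarm can be read as a rule over a whole cluster rather than over one node. This is a minimal sketch of such a rule. The 40% fraction comes from the slide. The load threshold that counts as "high", the function name and the node names are assumptions.

```python
def cluster_alarm(loads, high_load=10.0, fraction=0.40):
    """Return True if more than `fraction` of the nodes exceed `high_load`.

    loads: {node: current load value}
    """
    if not loads:
        return False
    high = sum(1 for value in loads.values() if value > high_load)
    return high / len(loads) > fraction


if __name__ == "__main__":
    # Hypothetical cluster of five nodes, three of them heavily loaded.
    sample = {"n1": 25.0, "n2": 18.0, "n3": 12.5, "n4": 0.7, "n5": 1.1}
    print(cluster_alarm(sample))
```

The comparison is strict, so a cluster with exactly 40% of its nodes on high load does not raise the alarm, which matches the "> 40%" wording.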

Nagios feedback
CHEP 2010, HEPiX, LHC experiments
Based on push model and probes
Usually at a scale of up to 2000 nodes
In combination with other tools like Ganglia
Limitations: ~3000 nodes, ~30k monitored services (service = (node, metric))
Attractive for application/service testing

Lemon and Nagios
Lemon tasks under consideration
Fresh monitoring tool review
  – Nagios, Ganglia, etc.
Interfacing Nagios in Lemon
  – Monitoring of Drupal, GRID

Summary
Current development
  – Lemonmrd 2.0: startup time <1 min (~30 times faster); collecting loop 1-3 sec (>100 times faster)
  – Lemon-web 2.0
Required development
  – New Lemon DB schema and repository data export
  – Lemon-sensors review/development
  – Integration with Windows monitoring
  – Support for virtualization
  – Interfacing Nagios in Lemon
Available manpower:
  – ~50% staff FTE
  – Fellow FTE for the next 6 months

Questions?
Thank you for your attention!