Monitoring and System Management in Distributed Environment (Hajautettujen Tietojärjestelmien Hallinta ja Valvonta)
  S-38.310 Tietoverkkotekniikan diplomityöseminaari, 16.3.2004
  Mikko Uljas
  Supervisor: Jorma Jormakka
Topics
  Research problem and methods
  Framework for distributed applications management
  Measurement data warehouse
  Scalability and resource usage calculations
  Management concepts and applications
  Conclusions
Research Problem
  How can monitoring of distributed systems be arranged in an enterprise-level architecture?
  What can management concepts gain from a properly implemented monitoring system?
Research Methods
  Literature research focusing on
    – system monitoring and management
    – data warehousing
    – management concepts
  Theoretical calculations
  Case study
Bauer's Framework for Distributed Applications Management
  Three-tier architecture
    – management applications
    – management services
    – managed nodes
  Management services play a central role
    – repository subsystem
    – configuration subsystem
    – combined control and monitoring subsystem
  Adopted in this thesis as the common architecture for distributed systems management
Bauer's Framework for Distributed Applications Management
Measurement Data Warehouse
  A data warehouse solution for storing monitoring data
  Four main components
    – Collector
    – Data integration component
    – Data warehouse
    – Data querying and reporting component
  Monitoring information data model
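The four components listed above can be pictured as a small pipeline. The following is only a minimal sketch under stated assumptions: the record fields, the metric name `cpu_load`, the node name `web01`, and all class and function names are hypothetical illustrations, not the data model or interfaces used in the thesis.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One monitoring sample (hypothetical data model, not the thesis's)."""
    node: str        # managed node the sample came from
    metric: str      # metric name, e.g. "cpu_load" (assumed)
    timestamp: int   # seconds since the epoch
    value: float


def collect(raw_rows):
    """Collector: turn raw (node, metric, timestamp, value) tuples into records."""
    return [Measurement(n, m, int(t), float(v)) for n, m, t, v in raw_rows]


def integrate(measurements):
    """Data integration: drop exact duplicates and order the records by time."""
    return sorted(set(measurements), key=lambda m: (m.timestamp, m.node, m.metric))


class Warehouse:
    """Data warehouse: an in-memory stand-in keyed by (node, metric)."""

    def __init__(self):
        self._series = defaultdict(list)

    def load(self, measurements):
        for m in measurements:
            self._series[(m.node, m.metric)].append(m)

    def average(self, node, metric):
        """Querying and reporting: mean of one series, or None if it is empty."""
        series = self._series.get((node, metric), [])
        if not series:
            return None
        return sum(m.value for m in series) / len(series)


raw = [
    ("web01", "cpu_load", 100, 0.5),
    ("web01", "cpu_load", 100, 0.5),   # duplicate delivery, removed by integrate()
    ("web01", "cpu_load", 160, 1.5),
]
warehouse = Warehouse()
warehouse.load(integrate(collect(raw)))
print(warehouse.average("web01", "cpu_load"))
```

A real deployment would replace the in-memory dictionary with a database, but the division of responsibilities between the four stages would stay the same.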
Scalability and Resource Usage Calculations
  The aim is to show that the framework is suitable for an enterprise-level architecture
  Three potential bottleneck points identified
    – amount and accumulation of monitoring information
    – disk space need
    – network traffic
  There may be more bottleneck points; further studies are needed
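The three bottleneck points lend themselves to a back-of-the-envelope estimate. The sketch below is an illustration only: the formulas are simple multiplications, and every input figure (1000 nodes, 50 metrics per node, a 60-second sampling interval, 100 bytes per sample, 90 days of retention, a one-hour transfer window) is an assumed example value, not a number from the thesis.

```python
SECONDS_PER_DAY = 86_400


def daily_volume_bytes(nodes, metrics_per_node, interval_s, bytes_per_sample):
    """Amount of monitoring data produced per day across all managed nodes."""
    samples_per_day = SECONDS_PER_DAY // interval_s
    return nodes * metrics_per_node * samples_per_day * bytes_per_sample


def disk_need_bytes(daily_bytes, retention_days):
    """Disk space needed to keep the detailed data for the retention period."""
    return daily_bytes * retention_days


def average_rate_bits_per_s(daily_bytes, transfer_window_s):
    """Average network rate if one day's data is moved within the given window."""
    return daily_bytes * 8 / transfer_window_s


# Assumed example inputs (hypothetical, for illustration only).
daily = daily_volume_bytes(nodes=1000, metrics_per_node=50,
                           interval_s=60, bytes_per_sample=100)
print(daily)                                 # 7200000000 bytes, about 7.2 GB per day
print(disk_need_bytes(daily, 90))            # 648000000000 bytes, about 648 GB
print(average_rate_bits_per_s(daily, 3600))  # 16000000.0 bit/s, about 16 Mbit/s
```

Under these assumed inputs, lengthening the transfer window lowers the average rate proportionally, and shortening the retention period lowers the disk need in the same way.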
Scalability and Resource Usage Calculations
  Focus / Points of observation
  Managed node
    – Consider carefully how long detailed historical information needs to be kept in the local measurement data source.
  Network
    – The Measurement Data Warehouse should be located at a site where the network transmission capacity is at least 100 Mbit/s Ethernet.
    – Spread data transfers from different managed nodes over a longer time period.
  Centralized data warehouse
    – The amount of data transferred to the data warehouse has to be chosen carefully.
    – Use pre-summarized data when possible.
    – Consider carefully how long detailed data needs to be kept in the warehouse.
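The advice to use pre-summarized data can be illustrated with an hourly roll-up performed before transfer to the warehouse. This is a minimal sketch, assuming samples arrive as (timestamp in seconds, value) pairs; the function name, the one-hour bucket size, and the choice of min, max, mean and count as summary fields are assumptions made for the example.

```python
from collections import defaultdict


def summarize_hourly(samples):
    """Roll (timestamp_s, value) samples up into one summary row per hour."""
    buckets = defaultdict(list)
    for timestamp_s, value in samples:
        hour_start = timestamp_s - timestamp_s % 3600   # start of the hour bucket
        buckets[hour_start].append(value)
    return [
        {
            "hour_start": hour_start,
            "min": min(values),
            "max": max(values),
            "mean": sum(values) / len(values),
            "count": len(values),
        }
        for hour_start, values in sorted(buckets.items())
    ]


# Three detailed samples become two summary rows.
for row in summarize_hourly([(0, 1.0), (1800, 3.0), (3600, 5.0)]):
    print(row)
```

With one sample per minute, a roll-up like this would send one row per hour instead of sixty, at the cost of losing the detailed values once the local copy expires.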
Management Concepts
  ITIL (IT Infrastructure Library) chosen as the conceptual base
  Three example concepts
    – Service level management
    – Capacity management
    – Incident management
  A set of example metrics for each concept
  Motivation: to show what these concepts can gain from a properly implemented monitoring system
Management Concepts
  Management concept / Gains from a properly implemented monitoring system
  Service level management
    – When SLAs are initially agreed, historical performance data and trends are needed.
    – An early warning system that tells when SLA goals are about to be breached or have already been breached.
    – Historical performance data for SLA metrics.
  Capacity management
    – Historical performance data for the systems' key parameters.
  Incident management
    – Real-time status monitoring of systems can provide a proactive way to deal with incidents.
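The early warning idea for service level management can be sketched as a threshold check on recent measurements. The example below is an assumption-laden illustration: it treats the SLA goal as an upper limit (for instance a response time), compares the mean of a recent window against it, and uses a hypothetical warning margin of 90 % of the limit. None of these choices come from the thesis.

```python
def sla_status(recent_values, limit, warn_fraction=0.9):
    """Classify a window of recent measurements against an upper SLA limit.

    Returns "breached" when the window mean exceeds the limit, "warning" when
    it is within the assumed warning margin, "ok" otherwise, and "unknown"
    when there is no data.
    """
    if not recent_values:
        return "unknown"
    mean = sum(recent_values) / len(recent_values)
    if mean > limit:
        return "breached"
    if mean >= warn_fraction * limit:
        return "warning"
    return "ok"


# Hypothetical response times in milliseconds against a 200 ms SLA limit.
print(sla_status([100, 120], limit=200))   # ok
print(sla_status([180, 190], limit=200))   # warning
print(sla_status([210, 230], limit=200))   # breached
```

A production check would more likely use percentiles or trend extrapolation over the historical data, but the same three-state outcome would apply.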
Conclusions
  The thesis proposes a three-tier architecture for an enterprise-level monitoring system
  The measurement data warehouse solution separates
    – real-time updates made by management agents
    – complex data analysis performed by management applications
  No obvious bottleneck point found
  Management concepts gain greatly from a properly implemented monitoring system
Further Studies
  Because the subject is broad, there was little room for details
  Case studies and working solutions
    – fine-tuning of the monitoring system
    – a data warehouse solution concentrating on monitoring information
    – management applications using monitoring information