1 NORDUnet conference, Grid Monitoring: Paryavekshanam, 9th April 2008
PARYAVEKSHANAM: Status Monitoring Tool for the Indian National Grid, GARUDA
Karuna
Co-authors: Deepika H.V., Mangala N., Prahlada Rao BB, MohanRam N.
System Software Development Group, Center for Development of Advanced Computing (C-DAC), Bangalore, India

2 Presentation Plan
9th – 11th April, NORDUnet conference. GARUDA Grid Monitoring: PARYAVEKSHANAM
- GARUDA Overview
- GARUDA Architecture
- Monitoring Requirements
- Paryavekshanam Objectives
- Paryavekshanam Architecture
- Paryavekshanam Features
- Alert and Notification system
- Conclusion

3 Indian National Grid: GARUDA
- GARUDA was initiated by C-DAC and is funded by the Dept. of Information Technology, Govt. of India.
- GARUDA provides an amalgam of advanced capabilities to enable the increasingly interdisciplinary scientific environments required to solve complex problems.
- GARUDA connects 45 national research and academic institutions across 17 cities/locations in India.
- GARUDA is used by application communities such as Weather/Climate Modeling, Disaster Management, and Bioinformatics.

4 GARUDA Grid: Key Features
- Geographically distributed resources across 17 cities and 45 research and academic institutions
- Resources are dynamic and heterogeneous in nature (Linux, Solaris, AIX)
- Resources are under various administrative domains
- Network backbone of 2.43 GB, with 10/100 Mbps point-to-point BW links
- GARUDA middleware: Globus 2.x
- Multi-institutional Virtual Organization

5 GARUDA Grid Architecture
[Diagram. Labels: submit node, gridfs, GARUDA HeadNode (Bangalore), cluster head nodes, compute nodes. Sites shown: IGIB (Linux), Chennai (Linux), C-DAC Bangalore (AIX), Pune (Linux), RRI Bangalore (Linux), C-DAC Hyderabad (Linux).]

6 GARUDA Components
- Management & Monitoring: Paryavekshanam
- Resources: Compute, Data Storage, Scientific Instruments, Softwares
- Resource Mgmt & Scheduling: Moab from Cluster Resources, Load Leveler, Torque, Globus 2.x
- Application (PoC): Disaster Management, Bioinformatics, Climate modeling
- Access Methods: Access Portal, Problem Solving Environments
- Data Management: Storage Resource Broker
- Development Environment: DIViA for Grid, GridIDE

7 GARUDA Network Fabric Features
- Ethernet-based high-BW capacity of Layer 2/3 MPLS VPN
- Scalable over entire geographic area
- High levels of reliability
- Fault tolerance and redundancy
- High security
- Effective Network Management

8 GARUDA Resources
- C-DAC centers are contributing computing resources at Bangalore, Pune, Chennai, and Hyderabad
- HPC systems from partner sites
- Total processors: > 600
- Aggregate compute power: 3.5 TFlops
- Satellite terminals from SAC Ahmedabad
- Grid Labs at Bangalore, Pune, Hyderabad

9 GARUDA Resources (contd.)

10 GARUDA Partners
- Institute of Plasma Research, Ahmedabad
- Physical Research Laboratory, Ahmedabad
- Space Applications Centre, Ahmedabad
- Harish Chandra Research Institute, Allahabad
- Motilal Nehru National Institute of Technology, Allahabad
- Raman Research Institute, Bangalore
- National Center for Biological Sciences
- Indian Institute of Astrophysics, Bangalore
- Indian Institute of Science, Bangalore
- Institute of Microbial Technology, Chandigarh
- Punjab Engineering College, Chandigarh
- Madras Institute of Technology, Chennai
- Indian Institute of Technology, Chennai
- Institute of Mathematical Sciences, Chennai
- ERNET, Delhi
- Indian Institute of Technology, Delhi
- Jawaharlal Nehru University, Delhi
- Institute for Genomics and Integrative Biology, Delhi
- Indian Institute of Technology, Guwahati
- Guwahati University, Guwahati

11 GARUDA Partners (contd.)
- University of Hyderabad, Hyderabad
- Centre for DNA Fingerprinting and Diagnostics, Hyderabad
- Jawaharlal Nehru Technological University, Hyderabad
- Indian Institute of Technology, Kanpur
- Indian Institute of Technology, Kharagpur
- Saha Institute of Nuclear Physics, Kolkata
- Central Drug Research Institute, Lucknow
- Sanjay Gandhi Post Graduate Institute of Medical Sciences, Lucknow
- Bhabha Atomic Research Centre, Mumbai
- Indian Institute of Technology, Mumbai
- Tata Institute of Fundamental Research, Mumbai
- IUCAA, Pune
- National Centre for Radio Astrophysics, Pune
- National Chemical Laboratory, Pune
- Pune University, Pune
- Indian Institute of Technology, Roorkee
- Regional Cancer Centre, Thiruvananthapuram
- Vikram Sarabhai Space Centre, Thiruvananthapuram
- Institute of Technology, Banaras Hindu University, Varanasi

12 GARUDA Grid Monitoring: Purpose
- Detect, record, and report faults and service degradations
- Ensure GARUDA operates optimally
- Check the status, availability, and usage of grid resources
- Provide a monitoring data repository that developers and administrators can use for troubleshooting, scheduling, performance tuning, and analysis

13 Monitoring Requirements: GARUDA
- Needed a simple and easy-to-use tool
- Able to handle different users' perspectives
- Information should be readily available
- Should have more graphical views
- Should produce relevant, accurate, and timely data
- Diagnose the problems of the GARUDA environment

14 Paryavekshanam: Monitoring Tool
- GARUDA is monitored by PARYAVEKSHANAM
- PARYAVEKSHANAM means "supervision" in Sanskrit
- PARYAVEKSHANAM is a web-based, user-friendly grid monitoring tool that monitors the GARUDA grid's health to enhance reliability, usability, and manageability
- PARYAVEKSHANAM is scalable and can be deployed on platforms such as AIX, Linux, and Solaris
- It assists users in resource allocation/selection through various GARUDA tools like G-IDE

15 Components Monitored by Paryavekshanam
- Computing nodes
- Network
- Grid middleware
- Submitted jobs
- Software
- Storage and Storage Resource Broker
- Scientific Instruments

16 Paryavekshanam Architecture
- Client-server architecture with a pull model and a centralized server
- Resource: everything connected to the grid
- Headnode: the contact node of a cluster
- Four components:
  – Information Generator
  – Information Receiver
  – Information Repository
  – Paryavekshanam Visualizer

17 Paryavekshanam Architecture

18 Paryavekshanam Architecture (contd.)
- Information Generator
  – Daemon that resides on the cluster headnodes
  – Collects the cluster details and creates the data collection
  – The data collection is processed using the MDS schema and populated into Globus MDS
- Information Receiver
  – Daemon that resides on the monitoring server
  – Requests the Information Generator to produce the data collection and fetches it from Globus MDS
- Information Repository
  – The data collection obtained from Globus MDS is processed and stored in the Information Repository
  – It resides on the monitoring server
  – It has a mirror repository to provide fault tolerance
- Paryavekshanam Visualizer
  – User-friendly graphical user interface
  – Retrieves data from the Information Repository and displays it through well-structured graphs and tables
  – Helps in diagnosing the problem areas
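The four components and the pull model described on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and field names are hypothetical, a plain dictionary stands in for Globus MDS (no real Globus API is called), and the cluster names and node counts are invented sample data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InformationGenerator:
    """Hypothetical stand-in for the daemon on a cluster headnode."""
    cluster: str
    nodes_up: int
    nodes_total: int

    def publish(self, directory: Dict[str, dict]) -> None:
        # Build the data collection and place it in the shared directory,
        # which plays the role of Globus MDS in this sketch.
        directory[self.cluster] = {
            "cluster": self.cluster,
            "nodes_up": self.nodes_up,
            "nodes_total": self.nodes_total,
        }


@dataclass
class InformationRepository:
    """Primary store plus a mirror copy, kept for fault tolerance."""
    primary: List[dict] = field(default_factory=list)
    mirror: List[dict] = field(default_factory=list)

    def store(self, record: dict) -> None:
        self.primary.append(dict(record))
        self.mirror.append(dict(record))


class InformationReceiver:
    """Hypothetical stand-in for the daemon on the monitoring server."""

    def __init__(self, generators, repository):
        self.generators = generators
        self.repository = repository
        self.directory: Dict[str, dict] = {}

    def poll(self) -> None:
        # Pull model: the central server starts every collection cycle by
        # asking each generator to publish, then fetches the result.
        for gen in self.generators:
            gen.publish(self.directory)
            self.repository.store(self.directory[gen.cluster])


def render_table(repository: InformationRepository) -> str:
    """Text-only stand-in for the Visualizer."""
    return "\n".join(
        f"{r['cluster']}: {r['nodes_up']}/{r['nodes_total']} nodes up"
        for r in repository.primary
    )


# Invented sample data for two clusters.
generators = [
    InformationGenerator("bangalore", 30, 32),
    InformationGenerator("pune", 16, 16),
]
repo = InformationRepository()
InformationReceiver(generators, repo).poll()
print(render_table(repo))
```

The point of the sketch is the direction of control: the headnode daemons never push on their own; every cycle starts at the central monitoring server.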

19 Paryavekshanam Features
- Hierarchical drill-down of information
- Bird's-eye view of grid health through a radar graph
- Dashboard providing the top-level view
- Status bar for quick, action-oriented insights
- Alert generation through e-mails
- Easy interface for new site addition
- Multiple views: Grid, Nodes, GOC, and Network views
- Visualization of data in tabular and graphical format
- Data Gallery for analysis of historical data
- Search facility for resources, software stack, and jobs
- Separate resolution for GOC monitoring

20 Dashboard of Paryavekshanam
[Screenshot. Callouts: GARUDA-connected cities on the India map; status bar; bird's-eye view of grid health through the radar graph; grid strength.]

21 Dashboard of Paryavekshanam (contd.)
- Radar Graph
  – Compares the performance of different entities on axes starting from the same point
  – Allows easy inference of the utilization of quantitative parameters
  – Uniform utilization of various parameters can be inferred from the radar graphs
  – Provides a glimpse of the deviation from the ideal scenario
- Grid Strength
  – Defines the health of the grid and is mathematically derived from the radar graph's parameters
  – Shown as a percentage on the dashboard
  – Colored bullets represent different values of grid strength
- Globus Strength
  – Monitors Globus strength based on an empirical formula
- Status Bar
  – Gives the instantaneous up/down status, which can be drilled down further

22 Alert & Notification system: AlNotis
- Paryavekshanam captures errors generated in the grid, such as failures of links, clusters, nodes, grid middleware, and jobs, through AlNotis
- Provides more visibility into the health of the system
- Any failure or breakdown of resources needs to be captured and notified
- Necessary for corrective actions
- Whenever an error occurs, it generates error e-mails
- Sends warning e-mails when utilization crosses the threshold level
- Well-defined escalation procedure
  – Errors left unattended for 48 hrs are sent to the grid admins
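The escalation rule on this slide (errors unattended after 48 hrs go to the grid admins) could be expressed along these lines. This is a minimal sketch, not the AlNotis implementation: the ticket fields (`id`, `raised_at`, `attended`) and the sample tickets are hypothetical.

```python
from datetime import datetime, timedelta

# The 48-hour window comes from the slide; everything else is assumed.
ESCALATION_AFTER = timedelta(hours=48)


def tickets_to_escalate(tickets, now):
    """Return the ids of unattended tickets that are at least 48 hours old."""
    return [
        t["id"]
        for t in tickets
        if not t["attended"] and now - t["raised_at"] >= ESCALATION_AFTER
    ]


now = datetime(2008, 4, 9, 12, 0)
tickets = [
    # 49 hours old and unattended: escalated.
    {"id": "T1", "raised_at": datetime(2008, 4, 7, 11, 0), "attended": False},
    # 24 hours old: still inside the window.
    {"id": "T2", "raised_at": datetime(2008, 4, 8, 12, 0), "attended": False},
    # 72 hours old but already attended: not escalated.
    {"id": "T3", "raised_at": datetime(2008, 4, 6, 12, 0), "attended": True},
]
print(tickets_to_escalate(tickets, now))  # ['T1']
```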

23 Alert & Notification system (contd.)

Error Message Description
Description          Error Code   Error Condition
Network link down    eLNK         pkt loss 100%
Cluster down         eCLS         HeadNode status down
Node down            eNOD         Node status down
Globus component     eGLB         Component fail
Jobs not running     eJOB         total jobs > 0, RJ = 0

Warning Message Description
Description             Warning Code   Warning Condition
Utilization of CPU      wCPU           Threshold reached (cpu load >= 1)
Utilization of memory   wMEM           Threshold reached (mem util >= 80%)
Bandwidth utilization   wB/W           Threshold reached (b/w util >= 90%)
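Read as rules, the two tables map a monitoring snapshot to a list of codes. The sketch below uses the codes and thresholds from the tables; the snapshot field names are hypothetical, and "RJ" is assumed to mean running jobs.

```python
def evaluate(snapshot):
    """Return the error and warning codes raised by one monitoring snapshot."""
    codes = []

    # Error conditions, in the order of the error table.
    if snapshot["pkt_loss_pct"] >= 100:
        codes.append("eLNK")
    if not snapshot["headnode_up"]:
        codes.append("eCLS")
    if snapshot["nodes_down"] > 0:
        codes.append("eNOD")
    if snapshot["globus_components_failed"] > 0:
        codes.append("eGLB")
    # "RJ = 0" is assumed to mean that no jobs are running.
    if snapshot["total_jobs"] > 0 and snapshot["running_jobs"] == 0:
        codes.append("eJOB")

    # Warning thresholds, in the order of the warning table.
    if snapshot["cpu_load"] >= 1:
        codes.append("wCPU")
    if snapshot["mem_util_pct"] >= 80:
        codes.append("wMEM")
    if snapshot["bw_util_pct"] >= 90:
        codes.append("wB/W")
    return codes


# Invented sample: jobs are queued but none run, and memory is at 85%.
sample = {
    "pkt_loss_pct": 0,
    "headnode_up": True,
    "nodes_down": 0,
    "globus_components_failed": 0,
    "total_jobs": 5,
    "running_jobs": 0,
    "cpu_load": 0.4,
    "mem_util_pct": 85,
    "bw_util_pct": 40,
}
print(evaluate(sample))  # ['eJOB', 'wMEM']
```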

24 Alert & Notification system (contd.)
- AlNotis tabulation showing the error ID, the date and time the error was generated, the affected resources, and the time taken to close the ticket
- Alert error messages generated during the last 6 months

25 GOC Desk: Paryavekshanam
- Grid Operation Center (GOC) help desk built for GARUDA monitoring, with a state-of-the-art wall display
- GOC is responsible for monitoring the grid infrastructure as a whole
- GOC operates in four regional areas, reporting centrally to the GOC at Bangalore
- Apart from monitoring through Paryavekshanam, it coordinates its activities through video conferencing

26 GOC Desk Page
- The GOC Desk page is mainly used for daily monitoring
- Provides the overall performance of parameters such as BW utilization over 24 hrs
- Each graph is hyperlinked to the details of that parameter for the respective grid center
- An additional table allows accurate values to be read off the graphs

27 GOC Desk Page (contd.)

28 Grid Overview Page: Paryavekshanam
- Summarizes the performance of the entire grid for users
- Provides information on all parameters for all centers in tabular format
- Can be drilled down to fetch a center's resource details as a node-level summary
- Monitors the middleware components, providing a detailed status summary for error resolution
- Lists all the software available on the clusters
- Helps in knowing which Globus components are up

29 Grid Overview Page: Paryavekshanam

30 Nodes view & Globus component status
[Screenshot. Callout: GSIFTP service is not available.]

31 Software packages installed at headnodes

32 Network Info Page: Paryavekshanam
- Routers and switches are monitored
- Displays available bandwidth, used bandwidth, packet loss, RTT, and link status
- The report generation facility helps in maintaining the SLA for RTT, packet loss, and circuit uptime on a monthly basis
- Monitors the operation of the network on a 24x7x365 basis
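A monthly SLA report over RTT, packet loss, and circuit uptime might be computed from periodic link samples roughly as below. The slide does not say how Paryavekshanam derives these figures, so the averaging scheme, the sample fields (`link_up`, `rtt_ms`, `pkt_loss_pct`), and the sample values are all assumptions.

```python
from statistics import mean


def monthly_sla_report(samples):
    """Summarize one month's link samples into the three SLA figures."""
    if not samples:
        raise ValueError("at least one sample is required")
    up = [s for s in samples if s["link_up"]]
    return {
        # Share of samples in which the circuit was up.
        "uptime_pct": 100.0 * len(up) / len(samples),
        # RTT is only defined while the link is up.
        "avg_rtt_ms": mean([s["rtt_ms"] for s in up]) if up else None,
        "avg_pkt_loss_pct": mean([s["pkt_loss_pct"] for s in samples]),
    }


# Four invented samples; the last one is taken while the link is down.
samples = [
    {"link_up": True, "rtt_ms": 10, "pkt_loss_pct": 0},
    {"link_up": True, "rtt_ms": 20, "pkt_loss_pct": 0},
    {"link_up": True, "rtt_ms": 30, "pkt_loss_pct": 0},
    {"link_up": False, "rtt_ms": None, "pkt_loss_pct": 100},
]
report = monthly_sla_report(samples)
print(report)
```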

33 Network Info Page: Paryavekshanam

34 SRB Server status check
- Status of the Storage Resource Broker is checked
- Space availability of storage servers
- Report generation in Word and Excel format

35 Data Gallery Page: Paryavekshanam
- Archives data for reviewing the past performance of the grid
- Previous data can be viewed in both tabular and graphical format
- Generates a report for the selected duration

36 Search Page: Paryavekshanam
- Resource and software search is provided for users
- Resources can be searched by OS, memory, CPU speed, etc.
- Software can be searched by category, such as debuggers and libraries

37 Job search: Paryavekshanam
- Paryavekshanam tracks the progress of submitted jobs
- Shows the current status based on job ID
- Job reports are available by user, status, job ID, duration, and the cluster on which the job is running

38 GARUDA Resource usage
- Resources are extensively used
- More than 100 registered users
- >600 CPUs across 14 sites
- 65 TB of data transferred on the 2.43 GB backbone

39 Admin Page: Paryavekshanam
- New sites and resources are added to Paryavekshanam through a simple interface
- Managed by access control
- Modification and deletion of sites are supported

40 Conclusion
- Paryavekshanam has been successfully monitoring GARUDA for the last 2 years
- The dashboard has been a very useful feature, aggregating a lot of information
- The AlNotis system speeds up problem rectification
- Overall, Paryavekshanam improves the usability of GARUDA

41 Thank you

42 Globus Strength
- Each distinct value is indicative of the Globus status
- The maximum value is 29, the sum of the individual distinct weights listed here
- The four major pillars of Globus:
  1. Security – 10
  2. Job Submission – 8
  3. Data Management – 7
  4. Information Services – [value not preserved; the stated total of 29 implies 4]
- E.g.: Globus strength = 21
  Result: Security, data management, and information services are up; job submission is not possible

43 The value 22 shows that the Data Mgmt service is down
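The Globus Strength value can be decoded back into service states because every combination of weights gives a different sum. A minimal sketch follows, taking Security = 10, Job Submission = 8, and Data Management = 7 from the slides; the Information Services weight of 4 is an inference from the stated total of 29 (29 − 10 − 8 − 7), not a figure preserved in the transcript. The function names are hypothetical.

```python
from itertools import combinations

# Weights from the Globus Strength slide; the value 4 is inferred, not quoted.
WEIGHTS = {
    "Security": 10,
    "Job Submission": 8,
    "Data Management": 7,
    "Information Services": 4,
}


def services_up(strength):
    """Return the set of services whose weights add up to `strength`."""
    names = list(WEIGHTS)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            if sum(WEIGHTS[name] for name in subset) == strength:
                return set(subset)
    raise ValueError(f"{strength} is not a valid Globus strength")


def services_down(strength):
    """Return the services that are missing from `strength`."""
    return set(WEIGHTS) - services_up(strength)


print(services_down(21))  # {'Job Submission'}
print(services_down(22))  # {'Data Management'}
```

With these four weights the 16 possible up/down combinations produce 16 different sums, which is why a single number is enough to identify the failed service.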

44 GSIFTP service is not available