International Workshop on Large Scale Computing, VECC, Kolkata, Feb 8-10, 2006
LCG Software Activities in India
Rajesh K., Computer Division, BARC

DAE-CERN Collaboration on Grid Computing
Agreement for collaboration in software development for LCG
5 year period (inclusive)
50 FTE years
Participating DAE institutes: BARC, CAT, VECC, TIFR

Current Projects
3 projects currently underway in the area of Grid Monitoring and Fabric Management
– GRIDVIEW: A visualization tool for LCG
– ELFMS: Extremely Large Fabric Management System
– CC Tracker: Computer Centre Tracker

GRIDVIEW – Visualization Tool for LCG
Visualization system for viewing monitoring information from the LCG
Dashboard showing different status views for different kinds of information
– Site-wise
– VO-wise
– Etc.
Intended for use in GOCs and ROCs but not restricted to that

Gridview
Collects monitoring information from different monitoring tools from grid sites using R-GMA as transport
– GridFTP monitor
– SFT
– RB Logs etc.
Archival of monitoring information in a central Oracle database at CERN
Analysis of this data to generate summaries
Visualization of summary data through Web interface and GUI

Gridview: Architecture

Gridview: Current
GridFTP transfer monitoring
– In production use
– Display of network throughput and total data transferred
    Different host/destination
    VO-wise, Site-wise, Host-wise
    Current, Hourly, Daily, Monthly etc.
– Used during SC3 throughput tests
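
The hourly, VO-wise and site-wise throughput views listed above can be illustrated with a small aggregation sketch. This is not Gridview code: the record layout (`vo`, `site`, `end_time`, `bytes`), the function name and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def hourly_throughput(transfers):
    """Summarise transfer records into per-(VO, site, hour) totals.

    Each record is a dict with hypothetical keys 'vo', 'site',
    'end_time' (datetime) and 'bytes'. Returns a dict mapping
    (vo, site, hour_start) -> (total_bytes, average_MB_per_s).
    """
    totals = defaultdict(int)
    for t in transfers:
        # Truncate the completion time to the start of its hour.
        hour = t["end_time"].replace(minute=0, second=0, microsecond=0)
        totals[(t["vo"], t["site"], hour)] += t["bytes"]
    # Average over the full hour (3600 s), in MB/s with 1 MB = 10**6 bytes.
    return {k: (b, b / 3600 / 1e6) for k, b in totals.items()}


# Made-up sample records, for illustration only.
sample = [
    {"vo": "vo-a", "site": "site-a", "end_time": datetime(2006, 2, 8, 10, 5), "bytes": 1_800_000_000},
    {"vo": "vo-a", "site": "site-a", "end_time": datetime(2006, 2, 8, 10, 40), "bytes": 1_800_000_000},
    {"vo": "vo-b", "site": "site-b", "end_time": datetime(2006, 2, 8, 11, 15), "bytes": 720_000_000},
]
summary = hourly_throughput(sample)
```

Daily or monthly views would follow the same pattern with a coarser truncation of `end_time`.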

Gridview: In development
Job Status Monitoring
– Total number of jobs at grid sites in different states
– VO-wise, RB-wise distribution
– Site-wise job failure rate, utilization etc.
Grid Dashboard
– Pictorial representation of site status info on a world map
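
As a rough illustration of the job status summaries described above, the sketch below counts jobs per site and state and derives a site-wise failure rate. The state labels (`done`, `failed`, `running`) and the definition of failure rate as failed / (done + failed) are assumptions, not taken from Gridview.

```python
from collections import Counter


def site_job_summary(jobs):
    """Count jobs per site and state and compute a failure rate.

    jobs: iterable of (site, state) pairs. The state labels are
    hypothetical; failure rate is failed / (done + failed), or None
    when no job at the site has finished yet.
    """
    counts = {}
    for site, state in jobs:
        counts.setdefault(site, Counter())[state] += 1
    summary = {}
    for site, c in counts.items():
        finished = c["done"] + c["failed"]
        rate = c["failed"] / finished if finished else None
        summary[site] = {"states": dict(c), "failure_rate": rate}
    return summary


# Made-up sample data, for illustration only.
jobs = [
    ("site-a", "done"), ("site-a", "failed"), ("site-a", "running"),
    ("site-a", "done"), ("site-b", "running"),
]
job_summary = site_job_summary(jobs)
```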

ELFMS: Extremely Large Fabric Management System
Participation in CERN Project on ELFMS
ELFMS is used to manage and monitor thousands of nodes in the CERN computer centre and other LCG sites
Contribution to development and support for ELFMS modules LEMON and QUATTOR

LEMON: LHC Era MONitoring
Lemon is a system designed to monitor performance metrics, exceptions & status information of extremely large clusters
At CERN it monitors ~2000 nodes, ~70 clusters with ~150 metrics/host producing ~1GB of data.
Estimated to monitor up to nodes
A variety of web based views of monitored data for
– Sysadmins, managers and users
Highly modular architecture allows the integration of user developed sensors for monitoring site-specific metrics.

Contribution to LEMON project
TCP-based communication between agent and server instead of the current UDP-based one.
SSL encryption
A lightweight correlation engine to generate exception metrics and launch fault-tolerance actuators in response to an undesired state.
This sensor supports multiple metric correlation and mathematical operations between correlated metrics
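
The idea of a correlation engine that combines several metrics arithmetically, raises an exception when the result crosses a threshold, and then launches an actuator can be sketched as follows. This is a toy model, not the Lemon sensor: the rule format, the metric names and the actuator callback are all hypothetical.

```python
import operator

# Arithmetic and comparison operators a rule may use.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}
CMP = {">": operator.gt, "<": operator.lt, ">=": operator.ge, "<=": operator.le}


def evaluate_rule(rule, metrics):
    """Return True when the rule signals an undesired state.

    rule = (left_metric, op, right_metric, comparison, threshold);
    this tuple layout is an assumption made for the sketch.
    """
    left, op, right, cmp_, threshold = rule
    value = OPS[op](metrics[left], metrics[right])
    return CMP[cmp_](value, threshold)


def check_host(rules, metrics, actuators):
    """Evaluate every rule and call the actuator registered for each raised exception."""
    raised = []
    for name, rule in rules.items():
        if evaluate_rule(rule, metrics):
            raised.append(name)
            if name in actuators:
                actuators[name](metrics)
    return raised


# Hypothetical rules: swap usage ratio above 90 %, load per CPU above 2.
rules = {
    "swap_pressure": ("swap_used", "/", "swap_total", ">", 0.9),
    "overload": ("load_avg", "/", "n_cpus", ">", 2.0),
}
metrics = {"swap_used": 950, "swap_total": 1000, "load_avg": 3.0, "n_cpus": 4}
actions = []
actuators = {"swap_pressure": lambda m: actions.append("restart-service")}
raised = check_host(rules, metrics, actuators)
```

Keeping rules as data rather than code is one way to let site administrators add correlations without touching the engine itself.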

Quattor
Quattor is a tool suite providing automated installation, configuration and management of clusters and farms
Highly suitable to install, configure and manage Grid computing clusters correctly and automatically
At CERN, currently used to automatically manage more than 2000 nodes with heterogeneous hardware and software applications
Centrally configurable & reproducible installations, run time management for functional & security updates to maximize availability

Contribution to Quattor project
Testing all new Quattor releases outside CERN.
Presently working on the SWRep (Software Repository) part of Quattor.
Changing the SWRep framework from SSH to SOAP with SSL.

CC Tracker
A tool to visualize the CERN Computer Centre
Simplifies management of the CERN Computer Centre for LHC scale operations
Allows easy invocation of service management and operational interventions across sets of selected nodes

CC Tracker Functionality
Visualize Computer Centre
– Physical View: Display of all CC rooms with Racks, Disk cabinets, Tape silos, PDUs
– Logical View: Display Domains, Clusters, Sub clusters & hosts in a hierarchical way
Manage Hardware:
– Add, Move, Rename & Retire operations for a set of machines
– Change cluster, update kernel, update OS, shutdown, reboot & set desired state
Manage Infrastructure:
– Addition/deletion of Rack, Disk Cabinet, Tape Silo, PDU and Tape drive
– Updating properties (name, location)
– Check power consumption by rack, zone, room & cluster
– Check cost by rack, zone, room & cluster
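
The "check power consumption by rack, zone, room & cluster" item amounts to grouping per-machine figures by a chosen attribute. A minimal sketch, assuming each machine is described by a dict with hypothetical keys `rack`, `room`, `cluster` and `watts` (none of these names come from CC Tracker):

```python
from collections import defaultdict


def power_by(machines, key):
    """Sum the 'watts' of all machines, grouped by the given attribute."""
    totals = defaultdict(float)
    for m in machines:
        totals[m[key]] += m["watts"]
    return dict(totals)


# Made-up inventory, for illustration only.
machines = [
    {"host": "node001", "rack": "R1", "room": "room-1", "cluster": "batch", "watts": 250.0},
    {"host": "node002", "rack": "R1", "room": "room-1", "cluster": "batch", "watts": 250.0},
    {"host": "node003", "rack": "R2", "room": "room-1", "cluster": "disk", "watts": 400.0},
]
per_rack = power_by(machines, "rack")
per_cluster = power_by(machines, "cluster")
per_room = power_by(machines, "room")
```

The same function would serve a cost report if each record carried a cost field instead of `watts`.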

CCTracker Logical & Physical view

Completed Projects
SHIVA – Problem Tracking System
QoS: Quality of Service Prediction for worker nodes
RDBMS backend for POOL
LCG-AliEn Storage Element Interface
Test Suite for Perl harnessing with AliEn

SHIVA Problem Tracking System
General purpose bug tracking system for keeping track of bugs, features and other issues in software development projects
SHIVA accepts problem reports from users, routes them to troubleshooters and maintains archives of problems and solutions
Generate reports and statistics
Implement helpdesk systems for services
Used as a problem tracker for SC3 related work

SHIVA: Screenshot

QoS: Predicting Quality of Service of Worker nodes
Deriving a composite metric for the quality of service offered by a worker node
QoS is computed by a correlation engine which takes simple metrics such as load average, free memory and so on as input
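
One possible shape for such a composite metric is a weighted sum of normalised inputs. The formula below is an assumption chosen for illustration, not the one used in the project: load is normalised by CPU count, free memory by total memory, and the two weights are arbitrary defaults.

```python
def qos_score(load_avg, free_mem_mb, n_cpus, total_mem_mb, w_load=0.5, w_mem=0.5):
    """Combine load average and free memory into a score in [0, 1].

    Hypothetical formula: 1.0 means an idle node with all memory free,
    0.0 means a node loaded at or beyond one process per CPU with no
    free memory.
    """
    load_term = max(0.0, 1.0 - load_avg / n_cpus)
    mem_term = min(1.0, free_mem_mb / total_mem_mb)
    return w_load * load_term + w_mem * mem_term


# A 4-CPU node with load 1.0 and half of its 4096 MB memory free.
score = qos_score(load_avg=1.0, free_mem_mb=2048, n_cpus=4, total_mem_mb=4096)
```

A scheduler could rank worker nodes by this score; further inputs would enter as additional weighted terms.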

RelationalStorageSvc for POOL
Designed and developed the “Relational StorageSvc” module prototype for POOL.
Designed the prototype as a plugin in the POOL framework.
Provided a solution for remote database connectivity.
Implemented interfaces of the POOL Storage Manager for the ORACLE backend.
Demonstrated the navigation, storage and retrieval of data using ORACLE and the ODBC connectivity option.
Tested the cross-technology referencing concept of the POOL Storage Manager using ROOT and ORACLE, for primitive and referenced data types.
Courtesy: Anil Rawat, Centre for Advanced Technology, Indore

RSSvc’s place in POOL Architecture
[Diagram: RDBMS Storage Svc within the POOL architecture]
Courtesy: Anil Rawat, Centre for Advanced Technology, Indore

LCG-AliEn Storage Element Interface
Test bed installation for Grid environment:
– One central server and two sites.
Installation of Certification Authority server
Installation of GridFTP library under AliEn
– The GridFTP daemon in.ftpd has been used as server and globus-url-copy has been used as client
Development of AliEn-SE interface via GridFTP
– These newly developed modules, along with the necessary GridFTP libraries and the changes made in existing AliEn code, have been committed to the CVS server at CERN.
Courtesy: Tapas Samanta, VECC, Kolkata

Quality Assurance and Test Environment for AliEn-ARDA Prototype
Exploration and design of test scripts using Perl.
Implementation of test scripts for each individual Perl sub-module of AliEn.
Individual Perl sub-modules of the AliEn code were tested for proper functionality.
The suite generates a detailed report of the individual tests and maintains a log.
Validation of test scripts and procedures.
Testing modules with the Perl harnessing environment.
The complete suite was tested at CERN under the Perl harnessing environment, testing AliEn online and generating an online consolidated report of the tests.
Inline documentation to the extent possible.
Courtesy: Tapas Samanta, VECC, Kolkata

Acknowledgements
Gridview and SHIVA team: Phool Chand, Digamber Sonvane, Kislay Bhatt
ELFMS, CC Tracker and QoS team: R.S.Mundada, R.Sharma, D.Sarode, C.Murthy
Anil Rawat, CAT, Indore
T.Samanta, VECC, Kolkata
Colleagues from IT Dept., CERN