Open Science Grid The OSG Accounting System: GRATIA by Philippe Canal (FNAL) & Matteo Melani (SLAC) Mumbai, India CHEP2006.

CHEP2006 Philippe Canal (Fermilab) & Matteo Melani (SLAC)
What is Accounting? (in the Grid context)
Grid accounting is the process of maintaining a (consistent) Grid-wide view of VO members' resource utilization. [1]
[1] Accounting in Grid Environments, by Peter Gardfjäll, Department of Computing Science, Umeå University, Sweden

Why do we want an accounting system?
• Resource providers (SLAC, Fermilab…) want to perform cost-benefit analysis
• Resource providers want to improve planning
• Resource providers want better security
• Resource providers want to improve QoS (priorities, debugging…)
• Support a Grid “Economic Model”

What is the real problem (solution)?
• Nobody talked about “Grid economy”
• Do we really want an accounting system?
• Or maybe a monitoring system will do?
Let's look at accounting and monitoring.

Accounting vs. Monitoring
A monitoring system:
• Purpose: monitoring system health, debugging, system profiling
• Gathers state information about the system resources
• Collects system events
• Works like a DAQ system: as close as possible to the system, as unintrusive as possible
• Quasi real-time to real-time
An accounting system:
• Keeps track of resource usage
• Links a user's service requests with the resources consumed to satisfy those requests
• Has accounts, banks and a “currency”, and supports an economic model (policies)
• Works “after the fact”

For Example: Monitoring at SLAC
What do we monitor:
• Network
  ◦ Switch and router status, Internet Mbytes/sec in/out
• Computer clusters
  ◦ Batch systems, NFS and AFS servers, database servers
• Storage space
  ◦ Disk usage, HPSS
Some metrics we use:
• CPU utilization, memory
• Disk usage, disk I/O
• Various networking metrics (Mbytes in/out of switches, routers, servers…)
• Some primitive job submission results (LSF)
We use a lot of monitoring tools and infrastructure: Ganglia, Nagios, OpenView, SNMP tools, MonALISA…

For Example: Accounting at SLAC?
• The monitoring system cannot link resource usage to users/groups
• Maybe by looking into the logs and correlating the events… but that is a lot of work
• Accounting infrastructures and tools à la Ganglia or Nagios do not exist
• Basically we cannot (yet) fully link a user name with a precise set of computing resource usage metrics
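
To make the "correlating the events" idea concrete, here is a minimal sketch of joining job submissions to resource-usage events by job id and summing CPU time per user. The record layouts, field names and values are hypothetical and chosen only for illustration; real batch and monitoring logs look different and need far more parsing work.

```python
from collections import defaultdict

# Hypothetical, simplified log entries (not a real batch-system format).
submissions = [
    {"job_id": "101", "user": "alice"},
    {"job_id": "102", "user": "bob"},
    {"job_id": "103", "user": "alice"},
]
usage_events = [
    {"job_id": "101", "cpu_seconds": 120.0},
    {"job_id": "102", "cpu_seconds": 30.0},
    {"job_id": "103", "cpu_seconds": 45.5},
    {"job_id": "999", "cpu_seconds": 10.0},  # no matching submission
]


def cpu_seconds_per_user(submissions, usage_events):
    """Join usage events to users through the job id and sum CPU time."""
    owner = {s["job_id"]: s["user"] for s in submissions}
    totals = defaultdict(float)
    unmatched = []  # usage that cannot be attributed to any user
    for event in usage_events:
        user = owner.get(event["job_id"])
        if user is None:
            unmatched.append(event["job_id"])
            continue
        totals[user] += event["cpu_seconds"]
    return dict(totals), unmatched


totals, unmatched = cpu_seconds_per_user(submissions, usage_events)
```

The `unmatched` list illustrates why this approach is laborious: any usage that cannot be tied back to a submission is lost to accounting unless someone investigates it by hand.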

What I think we should track
• Job submission:
  ◦ Priority in the batch queue
  ◦ CPU time
  ◦ Wall clock time
  ◦ Memory usage
• Storage
  ◦ Disk usage
  ◦ Tape storage usage
  ◦ Storage class (to be defined)
• Network data transfer
  ◦ Network speed
  ◦ Quantity of data transferred
• Special software usage, Operator/Administrator services… maybe later
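
As an illustration only, the tracked quantities above could be grouped into a single record along these lines. All class names, field names and units are assumptions made for this sketch; they are not the record format defined by the project.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class JobUsage:
    queue_priority: int
    cpu_seconds: float
    wall_seconds: float
    memory_mb: float


@dataclass
class StorageUsage:
    disk_gb: float
    tape_gb: float
    storage_class: Optional[str] = None  # "to be defined" on the slide


@dataclass
class NetworkUsage:
    bytes_transferred: int
    transfer_seconds: float

    def average_bytes_per_second(self) -> float:
        """Average network speed, derived from quantity and duration."""
        if self.transfer_seconds <= 0:
            return 0.0
        return self.bytes_transferred / self.transfer_seconds


@dataclass
class UsageRecord:
    user: str
    job: JobUsage
    storage: StorageUsage
    network: NetworkUsage


record = UsageRecord(
    user="alice",
    job=JobUsage(queue_priority=5, cpu_seconds=90.0, wall_seconds=120.0, memory_mb=512.0),
    storage=StorageUsage(disk_gb=2.0, tape_gb=0.0),
    network=NetworkUsage(bytes_transferred=1_000_000, transfer_seconds=4.0),
)
as_dict = asdict(record)  # plain nested dict, convenient for serialization
```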

Goals
• Track service and resource usage per grid user, after the fact
• Focus on quality, integrity and security of the information
• Accounting information easily available to people (web interface) and to applications (Web Services)
• Build a system that is simple to manage (install, configure and upgrade) and to extend (well-defined APIs)
• Based on well-proven, standard (industrial-strength) technologies
• However, we do not cover (but keep in mind):
  ◦ User charging system
  ◦ Resource or service pricing
  ◦ Support for an economic model for resource allocation

System Properties
• Interoperability
  ◦ The Accounting System should leverage existing standards to maximize interoperability with other Grids and Accounting Services.
• Fault Tolerance
  ◦ Reduce and flag data loss.
  ◦ Resilient to communication failures over LAN and WAN.
  ◦ Resilient to the failure of one of its components.
• Security
  ◦ Guarantees integrity and non-repudiation of the accounting records at the site level.
  ◦ Uses secure communication channels (mutual authentication, message integrity, confidentiality) and access control lists.
• Scalability and Performance
  ◦ Not really an issue
• Other
  ◦ Leverage existing tools and infrastructures to solve related problems.

[Figure: Simple Domain Model]

Design Direction
• We are currently focused on getting the infrastructure right more than on the specific metrics used to measure resource usage
• Open: we provide APIs
• Distributed: Meters are distributed objects
• Based on open source standard technologies: Web Services, Java Platform, Tomcat, Axis, Hibernate
• Same idea as GUMS and JClarens: the service is an independent Tomcat application (JClarens for authentication)
• Ensure interoperability with OSG partners (LCG, TeraGrid…)

[Figure: Architecture Overview]

Meter
• A Meter is responsible for
  ◦ Gathering all the data about a Grid service's usage
  ◦ Gathering all the data about the resources used by that Grid service
  ◦ Assembling a Service Usage Record
• Logically there is one Meter entity per Grid Service
• Each Meter is composed of one or more Probes and one Assembler (plus some other components for management functions)
• A Grid Service uses resources distributed across the Resource Provider’s LAN, therefore the Meter is also distributed
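
The composition described above (one Meter per Grid Service, built from several Probes and one Assembler) can be sketched roughly as follows. This is an in-process toy under stated assumptions: probes are modelled as plain functions, the record is a dictionary, and readings with the same name are summed. The real system is distributed and built on Java and Web Services, so none of these names or shapes come from the project itself.

```python
from typing import Callable, Dict, List

# Assumption: a probe is a function returning named measurements.
Probe = Callable[[], Dict[str, float]]


class Assembler:
    """Merges probe readings into one service usage record (hypothetical format)."""

    def assemble(self, service: str, readings: List[Dict[str, float]]) -> Dict[str, object]:
        usage: Dict[str, float] = {}
        for reading in readings:
            for name, value in reading.items():
                # Readings with the same name from different probes are summed.
                usage[name] = usage.get(name, 0.0) + value
        return {"service": service, "usage": usage}


class Meter:
    """One Meter per Grid Service: several probes feeding one assembler."""

    def __init__(self, service: str, probes: List[Probe], assembler: Assembler):
        self.service = service
        self.probes = probes
        self.assembler = assembler

    def collect(self) -> Dict[str, object]:
        readings = [probe() for probe in self.probes]
        return self.assembler.assemble(self.service, readings)


# Illustrative probes with made-up values.
def batch_probe() -> Dict[str, float]:
    return {"cpu_seconds": 120.0, "wall_seconds": 150.0}


def storage_probe() -> Dict[str, float]:
    return {"disk_gb": 2.5}


def worker_probe() -> Dict[str, float]:
    return {"cpu_seconds": 30.0}


meter = Meter("compute-element", [batch_probe, storage_probe, worker_probe], Assembler())
record = meter.collect()
```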

[Figure: Meter Logical View]

Meter’s Probe and Assembler
• Probes use a secure channel (mutual authentication, data integrity) to send usage information to the Assemblers.
• Usage information is packaged in ProbeEvents that are sent to the Assemblers through a Web Service interface.
• Each ProbeEvent object has a standard header and a payload in XML format.
• Probes use “at-least-once” delivery semantics to send ProbeEvents to the Assemblers (communication is resilient to failure).
• Assemblers can choose synchronous or asynchronous processing of ProbeEvents.
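
The delivery guarantee described above is commonly called at-least-once delivery: the sender retries until it gets an acknowledgement, so the receiver may see the same event more than once and has to discard duplicates. The sketch below shows that pattern with an event that carries a small header and an XML payload. The class names, header fields, XML element names and the simulated lossy channel are all assumptions for illustration; the real system uses a secured Web Service interface.

```python
import uuid
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class ProbeEvent:
    probe_id: str       # header field (hypothetical)
    payload_xml: str    # payload in XML format
    event_id: str = field(default_factory=lambda: uuid.uuid4().hex)  # header field


def make_payload(cpu_seconds: float) -> str:
    """Build a tiny XML payload; the element names are made up."""
    root = ET.Element("usage")
    ET.SubElement(root, "cpuSeconds").text = str(cpu_seconds)
    return ET.tostring(root, encoding="unicode")


class Assembler:
    """Receiver that tolerates duplicates by remembering event ids."""

    def __init__(self):
        self.seen = set()
        self.cpu_seconds = []

    def receive(self, event: ProbeEvent) -> bool:
        if event.event_id not in self.seen:
            self.seen.add(event.event_id)
            root = ET.fromstring(event.payload_xml)
            self.cpu_seconds.append(float(root.findtext("cpuSeconds")))
        return True  # acknowledgement


class LossyChannel:
    """Delivers every event but loses the first `lost_acks` acknowledgements."""

    def __init__(self, assembler: Assembler, lost_acks: int):
        self.assembler = assembler
        self.lost_acks = lost_acks

    def send(self, event: ProbeEvent) -> bool:
        acked = self.assembler.receive(event)
        if self.lost_acks > 0:
            self.lost_acks -= 1
            raise ConnectionError("acknowledgement lost")
        return acked


def send_at_least_once(channel: LossyChannel, event: ProbeEvent, max_attempts: int = 5) -> int:
    """Retry until acknowledged; returns the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            if channel.send(event):
                return attempt
        except ConnectionError:
            continue  # no acknowledgement: send the same event again
    raise RuntimeError("event not acknowledged; keep it queued for a later retry")


assembler = Assembler()
channel = LossyChannel(assembler, lost_acks=1)
event = ProbeEvent(probe_id="probe-1", payload_xml=make_payload(12.5))
attempts = send_at_least_once(channel, event)
```

In this run the event reaches the assembler twice, because the first acknowledgement is lost, but it is recorded only once thanks to the `event_id` check.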

Collector
Main functionalities:
• Hosting the Meters' components (the Assemblers) that are responsible for assembling Service Usage Records
• Monitoring the Meters' components called Probes
• Communication between Probes and Assemblers: routing of ProbeEvents to the proper Assembler
• Communication between Assemblers and Data Store
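
The routing role of the Collector can be pictured with a small dispatch table. This sketch assumes that each event names the Grid service it belongs to and that one assembler is registered per service; the dictionary-based events and all identifiers are hypothetical and not taken from the project.

```python
from typing import Dict, List


class ListAssembler:
    """Stand-in assembler that simply stores the events it receives."""

    def __init__(self):
        self.events: List[dict] = []

    def handle(self, event: dict) -> None:
        self.events.append(event)


class Collector:
    """Hosts assemblers and routes each event to the one for its service."""

    def __init__(self):
        self.assemblers: Dict[str, ListAssembler] = {}
        self.unroutable: List[dict] = []  # kept so that data loss can be flagged

    def register(self, service: str, assembler: ListAssembler) -> None:
        self.assemblers[service] = assembler

    def route(self, event: dict) -> bool:
        assembler = self.assemblers.get(event.get("service"))
        if assembler is None:
            self.unroutable.append(event)
            return False
        assembler.handle(event)
        return True


collector = Collector()
compute = ListAssembler()
storage = ListAssembler()
collector.register("compute-element", compute)
collector.register("storage-element", storage)

results = [
    collector.route({"service": "compute-element", "cpu_seconds": 10.0}),
    collector.route({"service": "storage-element", "disk_gb": 1.0}),
    collector.route({"service": "unknown-service", "value": 1.0}),
]
```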

[Figure: Collector Logical View (Data Store Component)]

Accountant
• This component is intended for future use.
• Main functionalities:
  ◦ Further process the Service Usage Records to apply economic policy (pricing & billing)

Deployment View
• Deployed as a Tomcat application: can take advantage of Tomcat clustering features for scalability and availability
• Collector and Publisher can run on two different Tomcat instances
• Can use the most popular database implementations; the database server can be on the same host as Tomcat or on a different host
• Probes can run anywhere on the LAN

[Figure: Deployment Diagram]

Conclusion
• More information
  ◦ Project Charter, Requirements and Design Documents
  ◦ OSG Accounting Twiki page
  ◦ Mailing list
• Any questions, comments, etc.?

SPARE SLIDES

[Figure: Overview. Probes, Collector, Repository of Accounting Records, Data Store Access Layer, WS API, Web Presenter and Statistical Analyzer, shown for the Resource Provider Site, the Grid Operation Center and the VO Center]