G. Russo, D. Del Prete, S. Pardi Kick Off Meeting - Isola d'Elba, 2011 May 29th–June 01th A proposal for distributed computing monitoring for SuperB G.

Presentation transcript:

A proposal for distributed computing monitoring for SuperB
G. Russo, D. Del Prete, S. Pardi, INFN Napoli & Università Federico II, Napoli, Italy

The rationale
– The distributed computing system that will support the SuperB project will need a sound software tool for management and monitoring.
– Most functionalities have already been coded (e.g. for the Atlas Tier1), but there is no general model that can be used in a distributed environment.
– The typical case is the Italian SuperB Tier1 for offline analysis, which is not yet designed but will likely be a distributed Tier1 spread over three or four separate sites.

Computing sites for SuperB in Italy

The Model
– Distributed computing centers have requirements that cover heterogeneous needs.
– We require a monitoring system that allows us to centralize a wide range of services.
[Diagram labels: software requirements, system monitoring model, kinds of users, services]

System Monitoring Requirements 1/3: non-functional requirements
– Highly usable and cross-platform: web-based interactive interface
– Adequate user profiling: Operator, Support, User
– User authentication through X.509 certificates
– Service-oriented architecture
– Modular and extensible composition
– Single sign-on authentication
– Access to distributed remote resources

System Monitoring Requirements 2/3
– Centralize all necessary applications in a Web portal
– Use individual applications as components

Why Liferay?
– Already chosen by IGI, the new Italian Grid Infrastructure institute
– Experienced users in Napoli, Catania, Bari
– Open source, with support available
– Integration with authentication and authorization tools already done
– Can accommodate existing tools with minimal rewriting, if any

System Monitoring Requirements 3/3: functional requirements
The functional requirements concern the monitoring of services at each level of the Grid architecture. They range from machine-level monitoring of all services (CPU, storage, ambient sensors, …) up to the monitoring of the resources that users consume through applications (grid site mapping, advanced queue and job monitoring, …).

The System Features: machine-level services monitoring (Fabric Layer)
“Status and notifications about all basic services”
– Server nodes: CPU load, disk space, free memory, ping, …
– Network devices: Tx/Rx traffic, traffic load %, ping latency, detection of errors and discarded packets, …
– Ambient sensors: liquid cooling and ambient temperature, fan speed, …
– Temporal graph reports
– Event log reporting
– Web management access
– Interactive map consultation
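The server-node checks listed above (CPU load, disk space) can be illustrated with a short Python sketch. This is not the tool proposed in the slides: the thresholds, the `CheckResult` type and the function names are assumptions made for illustration, and `os.getloadavg` is only available on Unix-like systems.

```python
import os
import shutil
from dataclasses import dataclass

OK, WARNING, CRITICAL = "OK", "WARNING", "CRITICAL"


@dataclass
class CheckResult:
    name: str      # which metric was checked
    value: float   # measured value
    status: str    # OK, WARNING or CRITICAL


def classify(value, warn, crit):
    """Map a metric to a status; higher values are treated as worse."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def check_disk(path="/", warn=80.0, crit=90.0):
    """Report the percentage of used disk space (thresholds are assumptions)."""
    usage = shutil.disk_usage(path)
    used_pct = 100.0 * usage.used / usage.total
    return CheckResult(f"disk:{path}", round(used_pct, 1),
                       classify(used_pct, warn, crit))


def check_load(warn=0.7, crit=1.0):
    """Report the 1-minute load average per core (Unix only)."""
    load1, _, _ = os.getloadavg()
    per_core = load1 / (os.cpu_count() or 1)
    return CheckResult("cpu_load", round(per_core, 2),
                       classify(per_core, warn, crit))


if __name__ == "__main__":
    for result in (check_disk(), check_load()):
        print(result)
```

A real deployment would feed such results into the notification and graphing components named on the slide; the sketch only shows the threshold logic.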

The System Features: middleware-level monitoring and management
– Verification that each node installation is consistent with the services it must deliver (job execution, toolkit software, …)
– Monitoring of package versions across distributed nodes
– Distributed initialization of remote nodes from the web interface
– Monitoring of Storage Resource Manager services
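Package-version monitoring across nodes amounts to comparing what each node reports against a reference set. The sketch below assumes the per-node inventories have already been collected; the package names, versions and node names are hypothetical and do not come from the slides.

```python
def find_drift(reference, installed_by_node):
    """Return, per node, the packages whose version differs from the reference.

    reference:         {package: wanted_version}
    installed_by_node: {node: {package: installed_version}}
    A missing package is reported with an installed version of None.
    """
    drift = {}
    for node, installed in installed_by_node.items():
        problems = {}
        for package, wanted in reference.items():
            found = installed.get(package)
            if found != wanted:
                problems[package] = (wanted, found)
        if problems:
            drift[node] = problems
    return drift


if __name__ == "__main__":
    # Hypothetical inventory, for illustration only.
    reference = {"middleware-core": "3.2.1", "srm-client": "1.4.0"}
    nodes = {
        "wn01.example.org": {"middleware-core": "3.2.1", "srm-client": "1.4.0"},
        "wn02.example.org": {"middleware-core": "3.1.9"},
    }
    print(find_drift(reference, nodes))
```

Nodes that match the reference are omitted from the result, so an empty dictionary means every node is in line.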

The System Features: Local Resource Management System (LRMS) monitoring
– Grid site mapping
– Advanced queue and job monitoring
– All information at LRMS level, for example:
    – Host on which the job is running
    – Time when the job was created
    – Time when the job was queued
    – Time when the job became eligible to be sent to execution
    – Time when the job started running
– Monitoring points of view:
    – Virtual organization
    – Virtual organization groups
    – Queues
– Graphical reporting of states
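The LRMS-level job fields and the three monitoring points of view can be modelled as a record plus a grouping function. The following Python sketch is an assumption about how such data might be organized; the field names and the `mean_wait_by` helper are illustrative and are not taken from any specific batch system.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class JobRecord:
    job_id: str
    host: Optional[str]        # host on which the job is running, if any
    vo: str                    # virtual organization
    group: str                 # virtual organization group
    queue: str
    created: datetime
    queued: datetime
    eligible: datetime
    started: Optional[datetime] = None  # None while the job is still waiting

    def wait_seconds(self):
        """Seconds between queueing and start, or None if not started."""
        if self.started is None:
            return None
        return (self.started - self.queued).total_seconds()


def mean_wait_by(jobs, key):
    """Average waiting time grouped by 'vo', 'group' or 'queue'."""
    waits = defaultdict(list)
    for job in jobs:
        wait = job.wait_seconds()
        if wait is not None:
            waits[getattr(job, key)].append(wait)
    return {k: sum(v) / len(v) for k, v in waits.items()}
```

Calling `mean_wait_by(jobs, "queue")` or `mean_wait_by(jobs, "vo")` gives the same data from two of the points of view listed above; jobs that have not started are left out of the average.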

The System Features: application-level WebUI (Application Layer)
Access and management of jobs in grid systems
User services:
    – Send a job to the grid system
    – Check the status of each of the user's jobs
    – Delete the user's jobs
    – Retrieve the user's job outputs
    – Report errors through clear messages that include the reason for the error
    – Retrieve information about the Storage Element, Computing Element, LFC and TAG
    – Send files to and retrieve files from different Storage Elements, and register them in the LFC
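The user services above (submit, check status, delete, retrieve output, report errors with a reason) can be sketched as a small interface. The class below is a hypothetical in-memory stand-in written for illustration: it does not talk to any grid middleware, and all names in it are assumptions.

```python
import itertools
from enum import Enum


class JobState(Enum):
    SUBMITTED = "submitted"
    DONE = "done"


class JobError(Exception):
    """Raised with a message that states the reason for the failure."""


class InMemoryJobService:
    """Hypothetical stand-in for a grid back end (no real grid calls)."""

    def __init__(self):
        self._jobs = {}
        self._ids = itertools.count(1)

    def submit(self, owner, command):
        job_id = f"job-{next(self._ids)}"
        self._jobs[job_id] = {"owner": owner, "command": command,
                              "state": JobState.SUBMITTED, "output": None}
        return job_id

    def _owned(self, owner, job_id):
        # Users may only act on their own jobs.
        job = self._jobs.get(job_id)
        if job is None:
            raise JobError(f"{job_id}: no such job")
        if job["owner"] != owner:
            raise JobError(f"{job_id}: owned by another user")
        return job

    def status(self, owner, job_id):
        return self._owned(owner, job_id)["state"]

    def finish(self, job_id, output):
        """Simulates the back end completing a job (illustration only)."""
        job = self._jobs[job_id]
        job["state"], job["output"] = JobState.DONE, output

    def output(self, owner, job_id):
        job = self._owned(owner, job_id)
        if job["state"] is not JobState.DONE:
            raise JobError(
                f"{job_id}: output not available, job is {job['state'].value}")
        return job["output"]

    def delete(self, owner, job_id):
        self._owned(owner, job_id)
        del self._jobs[job_id]
```

Every failure path raises `JobError` with the job identifier and the reason, which is one way to meet the "clear messages" requirement on the slide.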

What would users like to monitor in a distributed Tier1? End users:
– Access to Grid applications and resource monitoring
– All LRMS information (queue and job status)
– WebUI job submission
– Data flow from Tier 0 (as in Atlas Tier 2)
– Disk space control
– Device and machine status

What would users like to monitor in a distributed Tier1? Operations (includes end-user privileges):
– Remote administration of distributed file systems and storage resources (as in Atlas Tier 1)
– Remote web management access (nodes, network devices, …) (as in Atlas Tier 2)
– Distributed initialization of remote nodes (from the web interface) and package version monitoring with a centralized interface (new)

What would users like to monitor in a distributed Tier1? Support (includes end-user privileges):
– The Support user can access all contents available to the Operator user, but cannot change configurations
– Event log web interface
– Notifications of critical events
– Ticket system for troubleshooting
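The three user profiles described on the last three slides can be expressed as permission sets. The sketch below is one possible reading of those slides; the permission names are assumptions introduced for illustration, and a real portal would map them to its own role system.

```python
from enum import Enum, auto


class Permission(Enum):
    VIEW_MONITORING = auto()   # LRMS, disk, device and machine status
    SUBMIT_JOBS = auto()       # WebUI job submission
    VIEW_ADMIN = auto()        # remote administration and management pages
    CHANGE_CONFIG = auto()     # modify configurations
    VIEW_EVENT_LOG = auto()    # event log web interface, critical events
    USE_TICKETS = auto()       # ticket system for troubleshooting


END_USER = frozenset({Permission.VIEW_MONITORING, Permission.SUBMIT_JOBS})

# Operations includes end-user privileges and may change configurations.
OPERATOR = END_USER | {Permission.VIEW_ADMIN, Permission.CHANGE_CONFIG}

# Support sees everything the Operator sees but cannot change configurations.
SUPPORT = (OPERATOR - {Permission.CHANGE_CONFIG}) | {
    Permission.VIEW_EVENT_LOG, Permission.USE_TICKETS}


def allowed(role, permission):
    """Return True if the given role set grants the permission."""
    return permission in role
```

Building `SUPPORT` from `OPERATOR` minus `CHANGE_CONFIG` keeps the two profiles in step: anything later added to the Operator's view becomes visible to Support as well.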

System architecture
Liferay integrates all tools and will provide X.509 authentication and access to services following a single sign-on philosophy.

Atlas Tier2 experience
Using Liferay as a portlet container, we could integrate several heterogeneous tools into a single integrated view.

Atlas Network Monitoring

Atlas Devices Monitoring

Example
On-line example at: