- GMA Athena (24mar03 - CHEP La Jolla, CA) GMA Instrumentation of the Athena Framework using NetLogger Dan Gunter, Wim Lavrijsen,


GMA Instrumentation of the Athena Framework using NetLogger
Dan Gunter, Wim Lavrijsen, David Quarrie, Brian Tierney, Craig Tull
HCG/NERSC/LBNL
CHEP 2003, La Jolla, CA, March 24, 2003

The Problem
- The ATLAS Athena framework has a large number of components.
- When a job runs in a Grid environment and something goes wrong (e.g., the job runs slower than expected or crashes), it is very difficult to determine which component is at fault.
- Constant, verbose logging generates too much information.
- Solution: we are using NetLogger and pyGMA to instrument and monitor Athena.

Athena/GAUDI Architecture
[Figure: architecture diagram. Labeled components: Application Manager; Algorithm; Converter; Event Data Service; Detector Data Service; Histogram Service; Transient Event Store; Transient Detector Store; Transient Histogram Store; Persistency Service; Data Files; Message Service; JobOptions Service; Particle Properties Service; Other Services]

Grid Testbed Topologies (2002)
- EDG Testbed (star)
- US ATLAS (mesh)
- NorduGrid (mesh)

Review: Grid Monitoring Architecture (GMA)
Terminology and architecture:
- (Performance) Event:
  - typed collection of data with a specific structure
- Producer interface:
  - makes performance data (events) available
- Consumer interface:
  - receives performance data (events)
- Directory service:
  - supports information publication and discovery
  - must be distributed and/or replicated
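The four GMA terms above can be illustrated with a minimal in-process sketch. This is not pyGMA and does not reproduce its API: the class names `Event`, `Producer`, `Consumer`, and `Directory`, the producer name, and the event type string are all hypothetical, chosen only to mirror the terminology. The directory here is a single local dictionary, so it deliberately ignores the requirement that a real directory service be distributed and/or replicated.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Event:
    """A typed collection of data with a specific structure."""
    type: str
    data: Dict[str, object]


class Producer:
    """Makes performance events available to subscribed consumers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._subscribers: List[Callable[[Event], None]] = []

    def subscribe(self, callback: Callable[[Event], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, event: Event) -> None:
        for callback in self._subscribers:
            callback(event)


class Consumer:
    """Receives performance events and keeps them for inspection."""

    def __init__(self) -> None:
        self.received: List[Event] = []

    def receive(self, event: Event) -> None:
        self.received.append(event)


class Directory:
    """Supports publication (register) and discovery (lookup) of producers."""

    def __init__(self) -> None:
        self._entries: Dict[str, List[Producer]] = {}

    def register(self, event_type: str, producer: Producer) -> None:
        self._entries.setdefault(event_type, []).append(producer)

    def lookup(self, event_type: str) -> List[Producer]:
        return list(self._entries.get(event_type, []))


# Hypothetical names: "athena-job-1" and "algorithm.execute" are illustrative.
directory = Directory()
producer = Producer("athena-job-1")
directory.register("algorithm.execute", producer)

# The consumer discovers producers through the directory, then subscribes.
consumer = Consumer()
for found in directory.lookup("algorithm.execute"):
    found.subscribe(consumer.receive)

producer.publish(Event("algorithm.execute", {"duration_s": 0.25}))
```

The point of the split is that consumers never need to know producers in advance; they only need to know which event type they want and where the directory is.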

Athena Distributed Instrumentation
- Part of SuperComputing 2002 ATLAS demo
- IGMASvc  IMonitorSvc extension?
  - Abstract application monitoring service.
- NetLogger
  - End-to-End Monitoring & Analysis of Distributed Systems
  - C, C++, Java, Python, Perl, Tcl APIs
  - Web Service Activation
- Prophesy
  - An Infrastructure for Analyzing & Modeling the Performance of Parallel & Distributed Applications
  - Normally a parse & auto-instrument approach (C & FORTRAN).

DIDC Technologies Used
(DIDC: LBNL's Data Intensive Distributed Computing Group)
- NetLogger provides
  - Easy-to-use instrumentation library
  - Ability to correlate data from various sources based on time
  - Easy way to collect data from multiple clients/servers reliably
  - Visualization and analysis tools
- pyGMA provides
  - Easy-to-use producer and consumer Python library for constructing GGF-defined GMA services
- Activation Service provides
  - Ability to remotely trigger and collect monitoring data in running Grid applications
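Correlating data from several sources based on time amounts to merging timestamped streams into one timeline. The sketch below shows that idea with the Python standard library only; it does not use NetLogger. The record layout, the function name `merge_by_time`, and the sample data are assumptions for illustration, and the approach presumes that each stream is already sorted and that the hosts' clocks are synchronized.

```python
import heapq
from typing import Iterable, Iterator, Tuple

# Assumed record layout: (timestamp in seconds, source, event name)
Record = Tuple[float, str, str]


def merge_by_time(*streams: Iterable[Record]) -> Iterator[Record]:
    """Merge already time-sorted streams into one time-ordered stream."""
    return heapq.merge(*streams, key=lambda record: record[0])


# Illustrative data only: one application log and one host-monitoring log.
app_log = [(1.000001, "app", "execute.start"), (1.500000, "app", "execute.end")]
host_log = [(1.200000, "host", "cpu.sample"), (1.700000, "host", "cpu.sample")]

timeline = list(merge_by_time(app_log, host_log))
for timestamp, source, name in timeline:
    print(f"{timestamp:.6f} {source:<4} {name}")
```

Because `heapq.merge` is lazy, the same function would work on large log files read line by line, without loading them fully into memory.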

NetLogger Toolkit
- DIDC has developed the NetLogger Toolkit (short for Networked Application Logger), which includes:
  - tools to make it easy for distributed applications to log interesting events at every critical point
    - NetLogger client library (C, C++, Java, Perl, Python)
  - tools for host and network monitoring
  - event visualization tools that allow one to correlate application events with host/network events
  - NetLogger event archive and retrieval tools (new)
- NetLogger combines network, host, and application-level monitoring to provide a complete view of the entire system.
- Open Source

GMASvc Service
- Typical Athena Abstract Interface design.
  - Dual Use Library Linking Algorithms, etc. & Loading DL
  - Concrete implementation using NetLogger
  - Properties to adjust: NetLogger: On/Off/Level, Distinguished User Name, Activation Service
  - Controlled by Environment Variables.
  - Use in Algorithms, Converters, StoreGate Store/Retrieve, etc.
- GMAAuditor
  - Typical Athena Auditor bracketing standard Algorithm methods (initialize, execute, finalize)
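The auditor pattern described above, bracketing each standard algorithm method with a start and an end event, can be sketched in a few lines. This is an illustration of the pattern only, not the Athena or GMASvc code: `MonitorSvc`, `Auditor`, `ExampleAlgorithm`, and the `MONITOR_ENABLED` environment variable are hypothetical names, and events are kept in a list rather than sent to NetLogger.

```python
import os
import time
from typing import Callable, List, Tuple


class MonitorSvc:
    """Hypothetical stand-in for an abstract monitoring service."""

    def __init__(self, enabled: bool) -> None:
        self.enabled = enabled
        self.events: List[Tuple[float, str]] = []

    def log(self, name: str) -> None:
        # When monitoring is switched off, logging is a no-op.
        if self.enabled:
            self.events.append((time.time(), name))


class Auditor:
    """Brackets a method call with start/end monitoring events."""

    def __init__(self, svc: MonitorSvc) -> None:
        self.svc = svc

    def audit(self, algorithm: str, method_name: str, method: Callable[[], object]) -> object:
        prefix = f"{algorithm}.{method_name}"
        self.svc.log(prefix + ".start")
        try:
            return method()
        finally:
            # The end event is logged even if the method raises.
            self.svc.log(prefix + ".end")


class ExampleAlgorithm:
    """Toy algorithm exposing the three standard methods."""

    def initialize(self) -> str:
        return "initialized"

    def execute(self) -> str:
        return "executed"

    def finalize(self) -> str:
        return "finalized"


# MONITOR_ENABLED is an invented variable name; monitoring defaults to on.
enabled = os.environ.get("MONITOR_ENABLED", "1") != "0"
svc = MonitorSvc(enabled)
auditor = Auditor(svc)
alg = ExampleAlgorithm()
for step in ("initialize", "execute", "finalize"):
    auditor.audit("ExampleAlgorithm", step, getattr(alg, step))
```

Keeping the bracketing in an auditor means the algorithms themselves need no instrumentation code, and a single switch can silence all of it.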

Atlas Athena Monitoring Activation: SC02 Demo

Activation Service Architecture

Activation Service GUI

NetLogger Analysis: Key Concepts
- NetLogger visualization tools are based on time-correlated and object-correlated events.
  - precision timestamps (default = microsecond)
- If applications specify an "object ID" for related events, this allows the NetLogger visualization tools to generate an object "lifeline".
- In order to associate a group of events into a "lifeline", you must assign an "Event ID" to each NetLogger event.
  - Sample Event ID: file name, block ID, frame ID, etc.
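The lifeline idea can be made concrete with a small sketch: events that share an ID are grouped, and each group is ordered by timestamp. This is not NetLogger's implementation or log format; the tuple layout, the function name `build_lifelines`, and the sample block IDs are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed event layout: (timestamp in seconds, event name, event ID)
Event = Tuple[float, str, str]


def build_lifelines(events: Iterable[Event]) -> Dict[str, List[Tuple[float, str]]]:
    """Group events by ID and order each group by timestamp."""
    lifelines: Dict[str, List[Tuple[float, str]]] = defaultdict(list)
    for timestamp, name, event_id in events:
        lifelines[event_id].append((timestamp, name))
    for points in lifelines.values():
        points.sort()  # tuples sort by timestamp first
    return dict(lifelines)


# Illustrative events, deliberately out of order, using block IDs as event IDs.
events = [
    (0.000300, "read.end", "block-1"),
    (0.000100, "read.start", "block-1"),
    (0.000200, "read.start", "block-2"),
    (0.000450, "read.end", "block-2"),
]

lifelines = build_lifelines(events)
for event_id, points in sorted(lifelines.items()):
    print(event_id, " -> ".join(name for _, name in points))
```

Each resulting list is one lifeline: plotting its points against time shows where an individual file, block, or frame spent its time.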

NLV Athena Example

Completed Tasks
- Instrumented several Athena components with NetLogger
- Developed prototype activation service
- Developed prototype interface to the activation service for Athena monitoring events
- Demonstrated at SC02

Current Work
- We are now expanding on the components used in the SC02 demo
  - Develop a "proof of concept" general-purpose Grid troubleshooting architecture in concert with GANGA, Athena, DOE Science Grid
- Tasks include
  - Further integration of ATLAS software with Globus (Large ITR work related)
  - Further NetLogger instrumentation of Globus, GANGA, and Athena
  - Redesign of activation service for increased performance
  - Integration with Karlo Berket's scalable and secure peer-to-peer resource discovery service, which will be used to locate producers

For More Information
- NetLogger:
- SC02 Demo:
- Athena: RE/OO/architecture/General/index.html

Extra Slides if you want more details

Monitoring Components

Activation Service

Ganglia Cluster Monitoring