ATLAS Distributed Analysis and proposal for ATLAS-LHCb system. David Adams, BNL. March 22, 2004. ATLAS-LHCb-GANGA Meeting.

Presentation transcript:

ATLAS Distributed Analysis and proposal for ATLAS-LHCb system
David Adams
BNL
March 22, 2004
ATLAS-LHCb-GANGA Meeting

David Adams, ATLAS. ATLAS dist analysis. ATLAS-LHCb-GANGA, March 22, 2004

Contents
  Definitions
  Architecture
  AJDL
    Application
    Task
    Dataset
    Job
  High-level services
    Analysis service
    Job management service
    Catalog services
  Implementation
    Strategy
    Effort providers
  ARDA
  Role of GANGA
  Connection to LHCb
  More information

Definitions
Analysis (not necessarily distributed)
  Supports the manipulation and extraction of summary data (e.g. histograms) from any type of event data
    – AOD, ESD, …
  Supports user-level production of event data
    – e.g. MC generation, simulation and reconstruction
Distributed analysis
  Extends the extraction and production support to include distributed users, data and processing
  Natural extension of non-distributed analysis
  Easily invoked from any ATLAS analysis environment
    – including Python, ROOT, command line
    – easily ported to any future environment (e.g. JAS)

Architecture

AJDL
Acronym: Analysis Job Definition Language
Used to define interfaces for high-level services
Components include:
  Application – executable to process data
  Task – user configuration of application
  Dataset – describes input and output data
  Job – activity to perform on (or off) the grid
    – Typical: app, task and input dataset → output dataset
Following diagram shows typical component interactions

[Diagram: ADA/DIAL user interface. Boxes: Analysis Framework, Application, Task, Dataset 1, Dataset 2, Dataset, Analysis Service, Job 1, Job 2, Result (twice). Numbered steps: 1. locate, 2. select, 3. create or select, 4. select, 5. submit(app,tsk,ds), 6. split, 7. create, 9. create, 10. gather. Other labels: e.g. ROOT; e.g. athena; exe, pkgs; scripts, code.]

AJDL (cont)
Components must be extensible
  Use subtypes
    – E.g. HistogramDataset, EventDataset, AtlasEventDataset
  Generic interface
    – For use by (shared) generic high-level services
  Experiment-specific interface
    – For application and users
Nature of components
  Persistent representation of data (e.g. XML)
  Classes to interpret this data (C++, Python, Java, …)
    – Language bindings or re-implementations
  Service or resource (as in WSRF)
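
The subtype and persistence ideas on this slide can be illustrated with a minimal Python sketch. It assumes a generic `Dataset` base class, the subtypes named on the slide, and an XML layout invented for this example; none of the class members, element names or attributes are taken from the actual AJDL definition.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Generic interface: only what a shared high-level service needs."""
    dataset_id: str
    files: list = field(default_factory=list)

    def to_xml(self) -> str:
        # Persistent representation; the element name records the subtype.
        # The layout is an assumption made for this sketch.
        root = ET.Element(type(self).__name__, id=self.dataset_id)
        for name in self.files:
            ET.SubElement(root, "file").text = name
        return ET.tostring(root, encoding="unicode")


@dataclass
class EventDataset(Dataset):
    """Generic subtype that also records which events it holds."""
    event_ids: list = field(default_factory=list)

    def to_xml(self) -> str:
        # Extend the generic representation with one element per event.
        root = ET.fromstring(super().to_xml())
        for event_id in self.event_ids:
            ET.SubElement(root, "event").text = str(event_id)
        return ET.tostring(root, encoding="unicode")


@dataclass
class AtlasEventDataset(EventDataset):
    """Experiment-specific subtype; ATLAS-only accessors would live here."""


def describe(ds: Dataset) -> str:
    # A generic service touches only the generic interface.
    return f"{ds.dataset_id}: {len(ds.files)} file(s)"


ds = AtlasEventDataset("ds-001", files=["aod.root"], event_ids=[1, 2, 3])
xml_text = ds.to_xml()
summary = describe(ds)
```

A generic service such as `describe` works unchanged on any subtype, while the XML keeps enough information (here the element name) for a reader in another language to pick the matching class.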

Application
Application specifies executable used to process data
Two entry points
  Extract and build task
  Process input dataset to produce output dataset
    – Application + Task = Dataset transformation
Carries enough information to
  Locate entry points
    – Or carry the corresponding scripts
  Enable installation of all required software
    – E.g. list of packages for use with package management system
    – Might be subtypes for different package management systems

Task
Task carries the user configuration for an application
  E.g. runtime configuration or code for shared library
  Nature of the task specified by the corresponding application
  At present the task is a collection of embedded text files
Task plus application (transformation) should specify the content of input and output datasets
  Enable users and processing system to
    – Verify transformation is suitable for given input dataset
    – Avoid staging unneeded parts of input dataset
    – Predict the content of output dataset
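
The three checks listed above can be sketched in Python under one simplifying assumption: dataset content is modelled as a set of type-key strings, and a transformation declares which keys it reads and writes. The `Transformation` class and the helper functions are hypothetical names for this illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transformation:
    """Application plus task, described only by the content it reads and writes."""
    name: str
    required_content: frozenset   # type-keys the task reads
    produced_content: frozenset   # type-keys the task writes


def is_suitable(transformation, dataset_content):
    # Suitable when the input dataset offers every type-key the task reads.
    return transformation.required_content <= dataset_content


def unneeded_content(transformation, dataset_content):
    # Parts of the input dataset that never need to be staged.
    return dataset_content - transformation.required_content


def predicted_output(transformation):
    # Output content is known before any job runs.
    return transformation.produced_content


input_content = frozenset({"electrons", "muons", "jets"})
tfm = Transformation(
    name="electron-histograms",
    required_content=frozenset({"electrons"}),
    produced_content=frozenset({"electron_pt_histogram"}),
)

suitable = is_suitable(tfm, input_content)
skipped = unneeded_content(tfm, input_content)
output = predicted_output(tfm)
```

Here the transformation is accepted for the input, and the muon and jet content is identified as not worth staging.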

Dataset
Provides data view
Generic properties for use in high-level services:
  Location of data (files, DB, …)
    – So data can be staged
  Content
    – E.g. for ATLAS events: event IDs and type-keys (e.g. good electrons) for each event
    – EventDataset is an important generic subtype
  Constituents for compound dataset
    – Natural boundaries for dataset splitting
Subtypes provide interface for users and applications to access the data
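
To illustrate why constituents make natural splitting boundaries, here is a small Python sketch. It assumes a dataset holds either its own files or a list of constituent datasets; the `Dataset` fields and method names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    dataset_id: str
    files: list = field(default_factory=list)          # location of the data
    constituents: list = field(default_factory=list)   # sub-datasets, if compound

    def all_files(self):
        # Files to stage: own files plus those of every constituent.
        found = list(self.files)
        for part in self.constituents:
            found.extend(part.all_files())
        return found

    def split(self):
        # Constituents are the natural boundaries; a simple dataset is one piece.
        return list(self.constituents) if self.constituents else [self]


run1 = Dataset("run1", files=["run1_a.root", "run1_b.root"])
run2 = Dataset("run2", files=["run2_a.root"])
both = Dataset("run1+run2", constituents=[run1, run2])

pieces = [part.dataset_id for part in both.split()]
to_stage = both.all_files()
```

A service that only knows the generic `split` and `all_files` operations can distribute work over `run1` and `run2` without understanding what the events inside them mean.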

Job
Interface enables users (and high-level services) to monitor and manage jobs on the grid
Generic properties
  State: running, succeeded, failed, paused, …
  Input parameters (e.g. application, task and dataset)
  Result (e.g. output dataset) after completion
Management
  Pause/resume
  Kill
  Update status
Job management service to implement these
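
The job states and management operations above can be pictured as a small state machine. The Python sketch below is an illustration only: the allowed transitions, the choice to report a killed job as failed, and the `complete` method standing in for a status update are all assumptions, and the class names are hypothetical.

```python
from enum import Enum


class JobState(Enum):
    RUNNING = "running"
    PAUSED = "paused"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


class Job:
    def __init__(self, job_id, application, task, dataset):
        self.job_id = job_id
        # Input parameters of the job.
        self.inputs = {"application": application, "task": task, "dataset": dataset}
        self.state = JobState.RUNNING
        self.result = None  # e.g. output dataset ID, set on completion

    def _require(self, *allowed):
        # Reject operations that make no sense in the current state.
        if self.state not in allowed:
            raise RuntimeError(f"operation not allowed while {self.state.value}")

    def pause(self):
        self._require(JobState.RUNNING)
        self.state = JobState.PAUSED

    def resume(self):
        self._require(JobState.PAUSED)
        self.state = JobState.RUNNING

    def kill(self):
        # Assumption: a killed job is reported as failed.
        self._require(JobState.RUNNING, JobState.PAUSED)
        self.state = JobState.FAILED

    def complete(self, result):
        # Stand-in for a status update received from a job management service.
        self._require(JobState.RUNNING)
        self.state = JobState.SUCCEEDED
        self.result = result


job = Job("job-1", "athena", "my-task", "ds-001")
job.pause()
job.resume()
job.complete("ds-002")
```

In a real system these methods would forward to the job management service rather than change a local field.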

High-level services
High-level services use AJDL components
  Middleware does not
Typically high-level services are generic
  Only use generic properties of AJDL components
  Same service for different applications and datasets
  Different experiments or realms can share services
    – E.g. LHCb and ATLAS
Examples
  Analysis (transformation) service
  Job management
  Catalogs

Analysis service
Transformation service might be a better name
Provides means to create a concrete dataset
Interface functions
  Request dataset
    – Input is application, task and dataset
    – Output is job ID
    – Associated job carries ID for output dataset
  Fetch job description
    – Input is job ID
    – Output is job
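
The two interface functions can be sketched as an in-memory Python class. This is an illustration under stated assumptions, not the ADA or DIAL interface: the names `AnalysisService`, `request_dataset` and `fetch_job`, the ID formats, and the use of plain strings for application, task and dataset are all hypothetical.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Job:
    job_id: str
    application: str
    task: str
    input_dataset_id: str
    output_dataset_id: str  # the job carries the ID of the dataset it will produce


class AnalysisService:
    def __init__(self):
        self._jobs = {}
        self._counter = itertools.count(1)

    def request_dataset(self, application, task, dataset_id):
        """Input: application, task and dataset. Output: job ID."""
        n = next(self._counter)
        job = Job(f"job-{n}", application, task, dataset_id, f"out-{n}")
        self._jobs[job.job_id] = job
        return job.job_id

    def fetch_job(self, job_id):
        """Input: job ID. Output: the job description."""
        return self._jobs[job_id]


service = AnalysisService()
job_id = service.request_dataset("athena", "electron-histograms", "ds-001")
job = service.fetch_job(job_id)
```

Returning only a job ID keeps the request call cheap; the caller fetches the job later to find the output dataset ID.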

Analysis service (cont)
Example scenario for processing a high-level job
  Input is application, task, dataset and job configuration
  Map input virtual dataset to concrete representation
  Split into sub-datasets
  Create sub-job for each sub-dataset
  Stage files for each sub-job
  Locate and possibly install application
  Build (e.g. compile) task
  Run sub-jobs
  Gather and merge results to create output dataset
  Register output dataset (including replica)
Job provides connection to output dataset and detailed job provenance
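
The split, run and merge steps of this scenario can be sketched in a few lines of Python. The sketch assumes the dataset is a plain list of numbers and the result is a histogram of counts; staging, software installation, task building and registration are left out, sub-jobs run one after another rather than on a grid, and all function names are hypothetical.

```python
from collections import Counter


def split(dataset, n_parts):
    # Split the concrete dataset (here a list of event values) into sub-datasets.
    return [dataset[i::n_parts] for i in range(n_parts)]


def run_sub_job(task, sub_dataset):
    # Each sub-job applies the task to its events and returns a partial histogram.
    return Counter(task(event) for event in sub_dataset)


def merge(partial_results):
    # Gather and merge: histogram counts add bin by bin.
    total = Counter()
    for partial in partial_results:
        total.update(partial)
    return total


def process(task, dataset, n_parts=3):
    sub_datasets = split(dataset, n_parts)
    partials = [run_sub_job(task, sub) for sub in sub_datasets]
    return merge(partials)


def pt_bin(pt):
    # Hypothetical task: 10-unit-wide bins labelled by their lower edge.
    return int(pt // 10) * 10


events = [3.0, 12.5, 17.0, 25.0, 8.0, 11.0, 29.9]
histogram = process(pt_bin, events)
```

Because histogram counts are additive, the merged result equals what a single unsplit job would have produced, which is what makes this kind of splitting safe.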

Job management service
Provide means to manage jobs
  Analysis service creating the job provides this
  May also want this functionality elsewhere
Accessed from job interface to implement management functions
  Might create job service (OGSI)
  Or job is a resource (WSRF)

Catalog services
Repositories
  Store AJDL components indexed by ID
Selection (metadata) catalogs
  Help user to select input data, task, …
VDC – Virtual Dataset Catalog
  Prescriptions for creating datasets
    – Application, task, input dataset
DRC – Dataset Replica Catalog
  Mapping between virtual and concrete datasets
Job catalog
  Detailed provenance for concrete datasets
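
How the VDC and DRC might work together can be sketched with two dictionary-backed Python classes. This covers only those two catalogs and assumes string IDs throughout; the class names, method names and the `resolve` helper are hypothetical and do not describe the AMI-based implementation mentioned later in the talk.

```python
class VirtualDatasetCatalog:
    """Virtual dataset ID -> prescription (application, task, input dataset)."""

    def __init__(self):
        self._prescriptions = {}

    def register(self, virtual_id, application, task, input_dataset_id):
        self._prescriptions[virtual_id] = (application, task, input_dataset_id)

    def prescription(self, virtual_id):
        return self._prescriptions[virtual_id]


class DatasetReplicaCatalog:
    """Virtual dataset ID -> IDs of concrete datasets that realise it."""

    def __init__(self):
        self._replicas = {}

    def add_replica(self, virtual_id, concrete_id):
        self._replicas.setdefault(virtual_id, []).append(concrete_id)

    def replicas(self, virtual_id):
        return list(self._replicas.get(virtual_id, []))


def resolve(virtual_id, vdc, drc):
    # Reuse an existing concrete dataset, otherwise return the recipe to make one.
    existing = drc.replicas(virtual_id)
    if existing:
        return ("use", existing[0])
    return ("create", vdc.prescription(virtual_id))


vdc = VirtualDatasetCatalog()
drc = DatasetReplicaCatalog()
vdc.register("v-hist", "athena", "electron-histograms", "ds-001")

before = resolve("v-hist", vdc, drc)   # no replica yet: a job must create it
drc.add_replica("v-hist", "c-hist-bnl")
after = resolve("v-hist", vdc, drc)    # replica registered: reuse it
```

The prescription in the VDC is exactly the input an analysis service needs, so a missing replica can be turned directly into a dataset request.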

Implementation strategy
Define AJDL
  Components, nature, interfaces
Implement catalogs
  Tables in AMI
  Programmatic interface
    – (C++ with Python binding)
Analysis services
  Start with existing services or analogs
    – DIAL, ATCOM, Capone, GANGA, …
  Different implementations for different strategies
  At least one using ARDA middleware

Implementation strategy (cont)
User interface
  Programmatic interface to high-level services and AJDL components
    – C++, Python and eventually Java bindings
  GANGA will provide Python binding and use it to deliver a GUI
    – Extensible design: client tools plug into Python bus
Middleware
  Whatever works to begin
  ARDA services will be used in that context
    – Like to see better integration with other middleware efforts

Implementation strategy (cont)
Web service infrastructure
  Short term use independent persistent services
  Mid-term follow ARDA strategy
    – GAS – grid access service
  Long term follow standards such as WSRF
    – Dataset and job become resources?
Releases
  Deliver working prototype in May
    – Robust enough for average physicist
  Regular releases adding functionality, improving performance and incorporating new middleware

Effort providers
Look to the following for effort:
  GANGA for user interface and more
  DIAL for interactive analysis service
  ARDA integration team for ARDA analysis service
  ARDA/EGEE and US grid projects for middleware
  POOL for datasets and metadata?
  SEAL for Python-C++ integration
    – Later Java as well?
  ATLAS physics and computing groups for ATLAS-specific pieces
    – ATLAS applications and datasets
    – System testing and evaluation

ARDA
ARDA begins April 1
Two areas in LCG:
  Middleware development (1st report delivered)
  Integration team
ATLAS ARDA prototype
  Collaboration in context of integration team
  Deliver at least one analysis service based on ARDA middleware
We would also like to collaborate on AJDL and other high-level services

Role of GANGA
Look to GANGA to provide
  Python binding (or implementation) for AJDL
  Client tools
    – Job submission
    – Job monitoring and management
    – Task management
      > Including JOE
  Comprehensive graphical analysis environment
    – Including the above client tools
  LCG analysis service?
  Help with system integration and testing
  And more…

Connection to LHCb
To be determined
  This meeting?
My ideal is that ATLAS and LHCb share a system
  Along lines of the architecture described here
  Most GANGA effort directed toward delivering generic high-level services and client tools
Implications
  Most of the effort expended by GANGA developers is directly usable by both experiments
  Easy for others outside GANGA to contribute pieces
  Use by two experiments validates the idea of generic tools and services

More information
ADA home page:
  This page has links to other projects