ATLAS Distributed Analysis Plans. David Adams (BNL), December 2, 2003, ATLAS software workshop, CERN.


1 ATLAS Distributed Analysis Plans
David Adams, BNL
December 2, 2003
ATLAS software workshop, CERN

2 Contents
DAC mandate
Scope
Strategy
Scenario for first release
Plans for the first release
GANGA status
DIAL status
Deliverables for the first release
Conclusions
Slide footer: David Adams, ATLAS; ADA Plans; ATLAS SW – Grid session; December 2, 2003

3 DAC Mandate
Distributed Analysis Coordinator
Is responsible for coordinating the development of software tools for distributed analysis and their integration into the ATLAS software environment
Start with the analysis of existing tools such as GANGA, DIAL, AtCom…
Provide users with transparent access to metadata of different sorts as well as to event data in all stages of processing
Participate actively in the definition of LCG projects such as ARDA
Is a member of relevant LCG committees and working groups

4 Scope
Analysis (not necessarily distributed)
Supports the manipulation and extraction of summary data (e.g. histograms) from any type of event data
  – AOD, ESD, …
Supports user-level production of event data
  – e.g. MC generation, simulation and reconstruction
Distributed analysis
Extends the extraction and production support to include distributed processing and distributed data
Natural extension of non-distributed analysis
Easily invoked from any ATLAS analysis environment
  – including Python, ROOT, command line
  – easily ported to any future environment (e.g. JAS)

5 Strategy
Implement DA as a collection of grid services
As described in ARDA document
Use ARDA components where possible
Add missing and ATLAS-specific pieces
Provide clients for ATLAS analysis environments
Python, ROOT, command line
Regular releases
Perhaps for each SW week and ATLAS X.0
Provide useful tool
Demonstrate functionality
Expand functionality with each release

6 Strategy (cont)
Look to common projects for most of the pieces
ARDA, GANGA, DIAL, …
Share as much as possible with ATLAS production
  – Also distributed
  – Similar interfaces and code for bulk and user-level production
ADA (ATLAS distributed analysis) must identify these pieces and tie them together
Deployment
ADA services must be deployed at relevant sites
Provide testing and monitoring of these services
Work with facilities to deploy and maintain
  – Also to develop facility-specific features

7 Scenario for first release
Here is a scenario for user interaction with the first release of ADA
Authenticate
  – Proxy from authentication service
Choose application
  – E.g. PAW to process DC1 ntuples
  – Or Athena to process DC2 AOD
  – Also Athena reconstruction?
Define task
  – Analysis: provide code to define and fill histograms
  – Production: athena job options, maybe code
  – Perhaps select starting point from provenance catalog
Select input dataset
  – From dataset metadata catalog service

8 Scenario for first release (cont)
Create job configuration
  – Response time, role, …
Locate processing service
Submit job
  – Application, task, dataset, configuration
While job is running
  – Query service for status and partial results
  – Examine partial results (e.g. histograms)
  – Kill job if results are bad
When job is finished
  – Examine complete result
  – Modify task or select new dataset and repeat
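The submit, monitor and kill loop in this scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ToyService`, `Job`, `run_analysis` and the `looks_bad` callback are hypothetical names invented here, not part of GANGA, DIAL or any ADA interface, and the "processing" is an in-memory stand-in for a remote service.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical job record; not the DIAL or GANGA job class."""
    application: str
    task: str
    dataset: list          # file names, stand-in for a real dataset
    config: dict
    processed: int = 0
    status: str = "running"
    histogram: dict = field(default_factory=dict)


class ToyService:
    """In-memory stand-in for a processing service (assumed interface)."""

    def __init__(self):
        self.jobs = {}

    def submit(self, application, task, dataset, config):
        job_id = len(self.jobs) + 1
        job = Job(application, task, list(dataset), config)
        if not job.dataset:
            job.status = "done"     # nothing to process
        self.jobs[job_id] = job
        return job_id

    def step(self, job_id):
        """Process one more file; a real service would run asynchronously."""
        job = self.jobs[job_id]
        if job.status != "running":
            return
        name = job.dataset[job.processed]
        # Pretend each file contributes one entry to a bin named after it.
        job.histogram[name] = job.histogram.get(name, 0) + 1
        job.processed += 1
        if job.processed == len(job.dataset):
            job.status = "done"

    def status(self, job_id):
        return self.jobs[job_id].status

    def partial_result(self, job_id):
        return dict(self.jobs[job_id].histogram)

    def kill(self, job_id):
        self.jobs[job_id].status = "killed"


def run_analysis(service, dataset, looks_bad=lambda hist: False):
    """Submit, poll for partial results, and kill the job if they look bad."""
    job_id = service.submit("paw", "fill_histograms", dataset,
                            {"response_time": "interactive"})
    while service.status(job_id) == "running":
        service.step(job_id)
        if looks_bad(service.partial_result(job_id)):
            service.kill(job_id)
    return service.status(job_id), service.partial_result(job_id)
```

Calling `run_analysis(ToyService(), ["f1", "f2", "f3"])` runs to completion, while passing a `looks_bad` callback lets the client stop early on a suspicious partial histogram, which is the "kill job if results are bad" step.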

9 Plan for first release
Schedule
Implement and deploy in advance of March 2004 software workshop
Provide starting point for discussion at that meeting
Building blocks
Code and developers in GANGA and DIAL
  – Following sections summarize current status
LCG project following from ARDA
  – Just starting; so don’t wait but
  – Stay closely coupled to that project
Open to contributions (especially effort) from others

10 Ganga: status update (1)
Work since September software week has focused on refactoring, to create a system that is more modular and more flexible
In short-term (next 1-2 months), changes will mainly affect developers
In longer term (in time for DC2) will see significant gains for users: improvements in functionality, ease of use and stability
Have introduced PyBus software bus, developed by W. Lavrijsen with contributions from K. Harrison
Allows association between module and logical name to be made at run time
Makes system more configurable: supports ATLAS/LHCb customizations and user add-ons
Moving to XML-based job description
Mechanics have been worked out, but still defining details of XML schema
Aim to have job description consistent with DIAL (and others?)
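The idea of binding a logical name to a module at run time can be illustrated with a small registry. This is a sketch of the general technique only: the `SoftwareBus` class and its `bind` and `get` methods are hypothetical and are not the PyBus API, and the standard-library `json` and `pickle` modules stand in for experiment-specific components.

```python
import importlib


class SoftwareBus:
    """Toy registry mapping logical names to modules; not the PyBus API."""

    def __init__(self):
        self._bindings = {}   # logical name -> importable module path
        self._loaded = {}     # logical name -> imported module object

    def bind(self, logical_name, module_path):
        """Associate (or re-associate) a logical name with a module path."""
        self._bindings[logical_name] = module_path
        self._loaded.pop(logical_name, None)  # drop cache so next lookup re-imports

    def get(self, logical_name):
        """Import the bound module on first use and cache it."""
        if logical_name not in self._loaded:
            module_path = self._bindings[logical_name]
            self._loaded[logical_name] = importlib.import_module(module_path)
        return self._loaded[logical_name]


bus = SoftwareBus()
bus.bind("serializer", "json")            # default binding
text = bus.get("serializer").dumps({"events": 10})
bus.bind("serializer", "pickle")          # a customization swaps the module
blob = bus.get("serializer").dumps({"events": 10})
```

Client code only ever asks for `"serializer"`, so an ATLAS or LHCb customization, or a user add-on, can change the binding without touching the callers.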

11 Ganga: status update (2)
Job-options editor (JOE) is evolving to become a more powerful, standalone component, which will be loaded by Ganga
Assist user in the creation/modification of Gaudi/Athena job options by presenting the user with a hierarchical view of available options files and helping the user with value entry
In process of creating Job Options Information Resource (JOIR) database
  – JOIR database of job options will facilitate validation by providing valid ranges, valid option choices, and option descriptions
  – Considering suggestions from LHCb for improving automated job-option extraction
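Validation against ranges, choices and descriptions could look roughly like the following. This is a sketch under assumptions: the `JOIR` dictionary layout and the `validate` function are invented for illustration and do not reflect the actual JOIR schema, and the ranges and choices given for the two option names are placeholders rather than values taken from Gaudi/Athena.

```python
# Hypothetical JOIR-like records: option name -> description and constraints.
# The constraint values below are illustrative placeholders.
JOIR = {
    "EvtMax": {"description": "Number of events to process",
               "type": int, "range": (-1, 10**9)},
    "OutputLevel": {"description": "Message verbosity",
                    "type": int, "choices": [1, 2, 3, 4, 5, 6]},
}


def validate(options, catalog=JOIR):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for name, value in options.items():
        spec = catalog.get(name)
        if spec is None:
            problems.append(f"{name}: unknown option")
            continue
        if not isinstance(value, spec["type"]):
            problems.append(f"{name}: expected {spec['type'].__name__}")
            continue
        if "range" in spec:
            low, high = spec["range"]
            if not low <= value <= high:
                problems.append(f"{name}: {value} outside [{low}, {high}]")
        if "choices" in spec and value not in spec["choices"]:
            problems.append(f"{name}: {value} not in {spec['choices']}")
    return problems
```

An editor such as JOE could run a check of this kind on each value entry and show the stored description next to any problem it reports.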

12 Ganga: job definition and submission to LCG

13 Ganga: future plans
Plans well defined up to March 2004
Work towards Ganga/DIAL integration within ADA
Enable job submission to LCG
Release improved version of JOE
Include interface to Pacman 3 for package installation
  – Informal Pacman workshop pencilled in for January 2004
More tentatively, looking at possibilities for interfacing to Atlantis for displaying event data
Request for GridPP funding beyond December 2004 requires ATLAS/LHCb work plan for Ganga up to September 2007
Need to ensure ATLAS priorities are taken into account

14 DIAL status
Release 0.60
Made in November
Has application to process combined ntuple datasets with PAW
Command line and ROOT clients
Processing can be done by instantiating a private scheduler or by contacting a persistent web service
Dataset catalogs have been implemented
  – DSC: dataset selection catalog
  – DRC: dataset replica catalog
  – Datasets created for all DC1 combined ntuples

15 DIAL status (cont)
High-level JDL
DIAL envisions a hierarchy of schedulers
Interface to these schedulers constitutes a high-level JDL (job definition language)
  – Job submission, monitoring and gathering of results
  – See figure
Would like to standardize this JDL so schedulers can be shared between projects and experiments
  – See figure

16 Components of DIAL high-level JDL
[Figure. Boxes: User Analysis, Application, Task, Code, Dataset, Dataset 1, Dataset 2, Scheduler, Job 1, Job 2, Result. Annotations: e.g. ROOT, e.g. athena. Arrow labels: 1. Create or locate; 2. select; 3. Create or select; 4. select; 5. submit(app,tsk,ds); 6. split; 7. create; 9. fill; 10. gather.]
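The submit, split, create, fill and gather steps named in the figure can be sketched as a toy scheduler. Everything here is an assumption made for illustration: the function names, the use of a plain list as a dataset, and the use of a `Counter` as a histogram are not DIAL classes or its actual interface, and the application argument is carried only as a label.

```python
from collections import Counter


def split(dataset, n_jobs):
    """Split a dataset (list of event values) into up to n_jobs sub-datasets."""
    return [dataset[i::n_jobs] for i in range(n_jobs) if dataset[i::n_jobs]]


def run_job(application, task, sub_dataset):
    """Run one job: apply the task to each event and fill a result."""
    result = Counter()
    for event in sub_dataset:
        result[task(event)] += 1   # task maps an event to a histogram bin
    return result


def submit(application, task, dataset, n_jobs=2):
    """Split, create one job per sub-dataset, fill each result, then gather."""
    partial = [run_job(application, task, sub) for sub in split(dataset, n_jobs)]
    total = Counter()
    for result in partial:
        total.update(result)       # gather: add bin contents
    return total


events = [3.2, 7.9, 12.5, 4.4, 18.0]   # toy values, e.g. pT in GeV
histogram = submit("toy-app", lambda pt: int(pt // 5) * 5, events)
```

Because gathering is a simple sum of bin contents, the final histogram does not depend on how many jobs the scheduler chose, which is what allows a higher-level scheduler to delegate to others.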

17 DIAL status: sharing via JDL

18 Deliverables for first release
Comments
Goal is to support the scenario outlined earlier
Build on current GANGA and DIAL implementations and plans
Emergence of ARDA project may change plans
Coordination with ATLAS production may also lead to changes
Add more tasks if more ideas and effort are found

19 Deliverables for first release (cont)
Authentication service
GSI based
Support both EDG and US certificates
High-level JDL
Start from current DIAL interface
Incorporate ideas from PPDG, ARDA, …
  – If available in time
This defines the interface (WSDL) for the following analysis and production services

20 Deliverables for first release (cont)
Interactive analysis service
Build on existing DIAL scheduler service
  – Add authentication
  – Deploy as web or grid service
Client schedulers
  – Keep command line and ROOT clients
  – Add Python (GANGA) client
    > Possibly with associated GUI
Application/task/dataset
  – Keep PAW with fortran task to fill histograms from HBOOK combined ntuples
  – Add ROOT with C++ task to fill from ROOT ntuples?
  – Add athena with C++ task to fill from AOD?

21 Deliverables for first release (cont)
User-level batch production service?
Start from GANGA LCG submission service
  – Add high-level JDL
  – Requires GANGA to support client-server
Other candidates for production services:
  – GCE/Chimera
  – DIAL
  – New ATLAS production model
  – Switch to choose between these
Supported production tasks
  – Reconstruction
  – Simulation?
  – Event generation?
  – Fill histograms from AOD?

22 Deliverables for first release (cont)
Dataset and file catalog services
Functionality:
  – Means for users to select an input dataset
  – Means for production to register output dataset
  – Means for system (e.g. DIAL scheduler) to turn dataset specification into accessible physical files
Start from AMI and DIAL
Need file catalog and replication services
  – Magda, RLS1, RLS2, …
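Two of the catalog functions listed here, selecting a dataset by metadata and turning a dataset into physical files, can be sketched with two dictionaries. The layout is loosely modeled on the selection and replica catalog roles mentioned for DIAL, but it is an assumption: the `DSC` and `DRC` dictionaries, the `select` and `resolve` functions, and every id, site name and path below are invented for illustration and do not describe AMI, Magda, RLS or the DIAL catalogs.

```python
# Toy dataset selection catalog: dataset id -> metadata (all values invented).
DSC = {
    "ds-001": {"format": "ntuple", "sample": "sample-a"},
    "ds-002": {"format": "AOD", "sample": "sample-b"},
}

# Toy dataset replica catalog: dataset id -> logical file -> replica per site.
DRC = {
    "ds-001": {
        "file1.hbook": {"site-a": "/data/a/file1.hbook",
                        "site-b": "/data/b/file1.hbook"},
        "file2.hbook": {"site-b": "/data/b/file2.hbook"},
    },
}


def select(**query):
    """Return sorted ids of datasets whose metadata matches every query key."""
    return sorted(ds for ds, meta in DSC.items()
                  if all(meta.get(k) == v for k, v in query.items()))


def resolve(dataset_id, preferred_site):
    """Map each logical file to one physical path, preferring a given site."""
    paths = []
    for name, replicas in sorted(DRC.get(dataset_id, {}).items()):
        # Fall back to the first site in alphabetical order if needed.
        site = preferred_site if preferred_site in replicas else sorted(replicas)[0]
        paths.append(replicas[site])
    return paths
```

A user-facing client would call something like `select(format="ntuple")`, while a scheduler running at a given site would call `resolve` to obtain paths it can actually open.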

23 Conclusions
Distributed analysis is a new project for ATLAS
Philosophy
Tightly integrate with non-distributed analysis
Be neutral: use client-server mechanism to support different analysis environments and different processing systems
Be flexible: capabilities (and hence demands) will change as technology evolves
Be responsive to evolving user requirements
Build on existing ideas and projects including GANGA, DIAL, ATLAS production and ARDA

24 Conclusions (cont)
Plan of action
Define interface (high-level JDL)
Quickly implement services for analysis, user-level production and dataset catalogs
Expose to users, learn lessons and re-implement
Repeat
More information
Web site coming soon
Mail to dladams@bnl.gov

