ATLAS Distributed Analysis
David Adams, BNL
September 30, 2004
CHEP2004 Track 5: Distributed Computing Systems and Experiences


1 ATLAS Distributed Analysis
  David Adams, BNL
  September 30, 2004
  CHEP2004 Track 5: Distributed Computing Systems and Experiences

2 Contents
  Goals
  Key concepts
  Datasets
  Transformations
  Jobs
  AJDL
  Service architecture
  Analysis services
  DIAL
  ATPROD
  ARDA
  Catalog services
  Data management services
  Clients
  Status
  ARDA
  Conclusions
  Contributors
  More information

3 Goals
  Provide to globally distributed users:
    Access to globally distributed data that is
      – Comprehensible
      – Enables selection of relevant data
      – Enables sensible placement of data
    Means to perform globally distributed processing on this data
      – High-level view that hides details of underlying middleware
      – But enables monitoring and debugging
      – Automatic, complete and accurate provenance
  All the above must be easy to use
    Well-integrated with analysis environments
      – Root, python, etc.
    Graphical views where appropriate
      – Browse and examine data,
      – Monitor jobs, …

4 Key concepts
  Dataset
    Describes a collection of data
      – E.g. a collection of reconstructed events,
      – A collection of histograms, …
  Transformation
    Defines an operation to be performed on the data
    Dataset → Dataset
    Application + task (user configuration of application)
  Job
    Instance of a transformation
    Typical user request processed as a collection of sub-jobs
      – Same transformation acting on sub-datasets
      – Plus dataset splitting of input and merging of output

5 Key concepts (cont)

6 Datasets
  Dataset includes
    Identifier
    Location of data, e.g. list of logical files
      – Absent for virtual datasets
    Content (i.e. description of the content)
      – E.g. list of event ID’s and the type of data for each event
      – Or a list of histogram names
    List of constituent datasets
      – Usually their ID’s
      – When dataset is composite, access to location and content may require use of the constituent datasets
  Dataset selection catalog holds metadata
  Dataset replica catalog holds replica mapping
    1 virtual → N concrete dataset mapping
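As a rough illustration of the dataset parts listed above, the following Python sketch models a dataset record and the lookup of files through constituent datasets. It is not the DIAL C++ implementation: the class name `Dataset`, its field names, the helper `resolve_files` and all IDs and file names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Dataset:
    """Hypothetical record mirroring the four parts listed on the slide."""
    dataset_id: str
    # Logical file names; None models a virtual dataset with no location.
    location: Optional[List[str]] = None
    # Description of the content, e.g. event IDs or histogram names.
    content: List[str] = field(default_factory=list)
    # IDs of constituent datasets; non-empty for a composite dataset.
    constituents: List[str] = field(default_factory=list)

    def is_virtual(self) -> bool:
        return self.location is None


def resolve_files(ds: Dataset, repository: Dict[str, Dataset]) -> List[str]:
    """Collect logical files, descending into constituents when composite."""
    files = list(ds.location or [])
    for child_id in ds.constituents:
        files.extend(resolve_files(repository[child_id], repository))
    return files


# Replica mapping: one virtual dataset ID maps to N concrete dataset IDs.
replica_catalog: Dict[str, List[str]] = {"virt-1": ["conc-a", "conc-b"]}

part1 = Dataset("ds-1", location=["lfn:file1"], content=["evt1", "evt2"])
part2 = Dataset("ds-2", location=["lfn:file2"], content=["evt3"])
# A composite dataset with no files of its own, only constituents.
combined = Dataset("ds-3", location=[], constituents=["ds-1", "ds-2"])
repo = {d.dataset_id: d for d in (part1, part2, combined)}
all_files = resolve_files(combined, repo)
```

Here the composite dataset holds no files itself, so its location is obtained by walking the constituents, which is the situation the slide describes for composite datasets.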

7 Datasets (cont)
  For ATLAS data, we identify
    Types of data
      – Used to define dataset categories
      – Category will be part of the content specification
    Types of datasets
      – Currently C++ classes with XML data representation
      – Third column indicates if this class exists
      – Likely will move to XML schema as the primary definition
  See table on the following slide

8 Datasets (cont)
  Name    Type                          Exists?  Description
  EVIDS   EventDataset                  ×        List of event ID’s
  EVGEN   AtlasPoolEventDataset         ×        From event generator
  HITS    AtlasPoolEventDataset         ×        Hits, e.g. from GEANT
  DIGITS  AtlasPoolEventDataset         ×        Digitization of hits
  RAW     AtlasByteStreamEventDataset            Raw data
  ESD     AtlasPoolEventDataset         ×        Event summary data
  AOD     AtlasPoolEventDataset         ×        Analysis oriented data
  TAG     AtlasPoolTagEventDataset               Event metadata
  NTUP    RootNtupleDataset                      Ntuples
  HISTO   RootHistogramDataset          ×        Histograms
  CBNT    CbntDataset                   ×        DC1 combined ntuples
  TEXT    TextDataset                            Text data, e.g. log files

9 Transformations
  Transformation
    Describes an operation to act on a dataset to produce a new dataset
    Has two components
      – Application = code shared by multiple transformations
        > Usually scripts to locate and run code in software packages
      – Task = user-supplied configuration (parameters or code)
  Task
    List of files
      – Presently embedded in task
      – Later could also be logical files
    Named parameters
      – Add this soon
    Typically created by user submitting the job

10 Transformations (cont)
  Application
    Two entry points (presently scripts)
      – build_task fetches task files, compiles, etc.
      – run creates output dataset from input dataset and built task
    Typically created by application developer
  Software package management
    Need an interface to enable build_task and run scripts to locate software on any machine
    E.g. “locate mypkg 1.2.3” returns /usr/contrib/mypkg/1.2.3/rh73_gcc73
    Also support querying and installation
    Implement as thin layer on existing package management systems
      – Pacman, RPM, local build, …
    Use service to handle installation and removal of packages
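The idea of a thin locating layer over existing package managers can be sketched as follows. This is only an illustration under assumptions: the `locate` function, the `Backend` type and the table-driven backends are hypothetical stand-ins, and no real Pacman or RPM call is made. Only the package name, version and returned path come from the slide.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Each backend maps (package, version) to an install path, or None if unknown.
Backend = Callable[[str, str], Optional[str]]


def make_table_backend(table: Dict[Tuple[str, str], str]) -> Backend:
    """Build a backend from a static table; stands in for Pacman, RPM, etc."""
    def lookup(package: str, version: str) -> Optional[str]:
        return table.get((package, version))
    return lookup


def locate(package: str, version: str, backends: List[Backend]) -> Optional[str]:
    """Return the first path reported by any backend, trying them in order."""
    for backend in backends:
        path = backend(package, version)
        if path is not None:
            return path
    return None


# Example data taken from the slide; the split into backends is an assumption.
local_build = make_table_backend(
    {("mypkg", "1.2.3"): "/usr/contrib/mypkg/1.2.3/rh73_gcc73"}
)
empty_backend = make_table_backend({})
found = locate("mypkg", "1.2.3", [empty_backend, local_build])
missing = locate("otherpkg", "0.1", [empty_backend, local_build])
```

Keeping each package system behind the same callable signature is one way to let build_task and run scripts stay unaware of which system actually installed the software.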

11 Transformations (cont)
  For ATLAS we identify the above transformations
    Characterized by input and output dataset categories
    Most common ones listed; others are possible

12 Jobs
  A job is an instance of a transformation acting on a dataset
    Output result is another dataset
    Partial result may be available before job is complete
  Typical user-submitted job is split into sub-jobs
    By splitting input dataset and applying the same transformation to each sub-dataset
    Strategies for splitting and merging results must be provided
  Provenance
    Dataset provenance is specified by recording the input dataset and transformation
    More complete information is available from the job:
      – Site, CPU, submission, start and stop times, …
      – Log files maintained for some period, perhaps as datasets
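The split, transform and merge pattern described above might look like the following minimal Python sketch. The functions `split`, `merge`, `count_parity` and `run_job` are hypothetical, a dataset is reduced to a list of integer event IDs, and the splitting and merging strategies shown are just one simple choice.

```python
from typing import Callable, Dict, List, Tuple

Events = List[int]
Result = Dict[str, int]


def split(events: Events, max_per_job: int) -> List[Events]:
    """Split the input dataset into sub-datasets of bounded size."""
    return [events[i:i + max_per_job] for i in range(0, len(events), max_per_job)]


def merge(partials: List[Result]) -> Result:
    """Merge partial results by summing counts that share a name."""
    total: Result = {}
    for partial in partials:
        for name, count in partial.items():
            total[name] = total.get(name, 0) + count
    return total


def count_parity(events: Events) -> Result:
    """Toy transformation: count even and odd event IDs."""
    even = sum(1 for e in events if e % 2 == 0)
    return {"even": even, "odd": len(events) - even}


def run_job(events: Events,
            transformation: Callable[[Events], Result],
            max_per_job: int) -> Tuple[Result, Dict[str, object]]:
    """Apply the same transformation to each sub-dataset, then merge."""
    partials = [transformation(sub) for sub in split(events, max_per_job)]
    result = merge(partials)
    # Provenance: record the input and the transformation that produced it.
    provenance = {
        "input_size": len(events),
        "transformation": transformation.__name__,
        "sub_jobs": len(partials),
    }
    return result, provenance


result, provenance = run_job(list(range(10)), count_parity, max_per_job=4)
```

Because the merge step sums counts, the merged result equals what the transformation would give on the unsplit input, which is the property a splitting and merging strategy has to guarantee.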

13 AJDL
  AJDL = Abstract Job Definition Language
  Components are representations of
    Dataset
    Transformation = Application + Task
    Job
    JobPreferences
    File
    Identifiers for all the above
  Presently defined as C++ classes
    With methods to write to and read from XML
      – Different for each subclass of Dataset
      – Same for subclasses of Job
    XML specified in DTD files
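To illustrate what "methods to write to and read from XML" can mean for an AJDL component, here is a small Python sketch using the standard library. The `Task` class, the element names `Task` and `File`, and the attribute names are assumptions made for this example; they are not taken from the actual AJDL DTD files.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """Hypothetical AJDL-like task: an ID plus the names of embedded files."""
    task_id: str
    files: List[str] = field(default_factory=list)

    def to_xml(self) -> str:
        """Serialize the task to an XML string."""
        root = ET.Element("Task", {"id": self.task_id})
        for name in self.files:
            ET.SubElement(root, "File", {"name": name})
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text: str) -> "Task":
        """Rebuild a task from the XML produced by to_xml."""
        root = ET.fromstring(text)
        files = [f.attrib["name"] for f in root.findall("File")]
        return cls(task_id=root.attrib["id"], files=files)


task = Task("task-42", ["analysis.C", "cuts.txt"])
xml_text = task.to_xml()
restored = Task.from_xml(xml_text)
```

A round trip through the XML string returns an equal object, which is what allows a component to be inserted into a repository at one site and extracted at another.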

14 AJDL (cont)
  Look at moving to XML schema
    Automatically derive classes from XML definitions
      – Automatic support for other languages (python, java, …)
    In collaboration with GANGA and others
  At the same time
    Try to find one representation for all datasets
    Introduce separate type for event ID lists
      – Often too large to carry around in a dataset
  Also interested in specifying interfaces for AJDL services
    Those that operate on AJDL components
    Services listed later
  Interested in working with others on these specifications

15 Service architecture
  ADA itself is distributed
    Allows data access and job management to be distributed
      – Important for scaling to a large number of users
    Collection of web services
      – Analysis service for job processing
      – Job monitoring
      – Catalog services
        > Metadata
        > Repository
        > Replica (not only for files)
    Users interact through clients
      – Root client from DIAL
      – Python client from GANGA

16 Service architecture

17 DIAL analysis service
  Two instances running at BNL
    Long running jobs using Condor job submission
    Interactive response using fast LSF queue
  Working to improve interactive response
    Submit jobs to perform result merging
      – Presently done on service host
    Use parallel jobs for merging
    Long term, look at the use of job agents
      – Possibly as part of ARDA
  Add service to act as switch
    Delegate jobs based on
      – Job requirements
      – Desired response time
      – Resource availability
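The proposed switch could delegate along the three criteria listed above roughly as in this Python sketch. Everything here is hypothetical: the `Service` and `Request` classes, their fields, the numeric values and the selection rule (fastest eligible service) are assumptions for illustration, not a description of any DIAL component.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Service:
    name: str
    typical_response_s: float   # expected time to first result
    free_slots: int             # currently available job slots
    max_job_length_s: float     # longest job the service accepts


@dataclass
class Request:
    job_length_s: float         # job requirement
    desired_response_s: float   # desired response time


def choose_service(req: Request, services: List[Service]) -> Optional[Service]:
    """Pick the fastest service that meets requirements and has free slots."""
    candidates = [
        s for s in services
        if s.free_slots > 0
        and s.max_job_length_s >= req.job_length_s
        and s.typical_response_s <= req.desired_response_s
    ]
    return min(candidates, key=lambda s: s.typical_response_s, default=None)


services = [
    Service("interactive", typical_response_s=10, free_slots=4, max_job_length_s=600),
    Service("batch", typical_response_s=900, free_slots=50, max_job_length_s=86400),
]
quick = choose_service(Request(job_length_s=120, desired_response_s=60), services)
long_job = choose_service(Request(job_length_s=7200, desired_response_s=3600), services)
impossible = choose_service(Request(job_length_s=7200, desired_response_s=60), services)
```

In this toy setup a short job with a tight response target goes to the interactive instance, a long job goes to the batch instance, and a request that no service can satisfy returns `None` so the caller can report it.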

18 ATPROD analysis service
  Enable submission to the existing ATLAS production system
    At least for user-level production
  Strategy
    Split input dataset
    Make an entry in the production catalog for each sub-job
    Monitor catalog and gather and merge results as jobs finish
    Same for the other analysis services
  Not yet implemented
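The three-step strategy above (split, record one catalog entry per sub-job, then monitor and merge) can be sketched in Python as follows. Since the service is described as not yet implemented, this is purely illustrative: the in-memory `catalog` dictionary, the entry fields and the `fake_worker_pass` function that stands in for the production system are all assumptions.

```python
from typing import Dict, List, Set

# Hypothetical in-memory stand-in for the production catalog.
catalog: Dict[str, Dict] = {}


def submit(job_id: str, events: List[int], n_sub: int) -> List[str]:
    """Split the input and make one catalog entry per sub-job."""
    sub_ids = []
    for i in range(n_sub):
        sub_id = f"{job_id}.{i}"
        catalog[sub_id] = {"input": events[i::n_sub], "status": "pending", "output": None}
        sub_ids.append(sub_id)
    return sub_ids


def fake_worker_pass() -> None:
    """Stand-in for the production system: finish one pending entry."""
    for entry in catalog.values():
        if entry["status"] == "pending":
            entry["output"] = sum(entry["input"])
            entry["status"] = "done"
            return


def gather(sub_ids: List[str], merged: int, seen: Set[str]) -> int:
    """Merge results of newly finished sub-jobs into the running total."""
    for sub_id in sub_ids:
        entry = catalog[sub_id]
        if entry["status"] == "done" and sub_id not in seen:
            merged += entry["output"]
            seen.add(sub_id)
    return merged


sub_ids = submit("job-1", list(range(1, 11)), n_sub=3)
merged, seen = 0, set()
# Poll the catalog, merging each sub-job result as soon as it appears.
while len(seen) < len(sub_ids):
    fake_worker_pass()
    merged = gather(sub_ids, merged, seen)
```

Merging incrementally, rather than waiting for every sub-job, is what makes a partial result available before the whole job is complete.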

19 ARDA analysis service
  Enable submission to the gLite WMS
    Let EGEE do the work of matchmaking, brokering, job tracking, monitoring, error reporting, …
  There is a service to submit to the existing prototype system
  Expect first release of gLite next month
    Quickly deploy an analysis service based on this
    Make regular updates taking advantage of more gLite features

20 Catalog services
  Goals of ADA cataloging:
    Provide a repository for AJDL objects indexed by ID
      – Insert at site A and extract with ID at site B
    Enable users to assign metadata to objects and retrieve with queries
    Record dataset provenance
    Provide job monitoring
  Identify three types of catalogs
    Repository
      – Map ID to XML string
    Metadata catalog
      – Map ID to named attributes
    Replica catalog
      – Map ID to a list of ID’s
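The three catalog types above differ only in what an ID maps to, which the following Python sketch makes explicit. The class and method names (`Repository.insert`, `MetadataCatalog.query`, and so on), the attribute names and all IDs are hypothetical; the real services are web services, whereas this sketch keeps everything in memory.

```python
from typing import Dict, List


class Repository:
    """Maps an ID to an XML string."""
    def __init__(self) -> None:
        self._xml: Dict[str, str] = {}

    def insert(self, obj_id: str, xml: str) -> None:
        self._xml[obj_id] = xml

    def extract(self, obj_id: str) -> str:
        return self._xml[obj_id]


class MetadataCatalog:
    """Maps an ID to named attributes and supports simple equality queries."""
    def __init__(self) -> None:
        self._attrs: Dict[str, Dict[str, str]] = {}

    def assign(self, obj_id: str, **attrs: str) -> None:
        self._attrs.setdefault(obj_id, {}).update(attrs)

    def query(self, **wanted: str) -> List[str]:
        return sorted(
            obj_id for obj_id, attrs in self._attrs.items()
            if all(attrs.get(k) == v for k, v in wanted.items())
        )


class ReplicaCatalog:
    """Maps an ID to a list of IDs (its replicas)."""
    def __init__(self) -> None:
        self._replicas: Dict[str, List[str]] = {}

    def add(self, obj_id: str, replica_id: str) -> None:
        self._replicas.setdefault(obj_id, []).append(replica_id)

    def replicas(self, obj_id: str) -> List[str]:
        return list(self._replicas.get(obj_id, []))


repo, meta, replicas = Repository(), MetadataCatalog(), ReplicaCatalog()
repo.insert("ds-1", "<Dataset id='ds-1'/>")
meta.assign("ds-1", category="AOD", release="8.5.0")
meta.assign("ds-2", category="ESD", release="8.5.0")
replicas.add("ds-1", "ds-1-bnl")
replicas.add("ds-1", "ds-1-cern")
aod_ids = meta.query(category="AOD")
```

A typical selection would query the metadata catalog for matching IDs, pick a replica for each ID, and then fetch the XML description from the repository.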

21 Catalog services (cont)
  Required global catalog instances
    Repositories for Dataset, Application, Task, Job
    Metadata catalog for Dataset
      – Same as that used for production?
    Replica catalog for Dataset
    More later
  First choice is to host these in AMI (soon)
  Next add local job catalog to record analysis service state
    So service can be restarted without losing jobs
  Later look at issues such as
    Distributed cataloging
    Private catalogs

22 Data management services
  DQ (Don Quijote) was developed as part of production
    Provides access to file replica catalogs from all three grids
    Enables file movement including between grids
    ADA will adopt this for replica management and movement
  ATLAS has a plan to add a file transfer service
    Adopt this as well when available
  SRM provides file management at the site level
    ATLAS expects sites to deploy this service
    DQ and ADA will use this as it is deployed
  gLite has a suite of data management services
    Including SRM
    Rest of service model is complex; hide it behind DQ
      – Already have DQ interface to AliEn file catalog

23 Clients
  DIAL provides a ROOT client
    ACLiC used to build dictionaries for DIAL classes
      – All DIAL classes available on the ROOT command line
      – Enables catalog browsing, job submission, monitoring, etc.
  GANGA provides a python client
    PyLCGDict used to build python wrappers for DIAL classes
      – All DIAL classes available on the python command line
    Later build python-only client
      – Restricted functionality but
      – Greater portability
  GUI
    GANGA is developing a GUI
      – Data browsing
      – Configure, submit and monitor jobs

24 Status
  Present system includes
    Root and Python command line clients
    DIAL analysis services running
      – Interactive service at BNL
      – Batch service at BNL
    Datasets
      – Classes for combined ntuples, ATLAS-POOL event collections
      – All DC1 CBNT data
      – Few DC2 samples
    Transformations
      – DC1 CBNT → histograms
      – DIGI: atlasdigi-8.5.0
      – RECO: atlas-reco-8.x.0, x = 3, 4, 5

25 ARDA
  ATLAS-ARDA prototype
    ARDA is a CERN project to deliver prototype distributed analysis systems for the LHC experiments
      – Based on gLite (EGEE middleware)
    The ATLAS ARDA prototype makes use of the components shown in the figure
    Expect functional system this year

26 Conclusions
  Status
    ADA is coming together but there is still much to do
    Still in demo mode; for serious use we must add
      – Dataset description of DC2 data
      – Repositories for applications, tasks, datasets and jobs in AMI
      – Dataset selection catalog in AMI
      – Dataset replica catalogs in AMI
      – Transformations for the full DC2 production/analysis chain
      – Means to move output data to a storage element
    Expect all this year
  Future developments (beyond those above)
    Update AJDL moving to XML schema and adding WSDL
    GUI (expect this soon)
    ATPROD service to access more compute resources
    ARDA service to try out EGEE middleware
    Improvements to DIAL service to improve interactive response

27 Contributors
  DIAL
    D. Adams, W. Deng, V. Sambamurthy, N. Chetan, C. Kannan
  GANGA
    K. Harrison, C. Tan, A. Soroko
  ARDA
    D. Liko, F. Orellana
  AMI
    S. Albrand, J. Fulachier
  ATLAS
    C. Haeberli, J. Bahilo, F. Fassi, G. Rybkine, M. Branco
  Many useful discussions
    All the above and PPDG, GAG, gLite, …

28 More information
  For more information on ADA, see the home page
    http://www.usatlas.bnl.gov/ADA
    Includes status of subprojects, relevant talks and documents, and links to associated projects
  To try it out, run root demo 3 in the latest DIAL release
    http://www.usatlas.bnl.gov/~dladams/dial/releases/0.92
  See the ADA paper in the CHEP2004 proceedings

