
Using Ganga for physics analysis Karl Harrison (University of Cambridge) ATLAS Distributed Analysis Tutorial Milano, 5-6 February 2007


1 Using Ganga for physics analysis Karl Harrison (University of Cambridge) ATLAS Distributed Analysis Tutorial Milano, 5-6 February 2007 http://cern.ch/ganga

2 6 February 2007 2/16 Starting point for using Ganga to run analysis jobs
Need setup for running Athena jobs with Ganga
Need steering package that defines the physics analysis
–This is any package where cmt/requirements defines all dependencies
–In the hands-on exercises, and for anyone who’s followed the analysis examples in the ATLAS Workbook, the steering package is UserAnalysis
Work from /run subdirectory of steering package
Three possibilities for submitting analysis jobs
–Use Ganga’s athena script
    One-line command, with many options
–Use CLIP commands, interactively or in script
    Provides greatest flexibility
–Use GUI
    Dealt with in separate session

3 Submitting analysis job to LCG from the Linux shell
From the Linux shell, job can be submitted to LCG using the syntax:
    ganga athena \
        --inDS misalg_csc11.005300.PythiaH130zz4l.recon.AOD.v12003104 \
        --outputdata AnalysisSkeleton.aan.root \
        --split 3 \
        --maxevt 100 \
        --lcg \
        --ce ce102.cern.ch:2119/jobmanager-lcglsf-grid_2nh_atlas \
        AnalysisSkeleton_topOptions.py
Meaning of each part of the command:
–ganga athena: use Ganga’s athena script
–--inDS: input dataset
–--outputdata: output data
–--split 3: split job into 3 subjobs
–--maxevt 100: limit analysis to 100 events per subjob
–--lcg: submit to LCG
–--ce: force use of particular compute element
–AnalysisSkeleton_topOptions.py: job options
Replace --lcg with --lsf, and omit --ce, to submit to LSF
–Trivial switching between running locally and running on Grid
Help available on options accepted by Ganga’s athena script:
    ganga athena --help
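The switch between LCG and LSF described on this slide can be sketched as a small Python helper that assembles the argument list for Ganga's athena script. This is only an illustration: the function name build_athena_command and its parameters are hypothetical, and the only command-line options used are the ones shown on the slide.

```python
import shlex


def build_athena_command(job_options, in_ds, output_file, split, max_events,
                         backend="lcg", ce=None):
    """Return the argument list for a 'ganga athena' call (hypothetical helper)."""
    if backend not in ("lcg", "lsf"):
        raise ValueError("backend must be 'lcg' or 'lsf'")
    args = [
        "ganga", "athena",
        "--inDS", in_ds,
        "--outputdata", output_file,
        "--split", str(split),
        "--maxevt", str(max_events),
        "--" + backend,
    ]
    # The slide says to omit --ce when submitting to LSF.
    if backend == "lcg" and ce is not None:
        args += ["--ce", ce]
    # The job options file comes last, as in the slide's example.
    args.append(job_options)
    return args


# Values copied from the slide's example command.
common = dict(
    job_options="AnalysisSkeleton_topOptions.py",
    in_ds="misalg_csc11.005300.PythiaH130zz4l.recon.AOD.v12003104",
    output_file="AnalysisSkeleton.aan.root",
    split=3,
    max_events=100,
)
lcg_cmd = build_athena_command(
    backend="lcg",
    ce="ce102.cern.ch:2119/jobmanager-lcglsf-grid_2nh_atlas",
    **common
)
lsf_cmd = build_athena_command(backend="lsf", **common)

print(shlex.join(lcg_cmd))
print(shlex.join(lsf_cmd))
```

The resulting list could be passed to subprocess.run on a machine where Ganga is set up; here it is only printed, so the sketch runs without Ganga.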

4 Monitoring job progress and retrieving output
To monitor job progress, you should start a Ganga CLIP or GUI session
In CLIP, changes in the status of jobs/subjobs are buffered, and are printed when you hit return
At any time, you can also explicitly request status information
    # Print status information for all jobs
    jobs
    # Print status information for particular subjob
    print jobs[5].subjobs[27].status
When a job completes, the Ganga monitoring loop takes care of storing the output, and registers it with DQ2 with a dataset name of the form user.username.ganga.jobid
Output can be listed and retrieved using DQ2 client tools
    dq2_ls -f user.username.ganga.jobid
    dq2_get -r user.username.ganga.jobid
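As a small illustration of the naming pattern above, the following Python sketch builds the output dataset name and the two DQ2 client commands for a given user and job. The helper names, the username "jdoe" and the job id 5 are hypothetical; the pattern and the dq2_ls and dq2_get options are taken from the slide.

```python
def output_dataset_name(username, job_id):
    """Build a dataset name of the form user.username.ganga.jobid."""
    return "user.{0}.ganga.{1}".format(username, job_id)


def dq2_commands(dataset):
    """Return the listing and retrieval commands shown on the slide."""
    return ["dq2_ls -f " + dataset, "dq2_get -r " + dataset]


# Hypothetical user and job id, for illustration only.
dataset = output_dataset_name("jdoe", 5)
for command in dq2_commands(dataset):
    print(command)
```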

5 Ganga plugins for ATLAS jobs
Class diagram: plugins derive from GangaObject through five interfaces, and are marked as analysis or production plugins
IApplication: Athena (analysis); AthenaMC, AthenaMCpyJT (production)
IBackend: LCG, LSF, Local
IDataset
–Input data: ATLASCastorDataset, DQ2Dataset, ATLASDataset, ATLASLocalDataset
–Output data: ATLASOutputDataset, DQ2OutputDataset
ISplitter: AthenaSplitterJob (analysis); AthenaMCSplitterJob, AthenaMCpyJTSplitterJob (production)
IMerger: AthenaOutputMerger
ATLAS plugins used in background in Ganga’s athena script, and used explicitly for job submission in CLIP
Plugins for production jobs covered in separate session

6 Building an analysis job in CLIP
In CLIP, constructing an analysis job is the same as constructing a “Hello World” job, except that there are more properties to set
“Hello World” job
–Application: Executable
–Backend: LCG, Other
–Other properties: None
Analysis job
–Application: Athena
–Backend: LCG, Other
–Input Dataset: DQ2Dataset, ATLASLocalDataset, ATLASCastorDataset, ATLASDataset
–Output Dataset: DQ2OutputDataset, ATLASOutputDataset
–Splitter: AthenaSplitterJob
–Merger: AthenaOutputMerger

7 Setting the Application
An analysis job uses the Athena application
[Screenshots: Athena properties; Athena methods]

8 Setting the input Dataset (1)
Ganga provides support for two types of currently produced input datasets
–ATLASLocalDataset: files on local file system
–DQ2Dataset: datasets in DQ2/DDM system

9 Setting the input Dataset (2)
Ganga provides support for two types of legacy input dataset
–ATLASDataset: old mc10 data in old LFC
–ATLASCastorDataset: older data on CASTOR at CERN

10 Setting the output Dataset (1)
Ganga provides support for two types of output dataset
–ATLASOutputDataset: stored on local filesystem

11 Setting the output Dataset (2)
–DQ2OutputDataset: stored on Grid SE and registered in DQ2

12 Setting the Splitter and Merger
Ganga provides a splitter for dividing an Athena job into subjobs, and a merger for combining output files
–Splitter: AthenaSplitterJob
–Merger: AthenaOutputMerger
Merging of ROOT files requires ROOT setup on machine where Ganga is run
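To make the idea of splitting concrete, here is a minimal Python sketch that divides a list of input files into near-equal groups, one per subjob. It is not the algorithm used by AthenaSplitterJob, which the slides do not describe; the function name split_files and the file names are hypothetical.

```python
def split_files(files, num_subjobs):
    """Divide files into at most num_subjobs contiguous, near-equal groups."""
    if num_subjobs < 1:
        raise ValueError("num_subjobs must be at least 1")
    # Never create more groups than there are files.
    groups = min(num_subjobs, len(files))
    if groups == 0:
        return []
    base, extra = divmod(len(files), groups)
    chunks = []
    start = 0
    for index in range(groups):
        # The first 'extra' groups take one additional file each.
        size = base + (1 if index < extra else 0)
        chunks.append(files[start:start + size])
        start += size
    return chunks


# Hypothetical input file names, for illustration only.
files = ["AOD._%05d.pool.root" % number for number in range(1, 8)]
for subjob_id, chunk in enumerate(split_files(files, 3)):
    print(subjob_id, chunk)
```

A merger does the reverse for the outputs: it collects one output file per subjob and combines them into a single file.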

13 Running an analysis job from CLIP (1)
Create application object, set job options and prepare tar file of user area
–Other properties filled automatically, based on user setup
    app = Athena()
    app.option_file = 'myOpts.py'
    app.prepare( athena_compile = False )
Define the input dataset
    inData = DQ2Dataset()
    inData.dataset = 'interestingDataset.AOD.v12003104'
    inData.type = 'DQ2_Local'
Define the output dataset
    outData = ATLASOutputDataset()
    outData.outputdata = 'myOutput.root'

14 Running an analysis job from CLIP (2)
Define splitter, merger and backend
    splitter = AthenaSplitterJob( numsubjobs = 2 )
    merger = AthenaOutputMerger()
    backend = LCG( CE = 'reliableCE' )
Create job template from defined objects
    t = JobTemplate( name = 'TestAnalysis' )
    t.application = app
    t.backend = backend
    t.inputdata = inData
    t.outputdata = outData
    t.splitter = splitter
    t.merger = merger

15 Running an analysis job from CLIP (3)
Create job from the template and submit the job
    j = Job( t )
    j.submit()
Check job status
    jobs
When job has completed, check standard outputs of subjobs, then retrieve and merge ROOT output files
    j.subjobs[0].peek( "stdout" )
    j.subjobs[1].peek( "stdout" )
    j.outputdata.retrieve()
    j.merge()
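Deciding when to retrieve and merge amounts to checking that every subjob has finished. The following Python sketch works on a plain list of status strings so that it runs without Ganga. The status values "completed", "running" and "failed" and both helper names are assumptions; in a CLIP session the list might be collected from the status attribute of each subjob.

```python
from collections import Counter


def summarise(statuses):
    """Count how many subjobs are in each state."""
    return Counter(statuses)


def ready_to_merge(statuses):
    """True only if there is at least one subjob and all are 'completed'."""
    return len(statuses) > 0 and all(s == "completed" for s in statuses)


# Hypothetical snapshots of a two-subjob job at two different times.
earlier = ["completed", "running"]
later = ["completed", "completed"]

print(dict(summarise(earlier)), ready_to_merge(earlier))
print(dict(summarise(later)), ready_to_merge(later))
```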

16 Hands-on exercises
https://twiki.cern.ch/twiki/bin/view/Atlas/GangaTutorial427 (linked to agenda page)
–You should try exercise 3 from this page
    Exercise 3.1: Using Ganga to submit Athena jobs from the Linux shell
    Exercise 3.2: Running Athena jobs locally
    Exercise 3.3: Running Athena jobs on LCG
    –Consider different types of input and output datasets
    Exercise 3.4: Running Athena Tag analysis on LCG