CMS Grid Batch Analysis Framework


CMS Grid Batch Analysis Framework Hugh Tallini, David Colling, Barry MacEvoy, Stuart Wakefield

Contents Project objectives Requirements analysis Design outline Implementation Future plans

Objective Build an analysis framework to enable CMS physicists to perform end-user analysis of DC04 data in batch mode on the Grid, i.e. run a private analysis algorithm over a set of reconstructed data, non-interactively – submit once and leave to run. Deliver by October '03.

Approach Build a tactical solution using, as far as possible, tools that already exist. Iterative development – deliver a first prototype quickly covering the major use cases, then iterate the design and implementation to refine and extend the solution. Build a scalable architecture that will inevitably have to evolve as requirements change (or become better determined). Use good software design practice (distinct analysis, design and implementation phases).

DC04 challenge overview (data-flow diagram). DC04 T0 challenge: a fake DAQ at CERN (HLT filter) delivers events at 25 Hz, 1 MB/evt raw plus 0.5 MB/evt reconstructed DST – 50M events, 75 TB over the 2-month PCP, ~1 TB/day to the CERN tape archive. Tier-0 challenge: first-pass reconstruction at 25 Hz and 2 MB/evt (50 MB/s, ~4 TB/day) into a ~40 TB CERN disk pool (~10 days of data), backed by tape archive and disk cache. Data distribution: DST event streams (e.g. Higgs, SUSY background), TAG/AOD replicas (10-100 kB/evt) and replica conditions DBs are distributed from CERN. Calibration challenge: a calibration sample feeds calibration jobs whose results go to the master conditions DB. Analysis challenge: user analysis of the distributed DSTs, e.g. a Higgs background study requesting new events from the event server.
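A quick consistency check of the quoted Tier-0 rates, assuming continuous 24-hour running: 25 Hz at 2 MB/evt gives 50 MB/s, i.e. roughly 4.3 TB per day, matching the "4 Tbyte/day" figure.

```python
# Consistency check of the DC04 Tier-0 rates quoted above
# (assumes continuous 24-hour running).
rate_hz, evt_mb = 25, 2.0
mb_per_s = rate_hz * evt_mb             # 50 MB/s, as on the slide
tb_per_day = mb_per_s * 86400 / 1e6     # ~4.3 TB/day ("4 Tbyte/day")
print(mb_per_s, round(tb_per_day, 1))
```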

What’s involved… Data handling Job splitting Job submission Job monitoring Job information archiving

Requirements Analysis (1) Typical Use Case 1: Private Single-Step Analysis. The user wishes to run their analysis algorithm over 10 million MC events from a particular dataset. S/he compiles the code and tests it over a small sample of data on a local machine; configures a JDL listing the data sample to run over, the ORCA libraries and ORCA executable to use, plus any steering files required; and submits to the framework. The Analysis Framework (AF) is then used to monitor the jobs as they run, to locate the output data stored on the Grid, and to keep a record of the analysis details, which are stored locally (privately).

Requirements Analysis (2) Typical Use Case 2: Private Multi-Step Analysis. As Use Case 1, but the user wishes the input data for the analysis task to be the output data from a previous analysis. Typical Use Case 3: Group Multi-Step Analysis. A group of physicists wishes to share input and output data, and must also share the details of how the output data was created.

Requirements Analysis (3) Some other important requirements: the user's analysis code should be identical whether running locally or on the Grid; no constraint on the size of the data sample to run over; the interface to the Analysis Framework must be simple to use (users are physicists, not software developers) – a single configuration file and single-step submission.
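To make the "single configuration file" requirement concrete, here is a minimal sketch of what such a user spec might look like and how the framework could parse it. The section name, keys and values (executable name, dataset query, file names) are hypothetical illustrations, not the framework's actual format.

```python
# Hypothetical single configuration file for an analysis task, plus a parser.
# All names and values are illustrative only.
from configparser import ConfigParser
import io

EXAMPLE_SPEC = """
[task]
executable     = MyHiggsAnalysis          ; user's compiled ORCA executable
orca_version   = ORCA_7_1_1
data_query     = dataset=higgs_dst owner=DC04   ; placeholder metadata query
steering_files = higgs.orcarc, cuts.txt
output_files   = analysis.ntpl
"""

def read_task_spec(text):
    """Parse a user spec into a plain dict the submission module could use."""
    cfg = ConfigParser(inline_comment_prefixes=(";",))
    cfg.read_file(io.StringIO(text))
    task = dict(cfg["task"])
    task["steering_files"] = [f.strip() for f in task["steering_files"].split(",")]
    return task

if __name__ == "__main__":
    print(read_task_spec(EXAMPLE_SPEC))
```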

Batch Object Submission System (BOSS). Accepts job submissions from users; stores info about each job in a DB; builds a wrapper around the job (jobExecutor); sends the wrapper to the local scheduler; the wrapper sends info about the job back to the DB. (Diagram: the boss submit, boss query and boss kill commands act on the BOSS DB and the local scheduler, which dispatches the wrapped jobs to the farm nodes.)
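The following is a minimal sketch of the idea behind the BOSS job wrapper: run the user's executable unchanged and journal its start, stop and exit status to a database. It is not BOSS's actual jobExecutor; sqlite3 stands in for the BOSS MySQL database, and the jobs table layout is invented for illustration.

```python
# Sketch of a BOSS-style job wrapper: run the real executable and record
# job information in a database.  NOT the actual jobExecutor; sqlite3 and
# the table layout are stand-ins for illustration.
import sqlite3
import subprocess
import sys
import time

def run_wrapped(job_id, argv, db_path="boss_stub.db"):
    """Run the user's executable and journal start/stop/exit in a DB."""
    db = sqlite3.connect(db_path)
    db.execute("CREATE TABLE IF NOT EXISTS jobs "
               "(id TEXT, status TEXT, t_start REAL, t_stop REAL, exit_code INTEGER)")
    db.execute("INSERT INTO jobs VALUES (?, 'RUNNING', ?, NULL, NULL)",
               (job_id, time.time()))
    db.commit()

    # Run the user's executable exactly as it would run locally
    # (requirement: identical analysis code locally and on the Grid).
    proc = subprocess.run(argv)

    db.execute("UPDATE jobs SET status=?, t_stop=?, exit_code=? WHERE id=?",
               ("DONE" if proc.returncode == 0 else "FAILED",
                time.time(), proc.returncode, job_id))
    db.commit()
    db.close()
    return proc.returncode

if __name__ == "__main__":
    # e.g. python wrapper_sketch.py ./MyHiggsAnalysis higgs.orcarc
    sys.exit(run_wrapped("job-001", sys.argv[1:] or ["true"]))
```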

System Design – schematic architecture (diagram). The user works through a UI; behind it sit a job submission module and a monitoring module, both backed by a common BOSS/AF database. A data interface queries the physics meta-catalog. Jobs pass via BOSS to the Grid resource broker (RB) and run on worker nodes (WN), reporting back into the common BOSS/AF database.

Implementation 3 Development Areas Job preparation/submission module Data handling interface Monitoring module

Job preparation and submission Split the analysis task into multiple jobs. Prepare each job for submission to the Grid: create the JDL and create the job wrapper script. Archive the details of each task and job (input/output files, software versions, etc.) for future reference by the user and to enable resubmission.
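A hedged sketch of the split-and-prepare step: one job (and one JDL) per run, so that each job's input data is co-located. The run numbers are hard-coded stand-ins for the result of the catalogue query, and the JDL attributes shown are generic EDG-style ones rather than the framework's actual requirements expressions.

```python
# Sketch of "split by run, one JDL per job".  Run numbers and file names are
# illustrative; the real run list would come from the RefDB query.
JDL_TEMPLATE = """\
Executable    = "jobWrapper.sh";
Arguments     = "{run}";
InputSandbox  = {{"jobWrapper.sh", "user_analysis", "higgs.orcarc"}};
OutputSandbox = {{"stdout.log", "stderr.log"}};
Requirements  = Member("ORCA-7.1.1",
                       other.GlueHostApplicationSoftwareRunTimeEnvironment);
"""

def make_jdls(task_name, runs):
    """Write one JDL file per run so each job's input data is co-located."""
    files = []
    for run in runs:
        fname = f"{task_name}_run{run}.jdl"
        with open(fname, "w") as f:
            f.write(JDL_TEMPLATE.format(run=run))
        files.append(fname)
    return files

if __name__ == "__main__":
    print(make_jdls("higgs_analysis", [17101, 17102, 17103]))
```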

Prototype Tested on CMS-LCG-0. Simple shell scripts were written to emulate the framework; ORCA 7.1.1 was installed on the CMS-LCG-0 WNs at Imperial; simple ORCA analysis code was written and compiled on a local machine. Analysis jobs were successfully submitted to the Grid and ran. This was important for the development of the job wrapper script.

Job Preparation and Submission – Object Model (UML class diagram).
TASK: createTask(UserSpecFile), createJobs(..), submitJobs(..), killJobs(..); described by one TASKSPEC, composed of 1..* JOBs; queries PHYS_CAT and uses BOSS.
TASKSPEC: UserExecutable: string; InputFiles: vector<File>; OutputFiles: vector<string>; DataQuery: string; OrcaVersion: string; JDLRequirements: string; set(UserSpecFile), getXYZ().
JOB: DataSelection: int; UniqID: string; Executable: string; LocalInFiles: vector<File>; InGUIDs: vector<string>; OutGUIDs: vector<string>; OutSandbox: vector<string>; AddInSandbox(File), AddOutSandbox(File), AddOutGUID(File), Submit(), Kill(), getXYZ(); each JOB is composed of a WRAPPER and a JDL.
PHYS_CAT: getRuns(DataQuery): vector<int>; getGUIDS(run): vector<string>.
WRAPPER: write(filename).
JDL: write(filename).
BOSS: submitJob(..), getJobStatus(..).
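For readers who prefer code to class diagrams, the object model above might translate roughly into the sketch below. Python dataclasses are used purely for illustration (the vector<> types in the diagram suggest the real implementation is C++), and the method body shown is a stub, not the framework's logic.

```python
# Rough transcription of the object model into Python dataclasses.
# Attribute names follow the diagram; the implementation is a stub.
from dataclasses import dataclass, field
from typing import List

@dataclass
class TaskSpec:                      # TASKSPEC: everything the user declares
    user_executable: str
    input_files: List[str]
    output_files: List[str]
    data_query: str
    orca_version: str
    jdl_requirements: str = ""

@dataclass
class Job:                           # JOB: one run's worth of work
    data_selection: int              # run number selected for this job
    uniq_id: str
    executable: str
    local_in_files: List[str] = field(default_factory=list)
    in_guids: List[str] = field(default_factory=list)
    out_guids: List[str] = field(default_factory=list)
    out_sandbox: List[str] = field(default_factory=list)

@dataclass
class Task:                          # TASK: described by a TaskSpec, composed of Jobs
    spec: TaskSpec
    jobs: List[Job] = field(default_factory=list)

    def create_jobs(self, runs):
        """One Job per run, as in createJobs(..) in the diagram."""
        for run in runs:
            self.jobs.append(Job(data_selection=run,
                                 uniq_id=f"{self.spec.user_executable}-{run}",
                                 executable=self.spec.user_executable))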

Data Handling Input data: the physics catalogue (RefDB) contains the metadata the user selects on and the data-file GUIDs; an interface to RefDB exists (currently PHP); the task will be split so that each job analyses one run at a time (this ensures all the data for a job is co-located), and the RB will send each job to where the data for its run is. Output data: output-file GUIDs and analysis details are stored in a local MySQL database for the private-analysis use case, and in a centralised database (RefDB) for the group-analysis use case.
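A sketch of the output-data bookkeeping described above: record enough provenance per job (input GUIDs, software version, output GUIDs) that a later multi-step analysis can pick up the outputs as its inputs. The schema is invented for illustration, and sqlite3 is used only so the example runs stand-alone; the slide's actual stores are a local MySQL database (private case) or RefDB (group case).

```python
# Illustrative analysis-record store; schema and values are invented.
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS analysis_record (
    task_name    TEXT,
    run          INTEGER,
    orca_version TEXT,
    executable   TEXT,
    in_guid      TEXT,
    out_guid     TEXT
);
"""

def record_job(db, task_name, run, orca_version, executable, in_guids, out_guids):
    """Store one job's provenance: which inputs and software produced which outputs."""
    for ig in in_guids:
        for og in out_guids:
            db.execute("INSERT INTO analysis_record VALUES (?,?,?,?,?,?)",
                       (task_name, run, orca_version, executable, ig, og))
    db.commit()

def guids_for_task(db, task_name):
    """Outputs of a previous task, usable as inputs of the next analysis step."""
    rows = db.execute("SELECT DISTINCT out_guid FROM analysis_record "
                      "WHERE task_name=?", (task_name,))
    return [r[0] for r in rows]

if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    record_job(db, "higgs_step1", 17101, "ORCA_7_1_1", "MyHiggsAnalysis",
               ["guid-in-0001"], ["guid-out-0001"])
    print(guids_for_task(db, "higgs_step1"))
```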

Monitoring Using BOSS on the Grid (diagram: the boss submit, boss query, boss kill and boss registerScheduler commands pass from the local BOSS installation and its DB, via the Grid scheduler and the site gatekeepers, to the farm nodes, whose job wrappers report back to the BOSS DB). This can be used now with native MySQL calls – tested and used for CMS productions on EDG and CMS/LCG-0. A more scalable transport mechanism based on R-GMA is being investigated (IC/Brunel team – see Peter Hobson's talk).
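As an illustration of the "native MySQL calls" monitoring path, the sketch below polls a per-task job-status summary directly from the database. The host, credentials, table and column names are placeholders – the real BOSS schema is not reproduced here – and pymysql is just one possible client library.

```python
# Placeholder monitoring query against a BOSS-like MySQL database.
# Table/column names (JOB, TASK_ID, STATUS) are invented for illustration.
import pymysql

def job_summary(host, user, password, task_id, db="bossdb"):
    conn = pymysql.connect(host=host, user=user, password=password, database=db)
    try:
        with conn.cursor() as cur:
            # Count this task's jobs grouped by their current status.
            cur.execute("SELECT STATUS, COUNT(*) FROM JOB "
                        "WHERE TASK_ID = %s GROUP BY STATUS", (task_id,))
            return dict(cur.fetchall())
    finally:
        conn.close()

# Example (needs a reachable database with the placeholder schema):
#   print(job_summary("boss.example.org", "reader", "secret", "higgs_analysis"))
```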

Implementation Roadmap Requirements analysis; gain experience running ORCA locally; run ORCA on CMS-LCG-0 using the simple prototype; design; implementation – build the job submission module, the monitoring module and the data catalogue interface; commence testing on LCG-1; iterate the design and implementation as users give feedback and new requirements emerge. (First steps completed; target date 1 October.)