CMS Applications – Status and Near Future Plans

CMS Applications – Status and Near Future Plans
Dr Henry Nebrensky, for Bristol University, Brunel University and Imperial College London

Contents
- Project objectives
- Implementation status
- Future plans (GridPP1/GridPP2)

Overall Objective for GridPP
Build and test under realistic conditions* an analysis framework to enable CMS physicists to perform end-user analysis of data in batch mode on the Grid, i.e. run a private analysis algorithm over a set of reconstructed data. Not interactive: submit once and leave to run.
* i.e. the Data Challenges

Approach
- Build a tactical solution using (as far as possible) tools that already exist.
- Iterative development: deliver a first prototype quickly covering the major use cases, iterating the design and implementation later to refine and extend the solution.
- Involve the middleware developers in ensuring their general toolkit works for our applications.

Data Challenges
Data Challenges (DC) have been used as the prime motivator for the UK contribution to CMS Grid computing. DC04 is the most recent and ambitious: 75 million events simulated and 150 TB of data stored worldwide. We tried to bring together all the aspects of T0-T1 data transfer, replication, Grid job submission and job status monitoring. We also aimed to contribute to the generation of CMS Monte Carlo data with UK resources.

DC04 overview
[Diagram: the DC04 chain. A fake DAQ at CERN feeds 1st-pass reconstruction at 25 Hz (2 MB/evt raw, 0.5 MB reco DST; 50 MB/s, 4 TB/day) into a ~40 TB CERN disk pool (~10 days of data) and the CERN tape archive (Tier-0 challenge). Data distribution sends DST and TAG/AOD replicas (10-100 kB/evt) plus replica conditions DBs to the Tier-1s. The calibration challenge runs calibration jobs against a master conditions DB; the analysis challenge covers Higgs and SUSY background studies served by an event server. In total 50M events and 75 TB, at ~1 TB/day over the 2-month PCP.]

Data Challenge 04
- Processed about 30M events
- Got above 25 Hz on many short occasions… but only one full day
- T1s generally kept up
- See details in slides from GridPP 10

SRB in CMS DC04
- SRB was used as the main transfer and data management tool for 3 of the 6 CMS T1 sites
- MCAT (metadata catalogue) database hosted between RAL and Daresbury
- Transfers controlled by agents communicating through the TMDB (transfer management database)
- Difficulties communicating with the MCAT caused severe problems
- SRB is being phased out in favour of LCG solutions; it may continue to be used "behind the scenes" at some sites
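The agent model in the DC04 transfer chain is worth a small illustration: each site runs a lightweight agent that polls the shared transfer database for files assigned to it, copies them, and records the new state. The sketch below is a minimal, hypothetical version of that loop in Python; the table layout, column names, site label and the use of globus-url-copy are assumptions for illustration only, not the actual DC04 agent code or schema.

```python
# Minimal sketch of a DC04-style transfer agent: poll a central transfer
# database (TMDB) for files assigned to this site, move them, update state.
# Table layout, column names and the copy command are illustrative
# assumptions, not the real DC04 schema or agent.
import sqlite3      # stand-in for the real TMDB connection
import subprocess
import time

SITE = "T1_RAL"      # hypothetical site label
POLL_INTERVAL = 60   # seconds between polls

def copy_file(source_url, dest_url):
    """Transfer one file via a generic command-line copy tool."""
    subprocess.run(["globus-url-copy", source_url, dest_url], check=True)

def run_agent(db_path="tmdb.sqlite"):
    db = sqlite3.connect(db_path)
    while True:
        rows = db.execute(
            "SELECT id, source_url, dest_url FROM transfers "
            "WHERE dest_site = ? AND state = 'assigned'", (SITE,)
        ).fetchall()
        for file_id, src, dst in rows:
            try:
                copy_file(src, dst)
                db.execute("UPDATE transfers SET state = 'done' WHERE id = ?",
                           (file_id,))
            except subprocess.CalledProcessError:
                db.execute("UPDATE transfers SET state = 'failed' WHERE id = ?",
                           (file_id,))
            db.commit()
        time.sleep(POLL_INTERVAL)

if __name__ == "__main__":
    run_agent()
```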

SRB: Tim Barrass, Simon Metson, Dave Newbold (Bristol)
SRB provided a valuable service, and useful experience. The problems seen in the Data Challenge were a result of the short time-scale involved in setting up for the DC, and a lack of "burn in" of the system at the scale of the DC. SRB continues to increase in impact in many areas of UK e-science, with other projects beginning to use it, both in HEP and outside it. However, CMS now wants to converge on a unified solution to bulk data management. At the physical layer in each Tier-1 this will be provided by products such as dCache and CASTOR; at the top-level data management layer, PhEDEx is our current solution.

PhEDEx: Tim Barrass, Simon Metson, Dave Newbold (Bristol)
- CMS' system for managing the large-scale movement of data (ex-DC04). EGEE is very interested; no similar component exists yet.
- Three basic use cases:
  - Push (for vital data, e.g. detector data).
  - Subscription pull (large sites subscribe to datasets, e.g. as they are produced; sketched below). These two are missing from existing grid use cases, but are what experiments traditionally do.
  - Random pull ("shift this file over there"). Relates to grid-style optimisation.
- The "user" is not a physicist, but replica managers, etc.
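Subscription pull reduces to a small piece of bookkeeping: a site registers interest in a dataset once, and every block that later appears in that dataset is queued for transfer to it. The toy model below illustrates the idea; the class and method names are invented for illustration and are not PhEDEx interfaces.

```python
# Toy illustration of the "subscription pull" use case: sites subscribe to a
# dataset, and any block injected later is fanned out to every subscriber.
# Names and structure are illustrative only, not the real PhEDEx data model.
from collections import defaultdict

class SubscriptionModel:
    def __init__(self):
        self.subscriptions = defaultdict(set)   # dataset -> set of sites
        self.transfer_queue = []                # (block, destination) pairs

    def subscribe(self, site, dataset):
        self.subscriptions[dataset].add(site)

    def inject_block(self, dataset, block):
        # A new block produced anywhere triggers transfers to all subscribers.
        for site in sorted(self.subscriptions[dataset]):
            self.transfer_queue.append((block, site))

model = SubscriptionModel()
model.subscribe("T1_RAL", "/Higgs/DST")
model.subscribe("T2_London_IC", "/Higgs/DST")
model.inject_block("/Higgs/DST", "block-0001")
print(model.transfer_queue)
# [('block-0001', 'T1_RAL'), ('block-0001', 'T2_London_IC')]
```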

PhEDEx (2)
- V2 close to deployment.
- Some teething troubles moving from an ad-hoc project to a more managed one (Tim Barrass/Lassi Tuura project and technical leads). Gaining experience quickly.
- PhEDEx extended to accommodate multiple data sources, to enable transfers of MC data produced around the world. Will help production meet the PRS goal of delivering 10M events per month from the autumn.
- Finding/helping solve differences in approach between grid and traditional distributed computing…
- Plenty of feedback to other developer groups: LCG, EGEE, SRM, SRB…
- Hands-on workshop at CERN before CMS week. Several talks at CHEP04. (See Tim's Data Management talk from the UK CMS meeting in Bristol: http://agenda.cern.ch/fullAgenda.php?ida=a042616)

Job submission - GROSS (Imperial College)
- GROSS is designed as an interface to the Grid for end users wishing to run distributed analysis.
- The user authenticates with the Grid, and then interacts solely with GROSS.
- The user defines the details of an analysis task; GROSS carries out the job splitting, preparation, submission, monitoring and output file collection (splitting is sketched below).
- The same functionality is implemented for local batch submission systems (where appropriate).
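Job splitting is the mechanical heart of that workflow: an analysis task over a list of input files is cut into chunks, one Grid job per chunk, and the outputs are collected afterwards. The fragment below sketches only the splitting step, with invented names; it is not GROSS code.

```python
# Minimal sketch of analysis-task splitting: divide a list of input files
# into per-job chunks. Invented for illustration; not the GROSS implementation.
def split_task(input_files, files_per_job):
    """Return one list of input files per Grid job."""
    return [input_files[i:i + files_per_job]
            for i in range(0, len(input_files), files_per_job)]

files = [f"dst_{n:04d}.root" for n in range(10)]
for job_id, chunk in enumerate(split_task(files, 3)):
    print(f"job {job_id}: {chunk}")
```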

GROSS (2)
- Uses an Abstract Factory to build the appropriate family of classes for the specific job type being submitted (see the sketch below). Currently two types have been implemented:
  - ORCA job on LCG2 (ORCA is the full reconstruction code for CMS data)
  - ORCA job using a local batch submission system
- GROSS currently uses a CLI; users also have to define some configuration files.
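The Abstract Factory pattern here simply means that GROSS asks a single factory object for a matched family of components for the chosen job type, so that the LCG2 and local-batch variants of an ORCA task never get mixed. A minimal Python sketch of that shape is shown below; the class and method names are invented for illustration and GROSS itself is not this code.

```python
# Sketch of the Abstract Factory idea: each factory builds a consistent
# family of objects (here just a submitter) for one job type.
# Class and method names are invented for illustration.
from abc import ABC, abstractmethod

class Submitter(ABC):
    @abstractmethod
    def submit(self, job_script): ...

class LCG2Submitter(Submitter):
    def submit(self, job_script):
        print(f"submitting {job_script} to an LCG2 resource broker")

class LocalBatchSubmitter(Submitter):
    def submit(self, job_script):
        print(f"submitting {job_script} to the local batch system")

class JobFactory(ABC):
    """Abstract Factory: builds a matched family of task components."""
    @abstractmethod
    def make_submitter(self) -> Submitter: ...

class OrcaLCG2Factory(JobFactory):
    def make_submitter(self) -> Submitter:
        return LCG2Submitter()

class OrcaLocalFactory(JobFactory):
    def make_submitter(self) -> Submitter:
        return LocalBatchSubmitter()

def build_factory(job_type: str) -> JobFactory:
    factories = {"orca_lcg2": OrcaLCG2Factory, "orca_local": OrcaLocalFactory}
    return factories[job_type]()

factory = build_factory("orca_lcg2")
factory.make_submitter().submit("analysis_job.sh")
```

The benefit of the pattern is that adding a third job type later only means adding one new factory plus its components, without touching the submission workflow itself.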

Job monitoring - R-GMA (Brunel University)
- R-GMA finally released in LCG 2.2.0 and deployed on the LCG.
- We need to confirm that it can handle the load of applications monitoring under "real-world" conditions.
- Will be testing BOSS with R-GMA over the next couple of months.

BOSS-RGMA: use of R-GMA in BOSS
[Diagram: on the worker node, the BOSS wrapper runs the job and a "Tee" sends its output both to a local output file (returned via the sandbox) and to the R-GMA API; the monitoring data flow through the farm servlets, receiver and registry of the Grid infrastructure back to the BOSS DB at the UI. R-GMA is an EDG WP3 middleware product.]
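The "Tee" in the diagram is the key trick: the wrapper duplicates the job's output, keeping a local copy for the sandbox while feeding each line to the monitoring publisher. The sketch below shows that idea only; publish_to_rgma() is a placeholder and not the real R-GMA producer API.

```python
# Sketch of a BOSS-style "tee" wrapper: run the real job, capture its stdout,
# keep a local copy and hand each line to a monitoring publisher.
# publish_to_rgma() is a placeholder, not the real R-GMA producer API.
import subprocess
import sys

def publish_to_rgma(job_id, line):
    # Placeholder: in the real system this would insert a tuple via the
    # R-GMA producer API so that remote consumers can pick it up.
    print(f"[monitor] job {job_id}: {line}", file=sys.stderr)

def run_wrapped(job_id, command, logfile="job.out"):
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, text=True)
    with open(logfile, "w") as log:
        for line in proc.stdout:
            log.write(line)                         # local copy (sandbox output)
            publish_to_rgma(job_id, line.rstrip())  # monitoring stream
    return proc.wait()

if __name__ == "__main__":
    sys.exit(run_wrapped("job-001", ["echo", "event loop finished"]))
```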

Assessing R-GMA
- Use the CMSIM "emulator" from last year's tests.
- Submit jobs to the Grid; see whether messages get back, and how many come back.
- Don't need to do the number crunching in between; a small number of Grid jobs can put a large load on R-GMA.
- Results don't apply just to BOSS: any applications monitoring framework would have to shift the same amount of data from those jobs. (A toy version of the check is sketched below.)
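The test itself reduces to counting: each emulated job publishes a known number of monitoring messages, and a consumer checks how many actually arrive. The toy script below shows that bookkeeping with an in-process queue standing in for the R-GMA producer/consumer path; it is not the real test harness.

```python
# Toy version of the R-GMA load test: N emulated jobs each send M messages;
# count how many arrive at the consumer. The queue stands in for the real
# producer/consumer path, so this only illustrates the bookkeeping.
import queue
import threading

channel = queue.Queue()   # stand-in for the R-GMA producer/consumer path

def emulated_job(job_id, n_messages):
    for i in range(n_messages):
        channel.put((job_id, i))   # "publish" one monitoring tuple

def run_test(n_jobs=20, n_messages=100):
    jobs = [threading.Thread(target=emulated_job, args=(j, n_messages))
            for j in range(n_jobs)]
    for t in jobs:
        t.start()
    for t in jobs:
        t.join()
    received = 0
    while not channel.empty():
        channel.get()
        received += 1
    expected = n_jobs * n_messages
    print(f"received {received}/{expected} messages "
          f"({100.0 * received / expected:.1f}%)")

if __name__ == "__main__":
    run_test()
```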

"Future" Plans (CMS)
CMS' submission to the ARDA middleware meeting is being put together Right Now.